Apr
18
2019
--

MySQL-python: Adding caching_sha2_password and TLSv1.2 Support

Python 2 reaches EOL (end of life) on 2020-01-01, and one of its most commonly used third-party packages is MySQL-python, which does not support Python 3. If you have not yet migrated away from both of these, you may have come across some issues when using more recent versions of MySQL with a secured installation enforced. This post will look at two specific issues that you may encounter (caching_sha2_password in MySQL 8.0, and TLSv1.2 with MySQL >=5.7.10 when built with OpenSSL), either of which will prevent you from connecting to a MySQL database, and will buy you a little extra time to migrate your code.

For CentOS 7, MySQL-python is built against the client library provided by the mysql-devel package, which does not support some of the newer features, such as caching_sha2_password (the new default authentication plugin as of MySQL 8.0.4) and TLSv1.2. This means that you cannot take advantage of these security features and therefore leave yourself with an increased level of vulnerability.
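
You can confirm which client library your MySQL-python build is linked against from Python itself. A quick check (the _mysql C extension ships with MySQL-python, and get_client_info() reports the version of the client library it was compiled against):

import _mysql
# Anything older than 8.0 cannot negotiate caching_sha2_password.
print _mysql.get_client_info()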

So, what can we do about this? Thankfully, it is very simple to resolve, so long as you don’t mind getting your hands dirty!

Help! MySQL-python won’t connect to my MySQL 8.0 instance

First of all, let’s take a look at the issues that the latest version of MySQL-python (MySQL-python-1.2.5-1.el7.rpm for CentOS 7) has when connecting to a MySQL 8.0 instance. We will use a Docker container to help test this out by installing MySQL-python along with the Percona Server 8.0 client so that we can compare the two.

=> docker run --rm -it --network host --name rpm-build centos:latest
# yum install -y -q MySQL-python
warning: /var/cache/yum/x86_64/7/base/packages/MySQL-python-1.2.5-1.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
Public key for MySQL-python-1.2.5-1.el7.x86_64.rpm is not installed
Importing GPG key 0xF4A80EB5:
 Userid     : "CentOS-7 Key (CentOS 7 Official Signing Key) <security@centos.org>"
 Fingerprint: 6341 ab27 53d7 8a78 a7c2 7bb1 24c6 a8a7 f4a8 0eb5
 Package    : centos-release-7-6.1810.2.el7.centos.x86_64 (@CentOS)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
# yum install -q -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm; percona-release setup ps80
* Enabling the Percona Original repository
<*> All done!
The percona-release package now contains a percona-release script that can enable additional repositories for our newer products.
For example, to enable the Percona Server 8.0 repository use:
  percona-release setup ps80
Note: To avoid conflicts with older product versions, the percona-release setup command may disable our original repository for some products.
For more information, please visit:
  https://www.percona.com/doc/percona-repo-config/percona-release.html
* Disabling all Percona Repositories
* Enabling the Percona Server 8.0 repository
* Enabling the Percona Tools repository
<*> All done!
# yum install -q -y percona-server-client
warning: /var/cache/yum/x86_64/7/ps-80-release-x86_64/packages/percona-server-shared-8.0.15-5.1.el7.x86_64.rpm: Header V4 RSA/SHA256 Signature, key ID 8507efa5: NOKEY
Public key for percona-server-shared-8.0.15-5.1.el7.x86_64.rpm is not installed
Importing GPG key 0x8507EFA5:
 Userid     : "Percona MySQL Development Team (Packaging key) <mysql-dev@percona.com>"
 Fingerprint: 4d1b b29d 63d9 8e42 2b21 13b1 9334 a25f 8507 efa5
 Package    : percona-release-1.0-11.noarch (installed)
 From       : /etc/pki/rpm-gpg/PERCONA-PACKAGING-KEY

Next we will create a client config to save some typing later on and then check that we can connect to the MySQL instance that is already running; if you don’t have MySQL running on your local machine then you can install it in the container.

# cat <<EOS > /root/.my.cnf
> [client]
> user = testme
> password = my-secret-pw
> host = 127.0.0.1
> protocol = tcp
> ssl-mode = REQUIRED
> EOS
mysql> /* hide passwords */ PAGER sed "s/AS '.*' REQUIRE/AS 'xxx' REQUIRE/" ;
PAGER set to 'sed "s/AS '.*' REQUIRE/AS 'xxx' REQUIRE/"'
mysql> SHOW CREATE USER CURRENT_USER();
CREATE USER 'testme'@'%' IDENTIFIED WITH 'caching_sha2_password' AS 'xxx' REQUIRE SSL PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK PASSWORD HISTORY DEFAULT PASSWORD REUSE INTERVAL DEFAULT
mysql> SELECT @@version, @@version_comment, ssl_version, ssl_cipher, user FROM sys.session_ssl_status INNER JOIN sys.processlist ON thread_id = thd_id AND conn_id = CONNECTION_ID();
8.0.12-1        Percona Server (GPL), Release '1', Revision 'b072566'   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256     testme@localhost
All looks good here, so now we will check that MySQLdb can also connect:
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2059, "Authentication plugin 'caching_sha2_password' cannot be loaded: /usr/lib64/mysql/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory")

Changing the user’s authentication plugin

We have hit the first issue: MySQL 8.0 introduced the caching_sha2_password plugin and made it the default authentication plugin, so we cannot connect at all. However, we can gain access by changing the user’s grants to use the old default plugin and then test again.

mysql> ALTER USER 'testme'@'%' IDENTIFIED WITH mysql_native_password BY "my-secret-pw";
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1045, "Access denied for user 'testme'@'localhost' (using password: YES)")

Configuring SSL options

We still can’t connect, so what can be wrong? Well, we haven’t added any SSL details to the config other than specifying that we need to use SSL, so we will add the necessary options and make sure that we can connect.

# cat <<EOS >> /root/.my.cnf
> ssl-ca = /root/certs/ca.pem
> ssl-cert = /root/certs/client-cert.pem
> ssl-key = /root/certs/client-key.pem
> EOS
# mysql -Bse "select 1"
1
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1', 'ECDHE-RSA-AES256-SHA', 'testme@localhost'),)

Forcing TLSv1.2 or later to further secure connections

That looks much better now, but if you look closely then you will notice that the MySQLdb connection is using TLSv1, which will make your security team either sad, angry or perhaps both as the connection can be downgraded! We want to secure the installation and keep data over the wire safe from prying eyes, so we will remove TLSv1 and also TLSv1.1 from the list of versions accepted by MySQL and leave only TLSv1.2 (TLSv1.3 is not supported with our current MySQL 8.0 and OpenSSL versions). Any guesses what will happen now?

mysql> SET PERSIST_ONLY tls_version = "TLSv1.2"; RESTART;
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2026, 'SSL connection error: error:00000001:lib(0):func(0):reason(1)')

MySQLdb can no longer connect to MySQL, sadly with a rather cryptic message! However, as we made the change that triggered this we don’t need to decipher it and can now start looking to add TLSv1.2 support to MySQLdb, so roll up your sleeves!

Solution: Build a new RPM

In order to build a new RPM we will need to do a little extra work in the container first of all, but it doesn’t take long and is pretty simple to do. We are going to install the necessary packages to create a basic build environment and then rebuild the MySQL-python RPM from its current source RPM.

## Download packages
# yum install -q -y rpm-build yum-utils gnupg2 rsync deltarpm gcc
Package yum-utils-1.1.31-50.el7.noarch already installed and latest version
Package gnupg2-2.0.22-5.el7_5.x86_64 already installed and latest version
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
## Create build directory tree
# install -d /usr/local/builds/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
## Configure the RPM macro
# echo "%_topdir /usr/local/builds/rpmbuild" > ~/.rpmmacros
## Switch to a temporary directory to ease cleanup
# cd "$(mktemp -d)"; pwd
/tmp/tmp.xqxdBeLkXs
## Download the source RPM
# yumdownloader --source -q MySQL-python
Enabling updates-source repository
Enabling base-source repository
Enabling extras-source repository
## Extract the source RPM
# rpm2cpio "$(ls -1 MySQL-python*src.rpm)" | cpio -idmv
MySQL-python-1.2.5.zip
MySQL-python-no-openssl.patch
MySQL-python.spec
234 blocks

We are now ready to start making some changes to the source code and build specifications, but first of all we need to take note of another change that occurred in MySQL 8.0. Older code will reference my_config.h, which has since been removed and is no longer required for building; the fix for this will be shown below.

## Adjust the spec file to use percona-server-devel and allow a build number
# sed -i "s/mysql-devel/percona-server-devel/g; s/Release: 1/Release: %{buildno}/" MySQL-python.spec
## Store the ZIP filename and extract
# SRC_ZIP="$(ls -1 MySQL-python*.zip)"; unzip "${SRC_ZIP}"
## Store the source directory and remove the include of my_config.h
# SRC_DIR=$(find . -maxdepth 1 -type d -name "MySQL-python*"); sed -i 's/#include "my_config.h"/#define NO_MY_CONFIG/' "${SRC_DIR}/_mysql.c"
## View our _mysql.c change
# fgrep -m1 -B3 -A1 -n NO_MY_CONFIG "${SRC_DIR}/_mysql.c"
41-#if defined(MS_WINDOWS)
42-#include
43-#else
44:#define NO_MY_CONFIG
45-#endif
## Update the source
# zip -uv "${SRC_ZIP}" "${SRC_DIR}/_mysql.c"
updating: MySQL-python-1.2.5/_mysql.c   (in=84707) (out=17314) (deflated 80%)
total bytes=330794, compressed=99180 -> 70% savings

Now that we have adjusted the source code and specification we can start work on the new package so that we can once again connect to MySQL.

## Sync the source to the build tree
# rsync -ac ./ /usr/local/builds/rpmbuild/SOURCES/
## Copy the new specification file to the build tree
# cp -a MySQL-python.spec /usr/local/builds/rpmbuild/SPECS/
## Build a new source RPM with our updates
# rpmbuild --define "buildno 2" -bs /usr/local/builds/rpmbuild/SPECS/MySQL-python.spec
Wrote: /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm
## Use the new source RPM to install any missing dependencies
# yum-builddep -q -y /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm &>>debug.log
## Build a new RPM
# rpmbuild --define "buildno 2" --rebuild /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm &>>debug.log
# tail -n1 debug.log
+ exit 0

All went well and so we can now install the new package and see if it worked.

## Install the local RPM
# yum localinstall -q -y /usr/local/builds/rpmbuild/RPMS/x86_64/MySQL-python-1.2.5-2.el7.x86_64.rpm
## Check to see which libmysqlclient is used
# ldd /usr/lib64/python2.7/site-packages/_mysql.so | fgrep libmysqlclient
        libmysqlclient.so.21 => /usr/lib64/mysql/libmysqlclient.so.21 (0x00007f61660be000)
## Test the connection
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1.2', 'ECDHE-RSA-AES128-GCM-SHA256', 'testme@localhost'),)

Almost there… now force user authentication with the caching_sha2_password plugin

Hurrah! We can once again connect to the database and this time we are using TLSv1.2 and have thus increased our security. There is one thing left to do now though. Earlier on we needed to change the authentication plugin, so we will now change it back for extra security and see if all is still well.

mysql> ALTER USER 'testme'@'%' IDENTIFIED WITH caching_sha2_password BY "my-secret-pw";
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1.2', 'ECDHE-RSA-AES128-GCM-SHA256', 'testme@localhost'),)

Mission successful! Hopefully, if you find yourself in need of a little extra time to migrate away from MySQLdb, this will help.


Image modified from photo by David Clode on Unsplash

Feb
11
2019
--

Google Docs gets an API for task automation

Google today announced the general availability of a new API for Google Docs that will allow developers to automate many of the tasks that users typically do manually in the company’s online office suite. The API has been in developer preview since last April’s Google Cloud Next 2018 and is now available to all developers.

As Google notes, the REST API was designed to help developers build workflow automation services for their users, build content management services and create documents in bulk. Using the API, developers can also set up processes that manipulate documents after the fact to keep them up to date, and the API features the ability to insert, delete, move, merge and format text, insert inline images and work with lists, among other things.

The canonical use case here is invoicing, where you need to regularly create similar documents with ever-changing order numbers and line items based on information from third-party systems (or maybe even just a Google Sheet). Google also notes that the API’s import/export abilities allow you to use Docs for internal content management systems.
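
To make that concrete, here is a rough sketch of the template-filling flow using the google-api-python-client library; the credentials file, document ID and placeholder names are hypothetical, and in practice you would first copy the template via the Drive API:

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/documents']
DOCUMENT_ID = 'your-template-copy-id'  # hypothetical ID of a copy of the template

# Authenticate with a service account (any OAuth2 flow works here).
creds = service_account.Credentials.from_service_account_file(
    'service-account.json', scopes=SCOPES)
docs = build('docs', 'v1', credentials=creds)

# Swap template placeholders for real order data in one batch.
requests = [
    {'replaceAllText': {
        'containsText': {'text': '{{order_number}}', 'matchCase': True},
        'replaceText': 'INV-1042',
    }},
    {'replaceAllText': {
        'containsText': {'text': '{{line_items}}', 'matchCase': True},
        'replaceText': '3 x widget @ $9.99',
    }},
]
docs.documents().batchUpdate(
    documentId=DOCUMENT_ID, body={'requests': requests}).execute()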

Some of the companies that built solutions based on the new API during the preview period include Zapier, Netflix, Mailchimp and Final Draft. Zapier integrated the Docs API into its own workflow automation tool to help its users create offer letters based on a template, for example, while Netflix used it to build an internal tool that helps its engineers gather data and automate its documentation workflow.

 

 

Jun
22
2018
--

Security, privacy experts weigh in on the ICE doxxing

In what appears to be the latest salvo in a new, wired form of protest, developer Sam Lavigne posted code that scrapes LinkedIn to find Immigration and Customs Enforcement employee accounts. His code, basically a Python-based tool that scans LinkedIn for keywords, is gone from GitHub and GitLab, and Medium took down his original post. The CSV of the data is still available here and here, and WikiLeaks has posted a mirror.

“I find it helpful to remember that as much as internet companies use data to spy on and exploit their users, we can at times reverse the story, and leverage those very same online platforms as a means to investigate or even undermine entrenched power structures. It’s a strange side effect of our reliance on private companies and semi-public platforms to mediate nearly all aspects of our lives. We don’t necessarily need to wait for the next Snowden-style revelation to scrutinize the powerful — so much is already hiding in plain sight,” said Lavigne.

Doxxing is the process of using publicly available information to target someone online for abuse. Because we can now find out anything on anyone for a few dollars – a search for “background check” brings up dozens of paid services that can get you names and addresses in a second – scraping public data on LinkedIn seems far easier and more innocuous. That doesn’t make it legal.

“Recent efforts to outlaw doxxing at the national level (like the Online Safety Modernization Act of 2017) have stalled in committee, so it’s not strictly illegal,” said James Slaby, Security Expert at Acronis. “But LinkedIn and other social networks usually consider it a violation of their terms of service to scrape their data for personal use. The question of fairness is trickier: doxxing is often justified as a rare tool that the powerless can use against the powerful to call attention to perceived injustices.”

“The problem is that doxxing is a crude tool. The torrent of online ridicule, abuse and threats that can be heaped on doxxed targets by their political or ideological opponents can also rain down on unintended and undeserving targets: family members, friends, people with similar names or appearances,” he said.

The tool itself isn’t to blame. No one would fault a job seeker or salesperson who scraped LinkedIn for targeted employees of a specific company. That said, scraping and publicly shaming employees walks a thin line.

“In my opinion, the professor who developed this scraper tool isn’t breaking the law, as it’s perfectly legal to search the web for publicly available information,” said David Kennedy, CEO of TrustedSec. “This is known in the security space as ‘open source intelligence’ collection, and scrapers are just one way to do it. That said, it is concerning to see ICE agents doxxed in this way. I understand emotions are running high on both sides of this debate, but we don’t want to increase the physical security risks to our law enforcement officers.”

“The decision by Twitter, Github and Medium to block the dissemination of this information and tracking tool makes sense – in fact, law enforcement agents’ personal information is often protected. This isn’t going to go away anytime soon, it’s only going to become more aggressive, particularly as more people grow comfortable with using the darknet and the many available hacking tools for sale in these underground forums. Law enforcement agents need to take note of this, and be much more careful about what (and how often) they post online.”

Ultimately, doxxing is problematic. Because we place our information on public forums there should be nothing to stop anyone from finding and posting it. However, the expectation that people will use our information for good and not evil is swiftly eroding. Today, wrote one security researcher, David Kavanaugh, doxxing is becoming dangerous.

“Going after the people on the ground is like shooting the messenger. Decisions are made by leadership and those are the people we should be going after. Doxxing is akin to a personal attack. Change policy, don’t ruin more lives,” he said.

Mar
25
2015
--

Yelp IT! A talk with 3 Yelp MySQL DBAs on Percona Live & more

Founded in 2004 to help people find great local businesses, Yelp has some 135 million monthly unique visitors. With those traffic volumes Yelp’s 300+ engineers are constantly working to keep things moving smoothly – and when you move that fast you learn many things.

Fortunately for the global MySQL community, three Yelp DBAs will be sharing what they’ve learned at the annual Percona Live MySQL Conference and Expo this April 13-16 in Santa Clara, California.

Say “hello” to Susanne Lehmann, Jenni Snyder and Josh Snyder! I chatted with them over email about their presentations, on how MySQL is used at Yelp, and about the shortage of women in MySQL.

***

Tom: Jenni, you and Josh will be co-presenting “Next generation monitoring: moving beyond Nagios” on April 14.

You mentioned that Yelp’s databases scale dynamically, and so does your monitoring of those databases. And to minimize human intervention, you’ve created a Puppet and Sensu monitoring ensemble… because “if it’s not monitored, it’s not in production.” Talk to me more about Yelp’s philosophy of “opt-out monitoring.” What does that entail? How does that help Yelp?

Jenni: Before we moved to Sensu, our Nagios dashboards were a sea of red, muted, acknowledged, or disabled service checks. In fact, we even had a cluster check to make sure that we never accidentally put a host into use that was muted or marked for downtime. It was possible for a well-meaning operator to acknowledge checks on a host and forget about it, and I certainly perpetrated a couple of instances of disks filling up after acknowledging a 3am “warning” page that I’d rather forget about. With Sensu, hosts and services come out of the downtime/acknowledgement state automatically after a number of days, ensuring that we’re kept honest and stay on top of issues that need to be addressed.

Also, monitoring is deployed with a node, not separate monitoring configuration. Outside of a grace period we employ when a host is first provisioned or rebooted, if a host is up, it’s being monitored and alerting. Also, alerting doesn’t always mean paging. We also use IRC and file tickets directly into our tracking system when we don’t need eyes on a problem right away.


Tom: Susanne, in your presentation, titled “insert cassandra into prod where use_case=?;” you’ll discuss the situations you’ve encountered where MySQL just wasn’t the right tool for the job.

What led up to that discovery and how did you come up with finding the right tools (and what were they) to run alongside and support MySQL?

Susanne: Our main force behind exploring other datastores alongside MySQL was that Yelp is growing outside the US market a lot. Therefore we wanted the data to be nearer to the customer and needed multi-master writes.

Also, we saw use cases where our application data was organized very key-value like and not relational, which made them a better fit for a NoSQL solution.

We decided to use Cassandra as a datastore and I plan to go more into detail why during my talk. Now we offer developers more choices on how to store our application data, but we also believe in the “right tool for the job” philosophy and might add more solutions to the mix in the future.


Tom: Jenni, you’ll also be presenting “Schema changes multiple times a day? OK!” I know that you and your fellow MySQL DBAs are always improving and also finding better ways of supporting new and existing features for Yelp users like me. Delivering on such a scale must entail some unique processes and tools. Does this involve a particular mindset among your fellow DBAs? Also, what are some of those key tools – and processes and how are they used?

Jenni: Yelp prizes the productivity of our developers and our ability to iterate and develop new features quickly. In order to do that, we need to be able to not only create new database tables, but also modify existing ones, many of which are larger than MySQL can alter without causing considerable replication delay. The first step is to foster a culture of automated testing, monitoring, code reviews, and partnership between developers and DBAs to ensure that we can quickly & safely roll out schema changes. In my talk, I’ll be describing tools that we’ve talked about before, like our Gross Query Checker, as well as the way the DBA team works with developers while still getting the rest of our work done. The second, easy part is using a tool like pt-online-schema-change to run schema changes online without causing replication delay or degrading performance :)


Tom: Josh, you’ll also be speaking on “Bootstrapping databases in a single command: elastic provisioning for the win.” What is “elastic provisioning” and how are you using it for Yelp’s tooling?

Josh: When I say that we use elastic provisioning, I mean that we can reliably and consistently build a database server from scratch, with minimal human involvement. The goal is to encompass every aspect of the provisioning task, including configuration, monitoring, and even load balancing, in a single thoroughly automated process. With this process in place, we’ve found ourselves able to quickly allocate and reallocate resources, both in our datacenters and in the cloud. Our tools for implementing the above goals give us greater confidence in our infrastructure, while avoiding single-points of failure and achieving the maximum possible level of performance. We had a lot of fun building this system, and we think that many of the components involved are relevant to others in the field.


Tom: Susanne and Jenni, last year at Percona Live there was a BoF session titled “MySQL and Women (or where are all the women?).” The idea was to discuss why there are “just not enough women working on the technology side of tech.” In a nutshell, the conversation focused on why there are not more women in MySQL and why so relatively few attend MySQL conferences like Percona Live.

The relative scarcity of women in technical roles was also the subject of an article published in the August 2014 issue of Forbes, citing a recent industry report.

Why, in your (respective) views, do you (or don’t) think that there are so few women in MySQL? And how can this trend be reversed?

Susanne: I think there are few women in MySQL and the reasons are manifold. Of course there is the pipeline problem. Then there is the problem, widely discussed right now, that women who are entering STEM jobs are less likely staying in there. These are reasons not specific for MySQL jobs, but rather for STEM in general. What is more specific for database/MySQL jobs is, in my opinion, that often times DBAs need to be on call, they need to stay in the office if things go sideways. Database problems tend often to be problems that can’t wait till the next morning. That makes it more demanding when you have a family for example (which is true for men as well of course, but seems still to be more of a problem for women).

As for how to reverse the trend, I liked this Guardian article because it covers a lot of important points. There is no easy solution.

I like that more industry leaders and technology companies are discussing what they can do to improve diversity these days. In general, it really helps to have a great professional (female) support system. At Yelp, we have AWE, the Awesome Women in Engineering group, in which Jenni and I are both active. We participate in welcoming women to Yelp engineering, speaking at external events and workshops to help other women present their work, mentoring, and a book club.

Jenni: I’m sorry that I missed Percona Live and this BoF last year; I was out on maternity leave. I believe that tech/startup culture is a huge reason that fewer women are entering and staying these days, but a quick web search will lead you to any number of articles debating the subject. I run into quite a few women working with MySQL; its large, open community and generally collaborative and supportive nature are very welcoming. As the article you linked to suggests, MySQL has a broad audience. It’s easy to get started with and pull into any project, and as a result, most software professionals have worked with it at some time or another.

On another note, I’m happy to see that Percona Live has a Code of Conduct. I hope that Percona and/or MySQL will consider adopting a Community Code of Conduct like Python, Puppet, and Ubuntu. Doing so raises the bar for all participants, without hampering collaboration and creativity!

* * *

Thanks very much, Susanne, Jenni and Josh! I look forward to seeing you next month at the conference. And readers, if you’d like to attend Percona Live, use the promo code Yelp15 for 15% off! Just enter that during registration. If you’re already attending, be sure to tweet about your favorite sessions using the hashtag #PerconaLive. And if you need to find a great place to eat while attending Percona Live, click here for excellent Yelp recommendations. :)


Jan
07
2015
--

Django with time zone support and MySQL

This is yet another story of the Django web framework with time zone support, and the pain of dealing with Python datetimes and MySQL on the backend. In other words, offset-naive vs offset-aware datetimes.

In short, here is the problem. The official documentation on time zones makes clear that, in order to display a Python datetime in the desired time zone, you need to make it tz-aware first and then show it in that time zone.

Here is the first issue: tz-aware in what time zone? MySQL stores timestamps in UTC and converts them for storage/retrieval from/to the current time zone. By default, the current time zone is the server’s time zone, and it can be changed in MySQL globally, per connection, and so on. So it is not obvious what the time zone of a value was before it was stored in UTC, and if you later change the server or session time zone, it leads to even more mess. Unlike MySQL, PostgreSQL has a timestamp with time zone data type, so Django can auto-detect the time zone and make datetimes tz-aware automatically, with the tzinfo attribute set.
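
A minimal illustration of the naive/aware clash, assuming pytz is installed (Django’s time zone support relies on it):

import datetime
import pytz

naive = datetime.datetime.utcnow()       # no tzinfo attribute
aware = datetime.datetime.now(pytz.utc)  # tzinfo set to UTC

# Mixing the two blows up:
# TypeError: can't subtract offset-naive and offset-aware datetimes
delta = aware - naive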

There are many solutions on the web… for example, an extension that detects the time zone in the UI with JavaScript and passes it back to the backend, allowing you to work with tz-aware data. However, I needed something simpler, and mainly with fewer changes to the existing code base.

Here is my case. The server and MySQL both run in UTC. That’s cool, and it removes the first barrier. I store Python datetimes in MySQL timestamp columns, also in UTC per the database time; in any case, it is best practice to store time in UTC. I have some read-only pages on the web app and want to show datetimes according to the user’s time zone. It looks like a simple task, but with MySQL on the backend, all the data coming from my models arrives as naive datetimes. So I needed a way to easily make all my DateTimeField values UTC-aware (add the tzinfo attribute), plus a convenient method of showing datetimes in the user’s time zone while still having access to UTC or naive datetimes for calculations on the backend. That way, I keep doing all calculations in UTC and show tz-aware values to users only on the UI side.

This is an example of middleware that gets the user’s tz from the database and sets it in the session, so it can be retrieved anywhere using get_current_timezone():

from django.utils.timezone import activate
from myapp.models import UserInfo
class TimezoneMiddleware(object):
    """Middleware that is run on each request before the view is executed.
    Activate user's timezone for further retrieval by get_current_timezone() or
    creating tz-aware datetime objects beforehand.
    """
    def process_request(self, request):
        session_tz = request.session.get('timezone')
        # If tz has been already set in session, let's activate it
        # and avoid SQL query to retrieve it on each request.
        if session_tz:
            activate(session_tz)
        else:
            try:
                # Get user's tz from the database.
                uinfo = UserInfo.objects.get(user_id=request.user.id, user_id__isnull=False)
                if uinfo.timezone:
                    # If tz is configured by user, let's set it for the session.
                    request.session['timezone'] = uinfo.timezone
                    activate(uinfo.timezone)
            except UserInfo.DoesNotExist:
                pass

This is an excerpt from models.py:

import datetime

from django.db import models
from django.utils.timezone import get_current_timezone, make_aware, utc
def localize_datetime(dtime):
    """Makes DateTimeField value UTC-aware and returns datetime string localized
    in user's timezone in ISO format.
    """
    tz_aware = make_aware(dtime, utc).astimezone(get_current_timezone())
    return datetime.datetime.strftime(tz_aware, '%Y-%m-%d %H:%M:%S')
class Messages(models.Model):
    id = models.AutoField(primary_key=True)
    body = models.CharField(max_length=160L)
    created = models.DateTimeField(auto_now_add=True)
    @property
    def created_tz(self):
        return localize_datetime(self.created)
...

The “Messages” model has a “created” field (a timestamp in MySQL) and a property “created_tz”. That property presents “created” in the user’s tz using the function localize_datetime(), which makes the naive datetime tz(UTC)-aware, converts it into the user’s tz set at the session level, and returns a string in ISO format. In my case, I don’t want the default RFC format that includes the +00:00 tz portion of a datetime with the tzinfo attribute, and I don’t even need tz-aware datetimes to operate with. In the same way I can add similar properties to all the models that need them, knowing they can be accessed by the same name with the “_tz” suffix.

Taking the above into account, I reference “created” for calculations in views or controllers and “created_tz” in templates or for JSON output. This way I don’t need to change every reference to “created” into something like make_aware(created, utc), or every datetime.datetime.utcnow() into datetime.datetime.utcnow().replace(tzinfo=pytz.utc), across the code. By introducing a custom property in the model, the code changes in my app stay minimal, and I continue operating in UTC at the raw level:

# views.py
# Operating UTC
msgs = Messages.objects.filter(created__gt=datetime.datetime.now() -
                               datetime.timedelta(hours=24))

<!-- HTML template -->
{% for m in msgs %}
    {{ m.id }}. {{ m.body }} (added on {{ m.created_tz }})
{% endfor %}
* All times in user's tz.

I hope this article may help you in your own work.
Happy New Year across all time zones!


May
16
2014
--

Benchmark: SimpleHTTPServer vs pyclustercheck (twisted implementation)

Github user Adrianlzt provided a python-twisted alternative version of pyclustercheck per discussion on issue 7.
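
For context, a health-check service in twisted looks roughly like the following. This is a minimal sketch in the spirit of pyclustercheck, not Adrianlzt’s actual implementation, and check_node() is a placeholder for the real (cached) MySQL status check:

from twisted.internet import reactor
from twisted.web import resource, server

def check_node():
    """Placeholder: the real tool checks the MySQL node's wsrep status."""
    return True

class ClusterCheck(resource.Resource):
    isLeaf = True  # handle all request paths in this one resource

    def render_GET(self, request):
        if check_node():
            request.setResponseCode(200)
            return 'Percona XtraDB Cluster Node is synced.\n'
        request.setResponseCode(503)
        return 'Percona XtraDB Cluster Node is not synced.\n'

reactor.listenTCP(8000, server.Site(ClusterCheck()))
reactor.run()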

Due to sporadic performance issues noted with the original SimpleHTTPServer implementation, the benchmarks, which I’ve included as part of the project on GitHub, use the multi-mechanize library (a sketch of such a test script follows the list below), with:

  • cache time 1 sec
  • 2 x 100 thread pools
  • 60s ramp up time
  • 600s total duration
  • testing simulated node fail (always returns 503, rechecks mysql node on cache expiry)
  • AMD FX(tm)-8350 Eight-Core Processor
  • Intel 330 SSD
  • local loop back test (127.0.0.1)
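
A multi-mechanize virtual-user script is just a module exposing a Transaction class whose run() method is timed on every call. A minimal sketch of the kind of script used here (the project’s real scripts are on GitHub):

import urllib2

class Transaction(object):
    def run(self):
        # multi-mechanize times this method on each iteration.
        try:
            resp = urllib2.urlopen('http://127.0.0.1:8000/')
            status = resp.code
        except urllib2.HTTPError as e:
            status = e.code  # the simulated node failure returns 503
        assert status in (200, 503), 'unexpected status %s' % status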

The SimpleHTTPServer instance fared as follows:

[Graphs: transaction throughput, response-time intervals, response times]

Right away we can see around 500 TPS of throughput; however, as can be seen in both response-time graphs, there are “outlying” transactions: something is causing the response time to spike dramatically in SimpleHTTPServer. How does the twisted alternative compare? (Note: these benchmarks are from the current HEAD, with features re-added to avoid regression: caching and ipv6 support.)

[Graphs: transaction throughput, response-time intervals, response times]

Ouch! We appear to have taken a performance hit, at least in terms of TPS (roughly -19% by rough estimates). However, compare the response-time graphs to find a much more consistent plot: we had outliers hitting near 70s for SimpleHTTPServer, while we stay under 1s with twisted.

Great! So why isn’t this merged into the master branch as the main project and therefore bundled with Percona XtraDB Cluster (PXC)? The issue here is the required version of python-twisted: ipv6 support was introduced in issue 8 by user Nibler999, and to avoid regression I re-added support for ipv6 in this commit for twisted.

ipv6 support for python-twisted is not in the version distributed with mainstream server OSes, such as:

  • EL6 python-twisted 8.x
  • Ubuntu 10.04 LTS python-twisted 11.x

What’s the issue here? Attempting to bind / listen to an ipv6 interface yields the following error: twisted.internet.error.CannotListenError: Couldn't listen on :::8000: [Errno -9] Address family for hostname not supported.

Due to this regression (the breaking of ipv6 support), the twisted version cannot at this time be merged into master. The twisted version, however, as can be seen above, is much more consistent, and if you have the “cycles” to implement it (e.g. install twisted from PyPI via pip / easy_install to get >= 12.x) and test it, it is a promising alternative.

To illustrate this further, the benchmark was made more grueling:

  • 5 x 100 thread pools
  • 60s ramp up
  • 600s total duration

First, the twisted results; note the initial spike is due to a local Python issue where it locked up while creating a new thread in multi-mechanize:

[Graphs: response times, response-time intervals, transaction throughput]

Now the SimpleHTTPServer results:

[Graphs: response times, response-time intervals, transaction throughput]

Oh dear: as the load increases, we clearly get some stability issues inside SimpleHTTPServer…

Also worth noting are the timeouts:

  • twisted: grep 'timed out' results.csv | wc -l == 0
  • SimpleHTTPServer: grep 'timed out' results.csv | wc -l == 470

 

Under increased load, the twisted model performs far more consistently than SimpleHTTPServer under the same test conditions. I include the multi-mechanize scripts as part of the project on GitHub, so you can recreate these tests yourself and gauge the performance to see whether twisted or SimpleHTTPServer suits your needs.


Jul
08
2013
--

Poll: What programming languages and platforms do you use?

What programming languages and platforms do you use for large-scale projects in your organization?

If something is missing from the list please leave a comment and share your story. Thanks!

Note: There is a poll embedded within this post, please visit the site to participate in this post’s poll.


Oct
04
2010
--

Balance Your Cloud

Seems like eons ago (just under 6 months..) when I joined Canonical, and hopped on a plane headed for Brussels and UDS-Maverick.

What a whirlwind, attending sessions, meeting the real rock stars of the Ubuntu world, and getting to know my super distributed team.

One of the sessions was based on a blueprint for load balancing in the cloud. The idea was that rather than rely on Amazon’s Elastic Load Balancer, you could build your own solution that you could possibly even move around between UEC, EC2, or even Rackspace clouds.

Well, it got a lower priority than some other stuff, so unfortunately many parts got dropped (like ELB-compatible CLI tools).

But, I managed to find the time to create a proof of concept for managing haproxy’s config file (perhaps my first real python project), and write up a HOWTO for using it.
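
The proof of concept itself isn’t shown here, but the core idea is simple enough to sketch. The server pool below is hypothetical, and a real tool would rewrite the whole config rather than append:

# Hypothetical server pool; the real PoC would discover these dynamically.
SERVERS = [('web1', '10.0.0.1:80'), ('web2', '10.0.0.2:80')]

def render_backend(name, servers):
    """Render an haproxy backend stanza from a list of (label, addr) pairs."""
    lines = ['backend %s' % name, '    balance roundrobin']
    for label, addr in servers:
        lines.append('    server %s %s check' % (label, addr))
    return '\n'.join(lines) + '\n'

with open('haproxy.cfg', 'a') as cfg:  # append for brevity
    cfg.write(render_backend('web_pool', SERVERS))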

Honestly, it’s not the best HOWTO I’ve ever written. It’s got a lot of stuff left out. But it should be enough to get most admins past the “tinker for a few hours” phase and into the “tinker for 40 minutes right before getting it working, then passing out on your keyboard at 4:00am” phase. I know that’s how far it got me..

