Aug
25
2021
--

Bodo.ai secures $14M, aims to make Python better at handling large-scale data

Bodo.ai, a parallel compute platform for data workloads, is developing a compiler to make Python portable and efficient across multiple hardware platforms. It announced Wednesday a $14 million Series A funding round led by Dell Technologies Capital.

Python is one of the top programming languages used among artificial intelligence and machine learning developers and data scientists, but as Behzad Nasre, co-founder and CEO of Bodo.ai, points out, it is challenging to use when handling large-scale data.

Bodo.ai, headquartered in San Francisco, was founded in 2019 by Nasre and Ehsan Totoni, CTO, to make Python higher performing and production ready. Nasre, who had a long career at Intel before starting Bodo, met Totoni and learned about the project that he was working on to democratize machine learning and enable parallel learning for everyone. Parallelization is the only way to extend Moore’s Law, Nasre told TechCrunch.

Bodo does this via compiler technology that automates parallelization, so data and ML developers don’t have to adopt new libraries or APIs, or rewrite Python into other programming languages or graphics processing unit code, to achieve scalability. Its technology is used to build real-time data analytics tools across industries including financial services, telecommunications, retail and manufacturing.

“For the AI revolution to happen, developers have to be able to write code in simple Python, and that high-performance capability will open new doors,” Totoni said. “Right now, they rely on specialists to rewrite them, and that is not efficient.”
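To make that concrete, here is a minimal sketch of the decorator-driven style Bodo describes, assuming its documented bodo.jit decorator; the Parquet file and column names are made up for illustration.

import bodo
import pandas as pd

@bodo.jit
def daily_revenue():
    # Bodo compiles this function so the pandas operations inside can run in
    # parallel (for example across MPI ranks when launched with mpiexec),
    # without rewriting the code against a new API.
    df = pd.read_parquet("sales.parquet")  # hypothetical input file
    return df.groupby("day")["amount"].sum()

print(daily_revenue())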

Joining Dell in the round were Uncorrelated Ventures, Fusion Fund and Candou Ventures. Including the new funding, Bodo has raised $14 million in total. The company went after Series A dollars after its product had matured and there was good traction with customers, prompting Bodo to want to scale quicker, Nasre said.

Nasre feels Dell Technologies Capital was “uniquely positioned to help us in terms of reserves and the role they play in the enterprise at large, which is to have the most effective salesforce in enterprise.”

Though he was already familiar with Nasre, Daniel Docter, managing director at Dell Technologies Capital, heard about Bodo from a data scientist friend who told Docter that Bodo’s preliminary results “were amazing.”

Many of Dell’s investments are in early-stage, deep tech founders who understand the problem. Docter puts Totoni and Nasre in that category.

“Ehsan fits this perfectly, he has super deep technology knowledge and went out specifically to solve the problem,” he added. “Behzad, being from Intel, saw and lived with the problem, especially seeing Hadoop fail and Spark take its place.”

Meanwhile, with the new funding, Nasre intends to triple the size of the team and invest in R&D to build and scale the company. It will also be developing a marketing and sales team.

The company is now shifting its focus from fundraising to customers and revenue as it aims to drive up adoption by the Python community.

“Our technology can translate simple code into the fast code that the experts will try,” Totoni said. “I joined Intel Labs to work on the problem, and we think we have the first solution that will democratize machine learning for developers and data scientists. Now, they have to hand over Python code to specialists who rewrite it for tools. Bodo is a new type of compiler technology that democratizes AI.”


Apr
06
2021
--

Okta launches a new free developer plan

At its Oktane21 conference, Okta, the popular authentication and identity platform, today announced a new — and free — developer edition that features fewer limitations and support for significantly more monthly active users than its current free plan.

The new ‘Okta Starter Developer Edition,’ as it’s called, allows developers to scale up to 15,000 monthly active users — up from only 1,000 on its existing free plan. In addition, the company is also launching enhanced documentation, a set of sample apps and new SDKs, which now cover languages and frameworks like Go, Java, JavaScript, Python, Vue.js, React Native and Spring Boot.

“Our overall philosophy isn’t, ‘we want to just provide […] a set of authentication and authorization services.’ The way we’re looking at this is, ‘hey, app developer, how do we provide you the foundation you need to get up and running quickly with authorization and authentication as one part of it,’ ” Diya Jolly, Okta’s chief product officer, told me. And she believes that Okta is in a unique position to do so, because it doesn’t only offer tools to manage authorization and access, but also systems for securing microservices and providing applications with access to privileged resources.
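As a rough illustration of that “foundation” idea, the sketch below fetches the standard OpenID Connect discovery document that an Okta org exposes for its default authorization server; the domain is a placeholder, and this uses only the generic OIDC discovery endpoint rather than any Okta-specific SDK.

import requests

OKTA_DOMAIN = "https://dev-123456.okta.com"  # placeholder org domain

# Standard OIDC discovery document; the endpoints it lists are what an app
# (or one of the new SDKs) would use for authorization and token requests.
config = requests.get(
    f"{OKTA_DOMAIN}/oauth2/default/.well-known/openid-configuration"
).json()

print(config["authorization_endpoint"])
print(config["token_endpoint"])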

Image Credits: Okta

It’s also worth noting that, while the deal hasn’t closed yet, Okta’s intent to acquire Auth0 significantly extends its developer strategy, given Auth0’s developer-first approach.

As for the expanded free account, Jolly noted that the company found that developers wanted to be able to access more of the service’s features during their prototyping phases. That means the new free Developer Edition comes with support for multi-factor authentication, machine-to-machine tokens and B2B integrations, for example, in addition to expanded support for integrations into toolchains. As is so often the case with enterprise tools, the free edition doesn’t come with the usual enterprise support options and has lower rate limits than the paid plans.

Still, and Jolly acknowledged this, a small to medium-sized business may be able to build applications and take them into production based on this new free plan.

“15K [monthly active users] is a lot, but if you look at our customer base, it’s about the right amount for the smaller business applications, the real SMBs, and that was the goal. In a developer motion, you want people to try out things and then upgrade. I think that’s the key. No developer is going to come and build with you if you don’t have a free offering that they can tinker around and play with.”

Image Credits: Okta

She noted that the company has spent a lot of time thinking about how to support developers through the application development lifecycle overall. That includes better CLI tools for developers who would rather bypass Okta’s web-based console, for example, and additional integrations with tools like Terraform, Kong and Heroku. “Today, [developers] have to stitch together identity and Okta into those experiences — or they use some other identity — we’ve pre-stitched all of this for them,” Jolly said.

The new Okta Starter Developer Edition, as well as the new documentation, sample applications and integrations, are now available at developer.okta.com.

Dec
02
2020
--

Fylamynt raises $6.5M for its cloud workflow automation platform

Fylamynt, a new service that helps businesses automate their cloud workflows, today announced both the official launch of its platform as well as a $6.5 million seed round. The funding round was led by Google’s AI-focused Gradient Ventures fund. Mango Capital and Point72 Ventures also participated.

At first glance, the idea behind Fylamynt may sound familiar. Workflow automation has become a pretty competitive space, after all, and the service helps developers connect their various cloud tools to create repeatable workflows. We’re not talking about your standard IFTTT- or Zapier-like integrations between SaaS products, though. The focus of Fylamynt is squarely on building infrastructure workflows. While that may sound familiar, too, with tools like Ansible and Terraform automating a lot of that already, Fylamynt sits on top of those and integrates with them.

Image Credits: Fylamynt

“Some time ago, we used to do Bash and scripting — and then [ … ] came Chef and Puppet in 2006, 2007. SaltStack, as well. Then Terraform and Ansible,” Fylamynt co-founder and CEO Pradeep Padala told me. “They have all done an extremely good job of making it easier to simplify infrastructure operations so you don’t have to write low-level code. You can write a slightly higher-level language. We are not replacing that. What we are doing is connecting that code.”

So if you have a Terraform template, an Ansible playbook and maybe a Python script, you can now use Fylamynt to connect those. In the end, Fylamynt becomes the orchestration engine to run all of your infrastructure code — and then allows you to connect all of that to the likes of Datadog, Splunk, PagerDuty, Slack and ServiceNow.

Image Credits: Fylamynt

The service currently connects to Terraform, Ansible, Datadog, Jira, Slack, Instance, CloudWatch, CloudFormation and your Kubernetes clusters. The company notes that some of the standard use cases for its service are automated remediation, governance and compliance, as well as cost and performance management.

The company is already working with a number of design partners, including Snowflake.

Fylamynt CEO Padala has quite a bit of experience in the infrastructure space. He co-founded ContainerX, an early container-management platform, which later sold to Cisco. Before starting ContainerX, he was at VMware and DOCOMO Labs. His co-founders, VP of Engineering Xiaoyun Zhu and CTO David Lee, also have deep expertise in building out cloud infrastructure and operating it.

“If you look at any company — any company building a product — let’s say a SaaS product, and they want to run their operations, infrastructure operations very efficiently,” Padala said. “But there are always challenges. You need a lot of people, it takes time. So what is the bottleneck? If you ask that question and dig deeper, you’ll find that there is one bottleneck for automation: that’s code. Someone has to write code to automate. Everything revolves around that.”

Fylamynt aims to take the effort out of that by allowing developers to either write Python and JSON to automate their workflows (think “infrastructure as code” but for workflows) or to use Fylamynt’s visual no-code drag-and-drop tool. As Padala noted, this gives developers a lot of flexibility in how they want to use the service. If you never want to see the Fylamynt UI, you can go about your merry coding ways, but chances are the UI will allow you to get everything done as well.
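As a purely hypothetical illustration of the “infrastructure as code, but for workflows” idea — this is not Fylamynt’s actual schema or API — a remediation workflow of the kind described above might be expressed as data along these lines:

# Hypothetical workflow definition; the step types, field names and schema are
# invented for illustration and are not Fylamynt's actual format.
workflow = {
    "name": "restart-unhealthy-service",
    "trigger": {"source": "datadog", "monitor": "high-error-rate"},
    "steps": [
        {"type": "terraform", "action": "plan", "workspace": "prod"},
        {"type": "ansible", "playbook": "restart_service.yml"},
        {"type": "slack", "channel": "#ops", "message": "Remediation workflow finished"},
    ],
}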

One area the team is currently focusing on — and will use the new funding for — is building out its analytics capabilities that can help developers debug their workflows. The service already provides log and audit trails, but the plan is to expand its AI capabilities to also recommend the right workflows based on the alerts you are getting.

“The eventual goal is to help people automate any service and connect any code. That’s the holy grail. And AI is an enabler in that,” Padala said.

Gradient Ventures partner Muzzammil “MZ” Zaveri echoed this. “Fylamynt is at the intersection of applied AI and workflow automation,” he said. “We’re excited to support the Fylamynt team in this uniquely positioned product with a deep bench of integrations and a nonprescriptive builder approach. The vision of automating every part of a cloud workflow is just the beginning.”

The team, which now includes about 20 employees, plans to use the new round of funding, which closed in September, to focus on its R&D, build out its product and expand its go-to-market team. On the product side, that specifically means building more connectors.

The company offers both a free plan and enterprise pricing, and its platform is now generally available.

Nov
12
2020
--

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And with that, enterprises can now easily connect their business intelligence tools like Tableau and Microsoft’s Power BI to these data repositories as well.

SQL Analytics will be available in public preview on November 18.

In many ways, SQL Analytics is the product Databricks has long been looking to build and that brings its concept of a “lake house” to life. It combines the performance of a data warehouse, where you store data after it has already been transformed and cleaned, with a data lake, where you store all of your data in its raw form. The data in the data lake, a concept that Databricks’ co-founder and CEO Ali Ghodsi has long championed, is typically only transformed when it gets used. That makes data lakes cheaper, but also a bit harder to handle for users.

Image Credits: Databricks

“We’ve been saying Unified Data Analytics, which means unify the data with the analytics. So data processing and analytics, those two should be merged. But no one picked that up,” Ghodsi told me. But “lake house” caught on as a term.

“Databricks has always offered data science, machine learning. We’ve talked about that for years. And with Spark, we provide the data processing capability. You can do [extract, transform, load]. That has always been possible. SQL Analytics enables you to now do the data warehousing workloads directly, and concretely, the business intelligence and reporting workloads, directly on the data lake.”

The general idea here is that with just one copy of the data, you can enable both traditional data analyst use cases (think BI) and the data science workloads (think AI) Databricks was already known for. Ideally, that makes both use cases cheaper and simpler.
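To make the analyst-facing side concrete, here is the kind of standard SQL query a BI tool or analyst could run directly against a table in the data lake; the table and column names are invented for illustration.

-- Illustrative only: table and column names are made up.
SELECT region,
       SUM(order_total) AS revenue
FROM sales_orders
WHERE order_date >= '2020-01-01'
GROUP BY region
ORDER BY revenue DESC;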

The service sits on top of an optimized version of Databricks’ open-source Delta Lake storage layer, which enables it to complete queries quickly. In addition, SQL Analytics provides auto-scaling endpoints to keep query latency consistent, even under high loads.

While data analysts can query these data sets directly using standard SQL, the company also built a set of connectors to BI tools. Its BI partners include Tableau, Qlik, Looker and ThoughtSpot; its data ingest partners include Fivetran, Fishtown Analytics, Talend and Matillion.

Image Credits: Databricks

“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable,” said Francois Ajenstat, chief product officer at Tableau. “As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of SQL Analytics delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need.”

In a demo, Ghodsi showed me what the new SQL Analytics workspace looks like. It’s essentially a stripped-down version of the standard code-heavy experience with which Databricks users are familiar. Unsurprisingly, SQL Analytics provides a more graphical experience that focuses more on visualizations and not Python code.

While there are already some data analysts on the Databricks platform, this obviously opens up a large new market for the company — something that would surely bolster its plans for an IPO next year.

Oct
14
2020
--

Dataloop raises $11M Series A round for its AI data management platform

Dataloop, a Tel Aviv-based startup that specializes in helping businesses manage the entire data life cycle for their AI projects, including helping them annotate their data sets, today announced that it has now raised a total of $16 million. This includes a $5 million seed round that was previously unreported, as well as an $11 million Series A round that recently closed.

The Series A round was led by Amiti Ventures, with participation from F2 Venture Capital, crowdfunding platform OurCrowd, NextLeap Ventures and SeedIL Ventures.

“Many organizations continue to struggle with moving their AI and ML projects into production as a result of data labeling limitations and a lack of real-time validation that can only be achieved with human input into the system,” said Dataloop CEO Eran Shlomo. “With this investment, we are committed, along with our partners, to overcoming these roadblocks and providing next generation data management tools that will transform the AI industry and meet the rising demand for innovation in global markets.”

Image Credits: Dataloop

For the most part, Dataloop specializes in helping businesses manage and annotate their visual data. It’s agnostic to the vertical its customers are in, but we’re talking about anything from robotics and drones to retail and autonomous driving.

The platform itself centers around the “humans in the loop” model that complements the automated systems, with the ability for humans to train and correct the model as needed. It combines the hosted annotation platform with a Python SDK and REST API for developers, as well as a serverless Functions-as-a-Service environment that runs on top of a Kubernetes cluster for automating dataflows.

Image Credits: Dataloop

The company was founded in 2017. It’ll use the new funding to grow its presence in the U.S. and European markets, something that’s pretty standard for Israeli startups, and build out its engineering team as well.

Apr
21
2020
--

Pulumi brings support for more languages to its infrastructure-as-code platform

Seattle-based Pulumi has quickly made a name for itself as a modern platform that lets developers specify their infrastructure through writing code in their preferred programming language — and not YAML. With the launch of Pulumi 2.0, those languages now include JavaScript, TypeScript, Go and .NET, in addition to its original support for Python. It’s also now extending its reach beyond its core infrastructure features to include deeper support for policy enforcement, testing and more.
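For readers unfamiliar with the approach, here is a minimal sketch of a Pulumi program in Python, assuming the pulumi and pulumi_aws packages; the resource name is illustrative.

import pulumi
import pulumi_aws as aws

# Declare an S3 bucket in ordinary Python; running `pulumi up` computes the
# difference against the current state and provisions whatever is missing.
bucket = aws.s3.Bucket("app-assets")

# Expose the bucket name as a stack output.
pulumi.export("bucket_name", bucket.id)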

As the company also today announced, it now has over 10,000 users and more than 100 paying customers. With that, it’s seeing a 10x increase in its year-over-year annual run rate, though without knowing the exact numbers, it’s obviously hard to know what exactly to make of that number. Current customers include the likes of Cockroach Labs, Mercedes-Benz and Tableau.

When the company first launched, its messaging was very much around containers and serverless. But as Pulumi founder and CEO Joe Duffy told me, today the company is often directly engaging with infrastructure teams that are building the platforms for the engineers in their respective companies.

As for Pulumi 2.0, Duffy says that “this is really taking the original Pulumi vision of infrastructure as code — using your favorite language — and augmenting it with what we’re calling superpowers.” That means expanding the product’s overall capabilities from infrastructure provisioning into adjacent problem spaces, including continuous delivery and policy-as-code. This extends the original Pulumi vision beyond just infrastructure and lets developers encapsulate their various infrastructure policies as code, as well.

Another area is testing. Because Pulumi allows developers to use “real” programming languages, they can also use the same testing techniques they are used to from the application development world to test the code they use to build their underlying infrastructure and catch mistakes before they go into production. And with all of that, developers can also use all of the usual tools they use to write code for defining the infrastructure that this code will then run on.

“The underlying philosophy is taking our heritage of using the best of what we know and love about programming languages — and really applying that to the entire spectrum of challenges people face when it comes to cloud infrastructure, from development to infrastructure teams to security engineers, really helping the entire organization be more productive working together,” said Duffy. “I think that’s the key: moving from infrastructure provisioning to something that works for the whole organization.”

Duffy also highlighted that many of the company’s larger enterprise users are relying on Pulumi to encode their own internal architectures as code and then roll them out across the company.

“We still embrace what makes each of the clouds special. AWS, Azure, Google Cloud and Kubernetes,” Duffy said. “We’re not trying to be a PaaS that abstracts over all. We’re just helping to be the consistent workflow across the entire team to help people adopt the modern approaches.”

Apr
18
2019
--

MySQL-python: Adding caching_sha2_password and TLSv1.2 Support


Python 2 reaches EOL on 2020-01-01, and one of its commonly used third-party packages is MySQL-python. If you have not yet migrated away from both of these — MySQL-python does not support Python 3 — then you may have come across some issues when using more recent versions of MySQL and enforcing a secured installation. This post looks at two specific issues you may hit (caching_sha2_password in MySQL 8.0, and TLSv1.2 with MySQL >=5.7.10 when using OpenSSL) that will prevent you from connecting to a MySQL database, and shows how to work around them to buy a little extra time to migrate your code.

For CentOS 7, MySQL-python is built against the client library provided by the mysql-devel package, which does not support some of the newer features, such as caching_sha2_password (the new default authentication plugin as of MySQL 8.0.4) and TLSv1.2. This means that you cannot take advantage of these security features and therefore leave yourself with an increased level of vulnerability.

So, what can we do about this? Thankfully, it is very simple to resolve, so long as you don’t mind getting your hands dirty!

Help! MySQL-python won’t connect to my MySQL 8.0 instance

First of all, let’s take a look at the issues that the latest version of MySQL-python (MySQL-python-1.2.5-1.el7.rpm for CentOS 7) has when connecting to a MySQL 8.0 instance. We will use a Docker container to help test this out by installing MySQL-python along with the Percona Server 8.0 client so that we can compare the two.

=> docker run --rm -it --network host --name rpm-build centos:latest
# yum install -y -q MySQL-python
warning: /var/cache/yum/x86_64/7/base/packages/MySQL-python-1.2.5-1.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
Public key for MySQL-python-1.2.5-1.el7.x86_64.rpm is not installed
Importing GPG key 0xF4A80EB5:
 Userid     : "CentOS-7 Key (CentOS 7 Official Signing Key) <security@centos.org>"
 Fingerprint: 6341 ab27 53d7 8a78 a7c2 7bb1 24c6 a8a7 f4a8 0eb5
 Package    : centos-release-7-6.1810.2.el7.centos.x86_64 (@CentOS)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
# yum install -q -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm; percona-release setup ps80
* Enabling the Percona Original repository
<*> All done!
The percona-release package now contains a percona-release script that can enable additional repositories for our newer products.
For example, to enable the Percona Server 8.0 repository use:
  percona-release setup ps80
Note: To avoid conflicts with older product versions, the percona-release setup command may disable our original repository for some products.
For more information, please visit:
  https://www.percona.com/doc/percona-repo-config/percona-release.html
* Disabling all Percona Repositories
* Enabling the Percona Server 8.0 repository
* Enabling the Percona Tools repository
<*> All done!
# yum install -q -y percona-server-client
warning: /var/cache/yum/x86_64/7/ps-80-release-x86_64/packages/percona-server-shared-8.0.15-5.1.el7.x86_64.rpm: Header V4 RSA/SHA256 Signature, key ID 8507efa5: NOKEY
Public key for percona-server-shared-8.0.15-5.1.el7.x86_64.rpm is not installed
Importing GPG key 0x8507EFA5:
 Userid     : "Percona MySQL Development Team (Packaging key) <mysql-dev@percona.com>"
 Fingerprint: 4d1b b29d 63d9 8e42 2b21 13b1 9334 a25f 8507 efa5
 Package    : percona-release-1.0-11.noarch (installed)
 From       : /etc/pki/rpm-gpg/PERCONA-PACKAGING-KEY

Next we will create a client config to save some typing later on and then check that we can connect to the MySQL instance that is already running; if you don’t have MySQL running on your local machine then you can install it in the container.

# cat <<EOS > /root/.my.cnf
> [client]
> user = testme
> password = my-secret-pw
> host = 127.0.0.1
> protocol = tcp
> ssl-mode = REQUIRED
> EOS
mysql> /* hide passwords */ PAGER sed "s/AS '.*' REQUIRE/AS 'xxx' REQUIRE/" ;
PAGER set to 'sed "s/AS '.*' REQUIRE/AS 'xxx' REQUIRE/"'
mysql> SHOW CREATE USER CURRENT_USER();
CREATE USER 'testme'@'%' IDENTIFIED WITH 'caching_sha2_password' AS 'xxx' REQUIRE SSL PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK PASSWORD HISTORY DEFAULT PASSWORD REUSE INTERVAL DEFAULT
mysql> SELECT @@version, @@version_comment, ssl_version, ssl_cipher, user FROM sys.session_ssl_status INNER JOIN sys.processlist ON thread_id = thd_id AND conn_id = CONNECTION_ID();
8.0.12-1        Percona Server (GPL), Release '1', Revision 'b072566'   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256     testme@localhost
All looks good here, so now we will check that MySQLdb can also connect:
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2059, "Authentication plugin 'caching_sha2_password' cannot be loaded: /usr/lib64/mysql/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory")

Changing the user’s authentication plugin

We have hit the first issue, because MySQL 8.0 introduced the caching_sha2_password plugin and made it the default authentication plugin, so we can’t connect at all. However, we can gain access by changing the user’s authentication plugin back to the old default and then testing again.

mysql> ALTER USER 'testme'@'%' IDENTIFIED WITH mysql_native_password BY "my-secret-pw";
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1045, "Access denied for user 'testme'@'localhost' (using password: YES)")

Configuring SSL options

We still can’t connect, so what can be wrong? Well, we haven’t added any SSL details to the config other than specifying that we need to use SSL, so we will add the necessary options and make sure that we can connect.

# cat <<EOS >> /root/.my.cnf
> ssl-ca = /root/certs/ca.pem
> ssl-cert = /root/certs/client-cert.pem
> ssl-key = /root/certs/client-key.pem
> EOS
# mysql -Bse "select 1"
1
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1', 'ECDHE-RSA-AES256-SHA', 'testme@localhost'),)

Forcing TLSv1.2 or later to further secure connections

That looks much better now, but if you look closely then you will notice that the MySQLdb connection is using TLSv1, which will make your security team either sad, angry or perhaps both as the connection can be downgraded! We want to secure the installation and keep data over the wire safe from prying eyes, so we will remove TLSv1 and also TLSv1.1 from the list of versions accepted by MySQL and leave only TLSv1.2 (TLSv1.3 is not supported with our current MySQL 8.0 and OpenSSL versions). Any guesses what will happen now?

mysql> SET PERSIST_ONLY tls_version = "TLSv1.2"; RESTART;
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2026, 'SSL connection error: error:00000001:lib(0):func(0):reason(1)')

MySQLdb can no longer connect to MySQL, sadly with a rather cryptic message! However, as we made the change that triggered this we don’t need to decipher it and can now start looking to add TLSv1.2 support to MySQLdb, so roll up your sleeves!

Solution: Build a new RPM

In order to build a new RPM we will need to do a little extra work in the container first of all, but it doesn’t take long and is pretty simple to do. We are going to install the necessary packages to create a basic build environment and then rebuild the MySQL-python RPM from its current source RPM.

## Download packages
# yum install -q -y rpm-build yum-utils gnupg2 rsync deltarpm gcc
Package yum-utils-1.1.31-50.el7.noarch already installed and latest version
Package gnupg2-2.0.22-5.el7_5.x86_64 already installed and latest version
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
## Create build directory tree
# install -d /usr/local/builds/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
## Configure the RPM macro
# echo "%_topdir /usr/local/builds/rpmbuild" > ~/.rpmmacros
## Switch to a temporary directory to ease cleanup
# cd "$(mktemp -d)"; pwd
/tmp/tmp.xqxdBeLkXs
## Download the source RPM
# yumdownloader --source -q MySQL-python
Enabling updates-source repository
Enabling base-source repository
Enabling extras-source repository
## Extract the source RPM
# rpm2cpio "$(ls -1 MySQL-python*src.rpm)" | cpio -idmv
MySQL-python-1.2.5.zip
MySQL-python-no-openssl.patch
MySQL-python.spec
234 blocks

We are now ready to start making some changes to the source code and build specifications, but first of all we need to take note of another change that occurred in MySQL 8.0. Older code will reference my_config.h, which has since been removed and is no longer required for building; the fix for this will be shown below.

## Adjust the spec file to use percona-server-devel and allow a build number
# sed -i "s/mysql-devel/percona-server-devel/g; s/Release: 1/Release: %{buildno}/" MySQL-python.spec
## Store the ZIP filename and extract
# SRC_ZIP="$(ls -1 MySQL-python*.zip)"; unzip "${SRC_ZIP}"
## Store the source directory and remove the include of my_config.h
# SRC_DIR=$(find . -maxdepth 1 -type d -name "MySQL-python*"); sed -i 's/#include "my_config.h"/#define NO_MY_CONFIG/' "${SRC_DIR}/_mysql.c"
## View our _mysql.c change
# fgrep -m1 -B3 -A1 -n NO_MY_CONFIG "${SRC_DIR}/_mysql.c"
41-#if defined(MS_WINDOWS)
42-#include
43-#else
44:#define NO_MY_CONFIG
45-#endif
## Update the source
# zip -uv "${SRC_ZIP}" "${SRC_DIR}/_mysql.c"
updating: MySQL-python-1.2.5/_mysql.c   (in=84707) (out=17314) (deflated 80%)
total bytes=330794, compressed=99180 -> 70% savings

Now that we have adjusted the source code and specification we can start work on the new package so that we can once again connect to MySQL.

## Sync the source to the build tree
# rsync -ac ./ /usr/local/builds/rpmbuild/SOURCES/
## Copy the new specification file to the build tree
# cp -a MySQL-python.spec /usr/local/builds/rpmbuild/SPECS/
## Build a new source RPM with our updates
# rpmbuild --define "buildno 2" -bs /usr/local/builds/rpmbuild/SPECS/MySQL-python.spec
Wrote: /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm
## Use the new source RPM to install any missing dependencies
# yum-builddep -q -y /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm &>>debug.log
## Build a new RPM
# rpmbuild --define "buildno 2" --rebuild /usr/local/builds/rpmbuild/SRPMS/MySQL-python-1.2.5-2.el7.src.rpm &>>debug.log
# tail -n1 debug.log
+ exit 0

All went well and so we can now install the new package and see if it worked.

## Install the local RPM
# yum localinstall -q -y /usr/local/builds/rpmbuild/RPMS/x86_64/MySQL-python-1.2.5-2.el7.x86_64.rpm
## Check to see which libmysqlclient is used
# ldd /usr/lib64/python2.7/site-packages/_mysql.so | fgrep libmysqlclient
        libmysqlclient.so.21 => /usr/lib64/mysql/libmysqlclient.so.21 (0x00007f61660be000)
## Test the connection
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1.2', 'ECDHE-RSA-AES128-GCM-SHA256', 'testme@localhost'),)

Almost there… now force user authentication with the caching_sha2_password plugin

Hurrah! We can once again connect to the database and this time we are using TLSv1.2 and have thus increased our security. There is one thing left to do now though. Earlier on we needed to change the authentication plugin, so we will now change it back for extra security and see if all is still well.

mysql> ALTER USER 'testme'@'%' IDENTIFIED WITH caching_sha2_password BY "my-secret-pw";
# /usr/bin/python -c "import MySQLdb; dbc=MySQLdb.connect(read_default_file='~/.my.cnf').cursor(); dbc.execute('select @@version, @@version_comment, ssl_version, ssl_cipher, user from sys.session_ssl_status inner join sys.processlist on thread_id = thd_id and conn_id = connection_id()'); print dbc.fetchall()"
(('8.0.12-1', "Percona Server (GPL), Release '1', Revision 'b072566'", 'TLSv1.2', 'ECDHE-RSA-AES128-GCM-SHA256', 'testme@localhost'),)

Mission successful! Hopefully this will help if you find yourself in need of a little extra time to migrate away from MySQLdb.


Image modified from photo by David Clode on Unsplash

Feb
11
2019
--

Google Docs gets an API for task automation

Google today announced the general availability of a new API for Google Docs that will allow developers to automate many of the tasks that users typically do manually in the company’s online office suite. The API has been in developer preview since last April’s Google Cloud Next 2018 and is now available to all developers.

As Google notes, the REST API was designed to help developers build workflow automation services for their users, build content management services and create documents in bulk. Using the API, developers can also set up processes that manipulate documents after the fact to update them, and the API also features the ability to insert, delete, move, merge and format text, insert inline images and work with lists, among other things.

The canonical use case here is invoicing, where you need to regularly create similar documents with ever-changing order numbers and line items based on information from third-party systems (or maybe even just a Google Sheet). Google also notes that the API’s import/export abilities allow you to use Docs for internal content management systems.
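A hedged sketch of that invoicing pattern using the Docs API’s batchUpdate method via the google-api-python-client library: it assumes OAuth credentials (creds) have already been obtained, and the document ID and the {{...}} template placeholders are invented for illustration.

from googleapiclient.discovery import build

# `creds` are previously obtained OAuth credentials; DOCUMENT_ID points at a
# copy of an invoice template containing {{...}} placeholders.
DOCUMENT_ID = "your-template-copy-id"  # placeholder
service = build("docs", "v1", credentials=creds)

requests = [
    {"replaceAllText": {
        "containsText": {"text": "{{order_number}}", "matchCase": True},
        "replaceText": "INV-2019-0042",
    }},
    {"replaceAllText": {
        "containsText": {"text": "{{customer_name}}", "matchCase": True},
        "replaceText": "Example Corp",
    }},
]

service.documents().batchUpdate(
    documentId=DOCUMENT_ID, body={"requests": requests}
).execute()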

Some of the companies that built solutions based on the new API during the preview period include Zapier, Netflix, Mailchimp and Final Draft. Zapier integrated the Docs API into its own workflow automation tool to help its users create offer letters based on a template, for example, while Netflix used it to build an internal tool that helps its engineers gather data and automate its documentation workflow.


Jun
22
2018
--

Security, privacy experts weigh in on the ICE doxxing

In what appears to be the latest salvo in a new, wired form of protest, developer Sam Lavigne posted code that scrapes LinkedIn to find Immigration and Customs Enforcement employee accounts. His code, which is basically a Python-based tool that scans LinkedIn for keywords, is gone from GitHub and GitLab, and Medium took down his original post. The CSV of the data is still available here and here, and WikiLeaks has posted a mirror.

“I find it helpful to remember that as much as internet companies use data to spy on and exploit their users, we can at times reverse the story, and leverage those very same online platforms as a means to investigate or even undermine entrenched power structures. It’s a strange side effect of our reliance on private companies and semi-public platforms to mediate nearly all aspects of our lives. We don’t necessarily need to wait for the next Snowden-style revelation to scrutinize the powerful — so much is already hiding in plain sight,” said Lavigne.

Doxxing is the process of using publicly available information to target someone online for abuse. Because we can now find out anything on anyone for a few dollars – a search for “background check” brings up dozens of paid services that can get you names and addresses in a second – scraping public data on LinkedIn seems far easier and more innocuous. That doesn’t make it legal.

“Recent efforts to outlaw doxxing at the national level (like the Online Safety Modernization Act of 2017) have stalled in committee, so it’s not strictly illegal,” said James Slaby, Security Expert at Acronis. “But LinkedIn and other social networks usually consider it a violation of their terms of service to scrape their data for personal use. The question of fairness is trickier: doxxing is often justified as a rare tool that the powerless can use against the powerful to call attention to perceived injustices.”

“The problem is that doxxing is a crude tool. The torrent of online ridicule, abuse and threats that can be heaped on doxxed targets by their political or ideological opponents can also rain down on unintended and undeserving targets: family members, friends, people with similar names or appearances,” he said.

The tool itself isn’t to blame. No one would fault a job seeker or salesperson who scraped LinkedIn for targeted employees of a specific company. That said, scraping and publicly shaming employees walks a thin line.

“In my opinion, the professor who developed this scraper tool isn’t breaking the law, as it’s perfectly legal to search the web for publicly available information,” said David Kennedy, CEO of TrustedSec. “This is known in the security space as ‘open source intelligence’ collection, and scrapers are just one way to do it. That said, it is concerning to see ICE agents doxxed in this way. I understand emotions are running high on both sides of this debate, but we don’t want to increase the physical security risks to our law enforcement officers.”

“The decision by Twitter, Github and Medium to block the dissemination of this information and tracking tool makes sense – in fact, law enforcement agents’ personal information is often protected. This isn’t going to go away anytime soon, it’s only going to become more aggressive, particularly as more people grow comfortable with using the darknet and the many available hacking tools for sale in these underground forums. Law enforcement agents need to take note of this, and be much more careful about what (and how often) they post online.”

Ultimately, doxxing is problematic. Because we place our information on public forums, there should be nothing to stop anyone from finding and posting it. However, the expectation that people will use our information for good and not evil is swiftly eroding. Today, as one security researcher, David Kavanaugh, noted, doxxing is becoming dangerous.

“Going after the people on the ground is like shooting the messenger. Decisions are made by leadership and those are the people we should be going after. Doxxing is akin to a personal attack. Change policy, don’t ruin more lives,” he said.

Mar
25
2015
--

Yelp IT! A talk with 3 Yelp MySQL DBAs on Percona Live & more

Founded in 2004 to help people find great local businesses, Yelp has some 135 million monthly unique visitors. With those traffic volumes Yelp’s 300+ engineers are constantly working to keep things moving smoothly – and when you move that fast you learn many things.

Fortunately for the global MySQL community, three Yelp DBAs will be sharing what they’ve learned at the annual Percona Live MySQL Conference and Expo this April 13-16 in Santa Clara, California.

Say “hello” to Susanne Lehmann, Jenni Snyder and Josh Snyder! I chatted with them over email about their presentations, on how MySQL is used at Yelp, and about the shortage of women in MySQL.

***

Tom: Jenni, you and Josh will be co-presenting “Next generation monitoring: moving beyond Nagios” on April 14.

You mentioned that Yelp’s databases scale dynamically, and so does your monitoring of those databases. And to minimize human intervention, you’ve created a Puppet and Sensu monitoring ensemble… because “if it’s not monitored, it’s not in production.” Talk to me more about Yelp’s philosophy of “opt-out monitoring.” What does that entail? How does that help Yelp?

Jenni: Before we moved to Sensu, our Nagios dashboards were a sea of red, muted, acknowledged, or disabled service checks. In fact, we even had a cluster check to make sure that we never accidentally put a host into use that was muted or marked for downtime. It was possible for a well-meaning operator to acknowledge checks on a host and forget about it, and I certainly perpetrated a couple of instances of disks filling up after acknowledging a 3am “warning” page that I’d rather forget about. With Sensu, hosts and services come out of the downtime/acknowledgement state automatically after a number of days, ensuring that we’re kept honest and stay on top of issues that need to be addressed.

Also, monitoring is deployed with a node, not separate monitoring configuration. Outside of a grace period we employ when a host is first provisioned or rebooted, if a host is up, it’s being monitored and alerting. Also, alerting doesn’t always mean paging. We also use IRC and file tickets directly into our tracking system when we don’t need eyes on a problem right away.


Tom: Susanne, in your presentation, titled “insert cassandra into prod where use_case=?;” you’ll discuss the situations you’ve encountered where MySQL just wasn’t the right tool for the job.

What led up to that discovery and how did you come up with finding the right tools (and what were they) to run alongside and support MySQL?

Susanne: Our main force behind exploring other datastores alongside MySQL was that Yelp is growing outside the US market a lot. Therefore we wanted the data to be nearer to the customer and needed multi-master writes.

Also, we saw use cases where our application data was organized very key-value like and not relational, which made them a better fit for a NoSQL solution.

We decided to use Cassandra as a datastore and I plan to go more into detail why during my talk. Now we offer developers more choices on how to store our application data, but we also believe in the “right tool for the job” philosophy and might add more solutions to the mix in the future.


Tom: Jenni, you’ll also be presenting “Schema changes multiple times a day? OK!” I know that you and your fellow MySQL DBAs are always improving and also finding better ways of supporting new and existing features for Yelp users like me. Delivering on such a scale must entail some unique processes and tools. Does this involve a particular mindset among your fellow DBAs? Also, what are some of those key tools – and processes and how are they used?

Jenni: Yelp prizes the productivity of our developers and our ability to iterate and develop new features quickly. In order to do that, we need to be able to not only create new database tables, but also modify existing ones, many of which are larger than MySQL can alter without causing considerable replication delay. The first step is to foster a culture of automated testing, monitoring, code reviews, and partnership between developers and DBAs to ensure that we can quickly & safely roll out schema changes. In my talk, I’ll be describing tools that we’ve talked about before, like our Gross Query Checker, as well as the way the DBA team works with developers while still getting the rest of our work done. The second, easy part is using a tool like pt-online-schema-change to run schema changes online without causing replication delay or degrading performance :)
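For readers unfamiliar with the tool Jenni mentions, a typical pt-online-schema-change invocation looks roughly like the line below; the database, table and column names are made up, and flags beyond --alter and --execute are optional.

pt-online-schema-change --alter "ADD COLUMN review_count INT NOT NULL DEFAULT 0" \
  --max-lag 10 D=yelp,t=businesses --execute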


Tom: Josh, you’ll also be speaking on “Bootstrapping databases in a single command: elastic provisioning for the win.” What is “elastic provisioning” and how are you using it for Yelp’s tooling?

Josh: When I say that we use elastic provisioning, I mean that we can reliably and consistently build a database server from scratch, with minimal human involvement. The goal is to encompass every aspect of the provisioning task, including configuration, monitoring, and even load balancing, in a single thoroughly automated process. With this process in place, we’ve found ourselves able to quickly allocate and reallocate resources, both in our datacenters and in the cloud. Our tools for implementing the above goals give us greater confidence in our infrastructure, while avoiding single points of failure and achieving the maximum possible level of performance. We had a lot of fun building this system, and we think that many of the components involved are relevant to others in the field.


Tom: Susanne and Jenni, last year at Percona Live there was a BoF session titled “MySQL and Women (or where are all the women?).” The idea was to discuss why there are “just not enough women working on the technology side of tech.” In a nutshell, the conversation focused on why there are not more women in MySQL and why so relatively few attend MySQL conferences like Percona Live.

The relative scarcity of women in technical roles was also the subject of an article published in the August 2014 issue of Forbes, citing a recent industry report.

Why, in your (respective) views, do you (or don’t) think that there are so few women in MySQL? And how can this trend be reversed?

Susanne: I think there are few women in MySQL and the reasons are manifold. Of course there is the pipeline problem. Then there is the problem, widely discussed right now, that women who are entering STEM jobs are less likely staying in there. These are reasons not specific for MySQL jobs, but rather for STEM in general. What is more specific for database/MySQL jobs is, in my opinion, that often times DBAs need to be on call, they need to stay in the office if things go sideways. Database problems tend often to be problems that can’t wait till the next morning. That makes it more demanding when you have a family for example (which is true for men as well of course, but seems still to be more of a problem for women).

As for how to reverse the trend, I liked this Guardian article because it covers a lot of important points. There is no easy solution.

I like that more industry leaders and technology companies are discussing what they can do to improve diversity these days. In general, it really helps to have a great professional (female) support system. At Yelp, we have AWE, the Awesome Women in Engineering group, in which Jenni and I are both active. We participate in welcoming women to Yelp engineering, speaking at external events and workshops to help other women present their work, mentoring, and a book club.

Jenni: I’m sorry that I missed Percona Live and this BoF last year; I was out on maternity leave. I believe that tech/startup culture is a huge reason that fewer women are entering and staying these days, but a quick web search will lead you to any number of articles debating the subject. I run into quite a few women working with MySQL; its large, open community and generally collaborative and supportive nature are very welcoming. As the article you linked to suggests, MySQL has a broad audience. It’s easy to get started with and pull into any project, and as a result, most software professionals have worked with it at some time or another.

On another note, I’m happy to see that Percona Live has a Code of Conduct. I hope that Percona and/or MySQL will consider adopting a Community Code of Conduct like Python, Puppet, and Ubuntu. Doing so raises the bar for all participants, without hampering collaboration and creativity!

* * *

Thanks very much, Susanne, Jenni and Josh! I look forward to seeing you next month at the conference. And readers, if you’d like to attend Percona Live, use the promo code Yelp15 for 15% off! Just enter that during registration. If you’re already attending, be sure to tweet about your favorite sessions using the hashtag #PerconaLive. And if you need to find a great place to eat while attending Percona Live, click here for excellent Yelp recommendations. :)

The post Yelp IT! A talk with 3 Yelp MySQL DBAs on Percona Live & more appeared first on MySQL Performance Blog.
