Jan
31
2017
--

Tact brings intelligent sales assistant to new Slack enterprise edition

Tact, the AI-driven sales tool that uses software smarts to simplify sales tasks, today announced integration with the newly announced Slack enterprise product, Slack Enterprise Grid. Tact, which currently has a mobile app and Amazon Echo integration (along with in-car integration in private beta), sees messaging tools like Slack as a logical extension of the platform. It can help automate… Read More

Jan
31
2017
--

Slack takes aim at the corporate sector with Enterprise Grid, adds bots from SAP

Slack, the business app that lets teams of users communicate, share files from other services, and work on them with each other, has taken off like wildfire since launching three years ago, with 5 million daily users, 1.5 million of them paying today. Now, Slack is embarking on the next step in its ambition to be the go-to platform for all workplace collaboration, no matter how big the… Read More

Jan
31
2017
--

Google will soon open-source Google Earth Enterprise

Google Earth Enterprise, which originally launched over ten years ago, was Google’s tool for businesses that wanted to build and host private versions of Google Earth and Google Maps for their internal geospatial applications. In 2015, the company announced that it would shut the service down in March 2017, but in what is becoming a pretty standard move for deprecated products, Google… Read More

Jan
31
2017
--

Docker Security Vulnerability CVE-2016-9962

Docker 1.12.6 was released to address CVE-2016-9962, a serious vulnerability in RunC.

Quoting the coreos page (linked above):

“RunC allowed additional container processes via runc exec to be ptraced by the pid 1 of the container. This allows the main processes of the container, if running as root, to gain access to file-descriptors of these new processes during the initialization and can lead to container escapes or modification of runC state before the process is fully placed inside the container.”

In short, IF processes run as root inside a container, they could potentially break out of the container and gain access to the host.

My recommendation at this time is to apply the same basic security tenets to containers as you would (I hope) to VM and bare-metal installs. In other words, follow the Principle of Least Privilege as a best practice, and do not run as root for convenience’s sake.

Before this vulnerability was disclosed, we had already made changes to PMM (prior to version 1.0.4) to reduce the number of processes within the container that run as root. As a result, only the processes that require root run as root. All other processes run as a lower-privilege user.

Check here for documentation on PMM, and use the JIRA project to raise bugs (JIRA requires registration).

To comment on running a database within Docker, I’ve reviewed the following images:

  • percona-server image: I have verified it does not run as root, and runs as a mysql user (for 5.7.16 at least)
  • percona-server-mongodb: I have worked with our teams internally and can confirm that the latest image no longer runs as root (you will need to fetch the latest image via docker pull, however, to see this change)
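
One way to spot-check an image yourself is to look at the user recorded in its configuration, for example the value printed by `docker inspect --format '{{.Config.User}}' <image>`. The sketch below only interprets that string; it is an illustration, not the process used for the review above. The helper name `runs_as_root` is hypothetical, and it assumes that a named user other than `root` is unprivileged, which an image could violate by mapping another name to UID 0.

```python
def runs_as_root(user_field: str) -> bool:
    """Return True if an image's configured user resolves to root.

    user_field is the raw "User" string from the image config, in one of
    the forms Docker accepts: "", "user", "uid", "user:group", "uid:gid".
    """
    # Only the user part matters here; drop an optional ":group" suffix.
    user = user_field.strip().split(":", 1)[0]
    # An empty value means no USER was set, so the container runs as root.
    return user in ("", "0", "root")


if __name__ == "__main__":
    for value in ("", "root", "0:0", "mysql", "1001", "mysql:mysql"):
        print(f"{value!r:15} -> root={runs_as_root(value)}")
```

This checks only the image default. A container started with an explicit `--user` option overrides it, so a running container should be checked separately.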

Please comment below with any questions.

Jan
31
2017
--

Zenefits launches new tools for employee compliance and looks to simplify pricing

Zenefits today said it would be introducing new pricing tiers in an effort to simplify them, as well as adding new tools for HR managers to ensure that employees are getting their documents in and that they are compliant with regulations. The new tools, and pricing, are another way that Zenefits is trying to differentiate itself as an all-in-one platform for managing employee records… Read More

Jan
31
2017
--

Google gives G Suite admins more control over security

Over three million businesses now pay for Google’s G Suite productivity apps. Today, the company is launching a number of new security features that aim to keep these businesses’ data safe on its platform. Admins can now, for example, force their users to use physical security keys from companies like Yubico to access their data. They will also be able to manage the deployment of… Read More

Jan
30
2017
--

MySQL Sharding Models for SaaS Applications

In this blog post, I’ll discuss MySQL sharding models, and how they apply to SaaS application environments.

MySQL is one of the most popular database technologies used to build many modern SaaS applications, ranging from simple productivity tools to business-critical applications for the financial and healthcare industries.

Pretty much any large-scale SaaS application powered by MySQL uses sharding to scale. In this blog post, we will discuss sharding choices as they apply to these kinds of applications.

In MySQL, unlike in some more modern technologies such as MongoDB, there is no standard sharding implementation that the vast majority of applications use. In fact, if anything, “no standard” is the standard. The common practice is to roll your own sharding framework, as famous MySQL deployments such as Facebook and Twitter have done. MySQL Cluster, the MySQL software that has built-in automatic sharding functionality, is rarely deployed (for a variety of reasons). MySQL Fabric, which has been the official sharding framework, has no traction either.

When sharding today, you have a choice of rolling your own system from scratch, using a comprehensive sharding platform such as Vitess, or using a proxy solution to assist you with sharding. For proxy solutions, MySQL Router is the official solution. But in reality, third-party solutions such as open source ProxySQL, commercial ScaleArc and semi-commercial (BSL) MariaDB MaxScale are widely used. Keep in mind, however, that traffic routing is only one of the problems that exist in large-scale sharding implementations.

Beneath all these “front end” choices for sharding in the application database connection framework or database proxy, there are some lower-level decisions that you’ve got to make, namely how your data is going to be laid out and organized on the MySQL nodes.

When it comes to SaaS applications, at least one answer is simple. It typically makes sense to shard your data by “customer” or “organization” using some sort of mapping tables. In the vast majority of cases, a single node (or replicated cluster) should be powerful enough to handle all the data and load coming from each customer.
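
As a rough illustration of the mapping-table idea, the sketch below keeps a customer-to-shard lookup in memory. Everything in it is an assumption made for the example: the `ShardMap` and `ShardLocation` names, the host names, and the schema names are hypothetical, and a real deployment would typically persist this mapping in a small, highly available database rather than in a Python dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLocation:
    host: str    # MySQL node (or replicated cluster endpoint)
    schema: str  # schema (database) on that node


class ShardMap:
    """In-memory stand-in for a customer -> shard mapping table."""

    def __init__(self) -> None:
        self._by_customer = {}

    def assign(self, customer_id: int, host: str, schema: str) -> None:
        """Record (or change) where a customer's data lives."""
        self._by_customer[customer_id] = ShardLocation(host, schema)

    def locate(self, customer_id: int) -> ShardLocation:
        """Look up the node and schema that hold a customer's data."""
        try:
            return self._by_customer[customer_id]
        except KeyError:
            raise LookupError(f"customer {customer_id} has no shard assigned") from None


shards = ShardMap()
shards.assign(1, "db1.example.internal", "shard_001")  # two small customers
shards.assign(2, "db1.example.internal", "shard_001")  # share one schema
shards.assign(3, "db7.example.internal", "customer3")  # a key customer gets its own
print(shards.locate(3))
```

Because the mapping is per customer rather than a hash of the customer ID, an individual customer can later be moved to a different schema or node by updating a single entry.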

What Should I Ask Myself Now?

The next set of questions you should ask yourself are around your SaaS applications:

  • How much revenue per customer are you generating?
  • Do your customers (or regulations) require data segregation?
  • Are all the customers about the same, or are there outliers?
  • Are all your customers running the same database schema?

I address the answers in the sections below.

How Much Revenue?

How much revenue per customer you’re generating is an important number. It defines how much infrastructure cost per customer you can afford. In the case of “freemium” models, and customers generating less than $1 a month on average, you might need to ensure low overhead per customer (even if you have to compromise on customer isolation).

Typically, with low-revenue customers, you have to co-locate the data inside the same MySQL instance (potentially even in the same tables). In the case of high-revenue customers, isolation in separate MySQL instances (or even containers or virtualized OS instances) might be possible.

Data Segregation?

Isolation is another important area of consideration. Some enterprise customers might require that their data is physically separate from others. There could also be government regulations in play that require customer data to be stored in a specific physical location. If this is the case, you’re looking at completely dedicated customer environments. Or at the very least, separate database instances (which come with additional costs).

Customer Types?

Customer size and requirements are also important. A system designed to handle customers of approximately the same scale (for example, personal accounting) is going to be different from one built for blog hosting, where some blogs might be 10,000 times more popular than the average.

Same Database Schema?

Finally, there is the big question of whether all your customers are running the same database schema and the same software version. If you want to support different software versions (if your customers require a negotiated maintenance window for software upgrades, for example) or different database schemas (if the schema is dependent on the custom functionality and modules customers might use, for example), keeping such customers in different MySQL schemas makes sense.

Sharding Models

This gets us to the following sharding models, ordered from the lowest level of isolation to the highest:

  • Customers Share Schemas. This is the best choice when you have very large numbers of low-revenue customers. In this case, you would map multiple customers to the same set of tables, and include something like a customer_id field in them to filter customer data. This approach minimizes per-customer overhead, but it also reduces customer isolation. It’s harder to back up and restore data for individual customers, and it is easier to introduce coding mistakes that expose other customers’ data. This method does not mean there is only one schema, but that there is a one-to-many relationship between schemas and customers. For example, you might have 100 schemas per MySQL instance, each handling 1,000 to 10,000 customers (depending on the application). Note that with a well-designed sharding implementation, you should be able to map customers individually to schemas. This allows you to have key customer data stored in dedicated schemas, or even on dedicated nodes.
  • Schema per Customer. This is probably the most common sharding approach in MySQL-powered SaaS applications, especially ones that have substantial revenue ($10+ per month per customer). In this model, each customer’s data is stored in its own schema (database). This makes it very easy to back up and restore individual customers. It allows customers to have different schemas (i.e., add custom tables). It also allows them to run different versions of the application if desired. This approach allows the application server to use different MySQL users when connecting on behalf of different customers, which adds an extra level of protection from accidental (or intentional) access to data that belongs to other customers. The schema per customer approach also makes it easier to move the shards around, and limits maintenance impact. The downside of this approach is higher overhead. It also results in a large number of tables per instance, and potentially larger numbers of files (which can be hard to manage).
  • Database Instance per Customer. You achieve even better isolation by having a MySQL instance per customer. This approach, however, increases overhead even further. The recent rise of light virtualization technologies and containers has reduced its usage.
  • OS Instance/Container per Customer. This approach allows you to improve isolation even further. It can be used for every customer, but it can also be applied to selected customers in a setup that uses the Schema per Customer model for the majority of them. A dedicated OS instance, with improved isolation and better performance SLAs, might be a feature of some premium customer tiers. This method not only allows better isolation, but it also lets you handle outliers better. You might choose to run the majority of your customers on the hardware (or cloud instance) that has the best price/performance numbers, and place some of the larger customers on the highest-performance nodes.
  • Environment per Customer. Finally, if you take this all the way, you can build completely separate environments for customers. This includes databases, application servers and other required components. This is especially useful if you need to deploy the application close to the customer, which includes the appliance model, or deployment in the customer’s data center or cloud provider. This also allows you to accommodate customers whose data must be stored in a specific location, which is often due to government regulations. It is worth noting that many SaaS applications, even if they do not quite have one environment per customer, have multiple independent environments. These are often hosted in different locations or availability zones. Such setups allow you to limit the impact of large-scale failures to only a portion of your customers. This avoids overloading your customer service group and allows the operations organization to focus on repairing smaller environments.
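
To make the lowest-isolation model concrete, here is a minimal sketch of the customer_id filter used when customers share tables. It makes several assumptions: it uses Python’s built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the `notes` table and the `notes_for_customer` helper are hypothetical. Routing every read through a helper that always binds customer_id is one way to reduce the risk of the coding mistakes mentioned above.

```python
import sqlite3


def notes_for_customer(conn: sqlite3.Connection, customer_id: int) -> list:
    """Return note bodies for one customer; the tenant filter is always applied."""
    rows = conn.execute(
        "SELECT body FROM notes WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [body for (body,) in rows]


# One shared table holds rows for many customers, tagged by customer_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, body TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO notes (customer_id, body) VALUES (?, ?)",
    [(1, "alpha"), (2, "bravo"), (1, "charlie")],
)

print(notes_for_customer(conn, 1))  # ['alpha', 'charlie']
print(notes_for_customer(conn, 2))  # ['bravo']
```

In the Schema per Customer model the same query would need no customer_id column at all, because the connection itself would be pointed at that customer’s schema.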

The farther you go down this route, from the shared schema to an environment per customer, the more important it is to have a high level of automation. With a shared schema, you can often get by with little automation (and some environments manually set up) and all the schemas pre-created. If customer sign-up requires setting up a dedicated database instance or a whole environment, manual implementation doesn’t scale. For this type of setup, you need state-of-the-art automation and orchestration.
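
As a small example of what such automation might generate at sign-up time for the Schema per Customer model, the sketch below builds the SQL statements for a new customer’s schema and dedicated MySQL user. It is a sketch under assumptions, not a complete provisioning system: the `customer<N>` naming scheme, the `'%'` host, the blanket `ALL PRIVILEGES` grant, and the `provisioning_statements` helper are all hypothetical choices, and the statements are only produced as strings rather than executed.

```python
import re

# Names are restricted to lowercase letters and digits. Underscores are
# avoided on purpose: "_" acts as a wildcard in database-level GRANTs.
_SAFE_NAME = re.compile(r"^[a-z0-9]+$")


def provisioning_statements(customer_id: int, password: str) -> list:
    """Build the statements that create one customer's schema and user."""
    name = f"customer{customer_id}"
    if not _SAFE_NAME.match(name):
        raise ValueError(f"unsafe schema name: {name!r}")
    # Keep the sketch simple: refuse characters that would need escaping
    # inside a single-quoted SQL string literal.
    if "'" in password or "\\" in password:
        raise ValueError("password contains characters this sketch does not escape")
    return [
        f"CREATE DATABASE IF NOT EXISTS `{name}`",
        f"CREATE USER '{name}'@'%' IDENTIFIED BY '{password}'",
        f"GRANT ALL PRIVILEGES ON `{name}`.* TO '{name}'@'%'",
    ]


for statement in provisioning_statements(42, "s3cret"):
    print(statement + ";")
```

A real pipeline would also need to register the new schema in the customer mapping, load the application tables, and handle partial failures, which is where orchestration tooling comes in.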

Conclusion

I hope this helps you to understand your options for MySQL sharding models. Each of the different sharding models for SaaS applications powered by MySQL has benefits and drawbacks. As you can see, many of these approaches require you to work with a large number of tables in MySQL. This will be the topic of one of my next posts!

Jan
30
2017
--

MXNet accepted to the Apache Incubator

MXNet, Amazon Web Services’ preferred deep learning framework, was accepted to the Apache Incubator today. Admission to the incubator is the first step necessary for the open-source initiative to officially become part of the Apache Software Foundation. The Apache Software Foundation supports the efforts of thousands of developers maintaining open-source projects around the world.… Read More

Jan
30
2017
--

Dropbox’s note-taking app Paper launches globally in 21 languages

Dropbox said today that it is rolling out Paper globally. Paper is its note-taking app, which the company emphasizes is also a tool built for managing workflow.
In addition to the regular launch of Paper, the company said that users will also be able to automatically generate presentations and run them through Paper in their browsers. Radhakrishnan said that users were… Read More

Jan
30
2017
--

Dropbox’s Smart Sync lets users open a file stored only in the cloud like any normal file

Dropbox today released Smart Sync, its tool that allows users to access files stored online in Dropbox accounts automatically on a desktop without having the file stored locally. Previously dubbed Dropbox Infinite, Smart Sync gives businesses a way to share and access files without needing to have massive ones stored on their desktop. The idea is that businesses regularly deal with piles… Read More
