Jun
12
2019
--

Apollo raises $22M for its GraphQL platform

Apollo, a San Francisco-based startup that provides a number of developer and operator tools and services around the GraphQL query language, today announced that it has raised a $22 million growth funding round co-led by Andreessen Horowitz and Matrix Partners. Existing investors Trinity Ventures and Webb Investment Network also participated in this round.

Today, Apollo is probably the biggest player in the GraphQL ecosystem. At its core, the company’s services allow businesses to use the Facebook-incubated GraphQL technology to shield their developers from the patchwork of legacy APIs and databases as they look to modernize their technology stacks. The team argues that while REST APIs that talked directly to other services and databases still made sense a few years ago, that approach no longer does now that the number of API endpoints keeps increasing rapidly.

Apollo replaces this with what it calls the Data Graph. “There is basically a missing piece where we think about how people build apps today, which is the piece that connects the billions of devices out there,” Apollo co-founder and CEO Geoff Schmidt told me. “You probably don’t just have one app anymore, you probably have three, for the web, iOS and Android. Or maybe six. And if you’re a two-sided marketplace you’ve got one for buyers, one for sellers and another for your ops team.”

Managing the interfaces between all of these apps quickly becomes complicated and means you have to write a lot of custom code for every new feature. The promise of the Data Graph is that developers can use GraphQL to query the data in the graph and move on, all without having to write the boilerplate code that typically slows them down. At the same time, the ops teams can use the Graph to enforce access policies and implement other security features.

“If you think about it, there’s a lot of analogies to what happened with relational databases in the ’80s,” Schmidt said. “There is a need for a new layer in the stack. Previously, your query planner was a human being, not a piece of software, and a relational database is a piece of software that would just give you a database. And you needed a way to query that database, and that syntax was called SQL.”

Geoff Schmidt, Apollo CEO, and Matt DeBergalis, CTO

GraphQL itself, of course, is open source. Apollo is now building a lot of the proprietary tools around this idea of the Data Graph that make it useful for businesses. There’s a cloud-hosted graph manager, for example, that lets you track your schema, a dashboard to track performance, and integrations with continuous integration services. “It’s basically a set of services that keep track of the metadata about your graph and help you manage the configuration of your graph and all the workflows and processes around it,” Schmidt said.

The development of Apollo didn’t come out of nowhere. The founders previously launched Meteor, a framework and set of hosted services that allowed developers to write their apps in JavaScript, both on the front-end and back-end. Meteor was tightly coupled to MongoDB, though, which worked well for some use cases but also held the platform back in the long run. With Apollo, the team decided to go in the opposite direction and instead build a platform that makes being database agnostic the core of its value proposition.

The company also recently launched Apollo Federation, which makes it easier for businesses to work with a distributed graph. Sometimes, after all, your data lives in lots of different places. Federation allows for a distributed architecture that combines all of the different data sources into a single schema that developers can then query.

Schmidt tells me the company started to get serious traction last year, and by December it was getting calls from VCs who had heard from their portfolio companies that they were using Apollo.

The company plans to use the new funding to build out its technology and scale its field team to support the enterprises that bet on it, including the open-source projects that power its services.

“I see the Data Graph as a core new layer of the stack, just like we as an industry invested in the relational database for decades, making it better and better,” Schmidt said. “We’re still finding new uses for SQL and that relational database model. I think the Data Graph is going to be the same way.”

Jun
10
2019
--

Qubole launches Quantum, its serverless database engine

Qubole, the data platform founded by Apache Hive creator and former head of Facebook’s Data Infrastructure team Ashish Thusoo, today announced the launch of Quantum, its first serverless offering.

Qubole may not necessarily be a household name, but its customers include the likes of Autodesk, Comcast, Lyft, Nextdoor and Zillow. For these users, Qubole has long offered a self-service platform that allows their data scientists and engineers to build their AI, machine learning and analytics workflows on the public cloud of their choice. The platform sits on top of open-source technologies like Apache Spark, Presto and Kafka, for example.

Typically, enterprises have to provision a considerable amount of infrastructure to give these platforms the resources they need. These resources often go unused, and the infrastructure can quickly become complex.

Qubole already abstracts most of this away, offering what is essentially a serverless platform. With Quantum, however, it is going a step further by launching a high-performance serverless SQL engine that allows users to query petabytes of data with nothing but ANSI SQL. This gives them the choice between running their queries on a Presto cluster or on the serverless SQL engine, for example.
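As a concrete illustration (mine, not Qubole’s), the kind of ad-hoc ANSI SQL query Quantum targets looks like this; the table and column names are hypothetical stand-ins for event data sitting in a cloud object store:

SELECT event_date,
       COUNT(*) AS events,
       COUNT(DISTINCT user_id) AS unique_users
FROM clickstream_events
WHERE event_date BETWEEN DATE '2019-05-01' AND DATE '2019-05-31'
GROUP BY event_date
ORDER BY event_date;

With Quantum, a query like this can run without a cluster being provisioned for it first.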

The data can be stored on AWS, and users won’t have to set up a second data lake or move their data to another platform to use the SQL engine. Quantum automatically scales up or down as needed, of course, and users can still work with the same metastore for their data, no matter whether they choose the clustered or serverless option. Indeed, Quantum is essentially just another SQL engine within Qubole’s overall suite of engines.

Typically, Qubole charges enterprises by compute minutes. When using Quantum, the company uses the same metric, but enterprises pay for the execution time of the query. “So instead of the Qubole compute units being associated with the number of minutes the cluster was up and running, it is associated with the Qubole compute units consumed by that particular query or that particular workload, which is even more fine-grained,” Thusoo explained. “This works really well when you have to do interactive workloads.”

Thusoo notes that Quantum is targeted at analysts who often need to perform interactive queries on data stored in object stores. Qubole integrates with services like Tableau and Looker (which Google is now in the process of acquiring). “They suddenly get access to very elastic compute capacity, but they are able to come through a very familiar user interface,” Thusoo noted.

 

May
02
2019
--

Microsoft brings Azure SQL Database to the edge (and Arm)

Microsoft today announced an interesting update to its database lineup with the preview of Azure SQL Database Edge, a new tool that brings the same database engine that powers Azure SQL Database in the cloud to edge computing devices, including, for the first time, Arm-based machines.

Azure SQL Edge, Azure corporate vice president Julia White writes in today’s announcement, “brings to the edge the same performant, secure and easy to manage SQL engine that our customers love in Azure SQL Database and SQL Server.”

The new service, which will also run on x64-based devices and edge gateways, promises to bring low-latency analytics to edge devices as it allows users to work with streaming data and time-series data, combined with the built-in machine learning capabilities of Azure SQL Database. Like its larger brethren, Azure SQL Database Edge will also support graph data and comes with the same security and encryption features that can, for example, protect the data at rest and in motion, something that’s especially important for an edge device.
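To make the time-series angle concrete (an illustration of mine, not from Microsoft’s announcement), a plain T-SQL aggregation over a hypothetical sensor table might look like this, and the write-once promise below means it could run unchanged in the cloud or on a device:

-- Average the last hour of sensor readings into per-minute buckets.
SELECT DATEADD(MINUTE, DATEDIFF(MINUTE, 0, reading_time), 0) AS minute_bucket,
       AVG(temperature) AS avg_temperature,
       MAX(vibration) AS peak_vibration
FROM sensor_readings
WHERE reading_time >= DATEADD(HOUR, -1, SYSUTCDATETIME())
GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, reading_time), 0)
ORDER BY minute_bucket;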

As White rightly notes, this also ensures that developers only have to write an application once and then deploy it to platforms that feature Azure SQL Database, good old SQL Server on premises and this new edge version.

SQL Database Edge can run in both connected and fully disconnected fashion, something that’s important for the many use cases where connectivity isn’t a given, yet where users need data analytics capabilities to keep their businesses (or drilling platforms, or cruise ships) running.

Aug
29
2017
--

Salesforce is using AI to democratize SQL so anyone can query databases in natural language

 SQL is about as easy as it gets in the world of programming, and yet its learning curve is still steep enough to prevent many people from interacting with relational databases. Salesforce’s AI research team took it upon itself to explore how machine learning might be able to open doors for those without knowledge of SQL. Read More
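To illustrate the goal (this pairing is my own sketch, not an output of Salesforce’s model), such a system would translate a plain-English question into SQL against a known schema:

-- Question: “How many customers in California signed up last year?”
-- Generated query (hypothetical customers table):
SELECT COUNT(*)
FROM customers
WHERE state = 'CA'
  AND YEAR(signup_date) = 2016;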

May
16
2017
--

With version 2.0, Crate.io’s database tools put an emphasis on IoT

 Crate.io, the winner of our Disrupt Europe 2014 Battlefield, is launching version 2.0 of its CrateDB database today. The tool, which is available in both an open source and enterprise version, started out as a general-purpose but highly scalable SQL database. Over time, though, the team found that many of its customers were using the service for managing their machine data. Read More
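For flavor, the machine-data workloads described here boil down to SQL aggregations over high-volume, time-stamped readings. A generic sketch (not CrateDB-specific syntax, with a hypothetical schema):

SELECT device_id,
       DATE_TRUNC('hour', ts) AS hour,
       AVG(cpu_load) AS avg_load,
       MAX(temperature) AS peak_temperature
FROM machine_metrics
WHERE ts >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY device_id, DATE_TRUNC('hour', ts)
ORDER BY hour DESC;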

Mar
27
2017
--

What’s Next for SQL Databases?

In this blog, I’ll go over my thoughts on what we can expect in the world of SQL databases.

After reading Baron’s predictions on databases (https://www.xaprb.com/blog/defining-moments-in-database-history/), I want to provide my own view on what’s coming next for SQL databases. I think we live in interesting times, when we can see the beginning of the next generation of RDBMSs.

There are defining characteristics of such databases:

  1. Auto-scaling. The ability to add and use resources depending on the current load and database size. This is done transparently for users and DBAs.
  2. Auto-healing. The automatic handling of node failures.
  3. Multi-regional, cloud-agnostic, geo-distributed. The ability to support multiple data centers and multiple clouds, in different parts of the world.
  4. Transactional. All the above, with the ability to support multi-statements transactional workloads.
  5. Strong consistency. The full definition of strong consistency is pretty involved. For simplicity, let’s say it means that reads (in the absence of ongoing writes) will return the same data, regardless of which region or data center you read from. A simple counter-example is the famous MySQL asynchronous replication, where (given slave delay) reading from a slave can return very outdated data. I am focusing on reads because, in a distributed environment, consistent-read performance will be affected: this is where network latency (often limited by the speed of light) will define performance.
  6. SQL language. SQL, despite being old and widely criticized, is not going anywhere. This is a universal language for app developers to access data.

With this, I see the following interesting projects:

  • Google Cloud Spanner (https://cloud.google.com/spanner/). Recently announced and still in the Beta stage. Definitely an interesting project, with the obvious limitation of running only in Google Cloud.
  • FaunaDB (https://fauna.com/). Also very recently announced, so it is hard to say how it performs. The major downside I see is that it does not provide SQL access, but uses a custom language.
  • Two open source projects:
    • CockroachDB (https://www.cockroachlabs.com/). This is still in the Beta stage, but definitely an interesting project to follow. Initially, the project planned to support only key-value access, but later they made a very smart decision to provide SQL access via a PostgreSQL-compatible protocol.
    • TiDB (https://github.com/pingcap/tidb). Right now in the RC stage, with the target of providing SQL access over a MySQL-compatible protocol (and later the PostgreSQL protocol).

Protocol compatibility is a wise approach, although not strictly necessary. It lowers the entry barrier for existing applications.
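For example (a sketch assuming any stock PostgreSQL client or driver), ordinary SQL written for PostgreSQL can be pointed at CockroachDB unchanged:

CREATE TABLE users (
    id INT PRIMARY KEY,
    name TEXT
);
INSERT INTO users (id, name) VALUES (1, 'alice');
SELECT id, name FROM users WHERE id = 1;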

Both CockroachDB and TiDB, at the time of this writing, still have rough edges and can’t be used in serious deployments (in my experience). I expect both projects to make big progress in 2017.

What shared characteristics can we expect from these systems?

As I mentioned above, we may see that the read performance is degraded (as latency increases), and often it will be defined more by network performance than anything else. Storage IO and CPU cycles will be secondary factors. There will be more work on how to understand and tune the network traffic.

We may need to get used to the fact that point or small-range selects become much slower. Right now, we see very fast point selects from traditional RDBMSs (MySQL, PostgreSQL, etc.).

Heavy writes will be problematic, as all writes will need to go through the consistency protocol. Write-optimized storage engines will help (both CockroachDB and TiDB use RocksDB in the storage layer).

Long transactions (say, changing 100,000 or more rows) will also be problematic. There are simply too many network round trips and too much housekeeping work on each node, making long transactions an issue for distributed systems.
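A common workaround (a sketch in MySQL-style syntax, which TiDB also accepts; the table is hypothetical) is to split one huge transaction into bounded batches driven from the application:

-- Instead of one transaction touching millions of rows:
--   DELETE FROM events WHERE created < '2016-01-01';
-- run a bounded batch repeatedly until zero rows are affected:
DELETE FROM events
WHERE created < '2016-01-01'
LIMIT 10000;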

Another shared property (at least between CockroachDB and TiDB) is the active use of the Raft protocol to achieve consistency. So it will be important to understand how this protocol works to use it effectively. You can find a good overview of the Raft protocol here: http://container-solutions.com/raft-explained-part-1-the-consenus-problem/.

There are probably more NewSQL technologies than I have mentioned here, but I do not think any of them has captured critical market- or mind-share. So we are at the beginning of interesting times…

What about MySQL? Can MySQL become the database that provides all these characteristics? It is possible, but I do not think it will happen anytime soon. MySQL would need to provide automatic sharding to do this, which will be very hard to implement given its current internal design. It may happen in the future, though it will require a lot of engineering effort to make it work properly.

Dec
29
2016
--

Query Language Type Overview

This blog provides a query language type overview.

The idea for this blog originated from questions customers have asked me. When working in a particular field, you often use a dedicated vocabulary that makes sense to your peers. It often includes phrases and abbreviations because it’s efficient. It’s no different in the database world. Much of this language might make sense to DBAs, but it might sound like “voodoo” to people not used to it. The overview below covers the basic types of query languages inside SQL. I hope it clarifies what they mean, how they’re used and how you should interpret them.

DDL (Data Definition Language)

A database schema is a visualization of information. It contains the data structure, separated into table structures, views and anything else that gives structure to your data. It defines how you want to store and visualize the information.

It’s like a skeleton, defining how data is organized. Any action that creates/updates/changes this skeleton is DDL.

Do you remember spreadsheets? A table definition describes something like:

Account number   | Account name    | Account owner | Creation date | Amount
-----------------|-----------------|---------------|---------------|---------------------------------
Sorted ascending | Unique, indexed |               | Date, indexed | Number, linked with transactions


Whenever you want to create a table like this, you must use a DDL query. For example:

CREATE TABLE Accounts
(
Account_number Bigint(16),
Account_name varchar(255),
Account_owner varchar(255),
Creation_date date,
Amount Bigint(16),
PRIMARY KEY (Account_number),
UNIQUE(Account_name),
FOREIGN KEY (Amount) REFERENCES transactions(Balancevalue)
);

CREATE, ALTER, DROP, etc.: all of these types of structure modification queries are DDL queries!

Defining the structure of your tables is important, as it determines how you can access the information stored in the database and how you might visualize it.

Why should you care that much?

DDL queries define the structure on which you develop your application. Your structure will also define how the database server searches for information in a table, and how it is linked to other tables (using foreign keys, for example).

You must design your MySQL schema before adding information to it (unlike NoSQL solutions such as MongoDB). MySQL might be more rigid in this manner, but it often makes sense to design the pattern for how you want to store your information and query it properly.

Due to the rigidity of an RDBMS, changing the data structure (or table schema) requires the system to rebuild the actual table in most cases. This is potentially problematic for performance or table availability (locking). Since MySQL 5.6, this is often an online (“hot”) procedure, requiring no downtime for active operations. Additionally, tools like pt-osc (pt-online-schema-change) or other open source solutions can be used to migrate the data structure to a new format without requiring downtime.

An example:

ALTER TABLE accounts ADD COLUMN wienietwegisisgezien varchar(20)

DML (Data Manipulation Language)

Data manipulation is what it sounds like: working with information inside a structure. Inserting information and deleting information (adding rows, deleting rows) are examples of data manipulation.

An example:

INSERT INTO resto_visitor VALUES (5,'Julian','highway 5',12);
UPDATE resto_visitor SET name='Evelyn', age=17 WHERE id=103;

Sure, but why should I use it?

Having a database environment makes no sense unless you insert and fetch information out of it. Remember that databases are plentiful in the world: whenever you click on a link on your favorite blog website, it probably means you are fetching information out of a database (and that data was at one time inserted or modified).

Interacting with a database requires that you write DML queries.
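Fetching rows and removing them are DML as well. Reusing the resto_visitor table from the example above:

SELECT name, age FROM resto_visitor WHERE id=103;
DELETE FROM resto_visitor WHERE age<18;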

DCL (Data Control Language)

Data control language is anything that is used for administrating access to the database content. For example, GRANT queries:

GRANT ALL PRIVILEGES ON database.table TO 'jeffbridges'@'ourserver';

Well that’s all fine, but why another subset “language” in SQL?

As a user of database environments, at some point you’ll get access permission from someone performing a DCL query. Data control language is used to define authorization rules for accessing the data structures (tables, views, variables, etc.) inside MySQL.
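Revoking access and inspecting existing permissions are DCL operations too. Reusing the account from the GRANT example above:

REVOKE ALL PRIVILEGES ON database.table FROM 'jeffbridges'@'ourserver';
SHOW GRANTS FOR 'jeffbridges'@'ourserver';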

TCL (Transaction Control Language) Queries

Transaction control language queries are used to control transactional processing in a database. What do we mean by transactional processes? Transactional processes are typically bundled DML queries. For example (sketched below with two hypothetical tables, a and b):

START TRANSACTION;
-- fetch the relevant information from table b and insert it into a
INSERT INTO a SELECT * FROM b WHERE stale = 0;
-- remove the stale data from b
DELETE FROM b WHERE stale = 1;
COMMIT; -- or ROLLBACK to undo both statements

This gives you the ability to perform or roll back a complete action. Only storage engines offering transaction support (like InnoDB) can work with TCL.

Yet another term, but why?

Ever wanted to combine several changes and perform them as one transaction? In some circumstances, for example, it makes sense to make sure an insert happens first and an update second. If you don’t use transactions, the insert might fail while the associated update goes through anyway, leaving invalid data behind. Transactions make sure that either the complete transaction (a group of DML queries) takes place, or it’s completely rolled back (this is also referred to as atomicity).

Conclusion

Hopefully this blog post helps you understand some of the “insider” database speech. Post comments below.

Mar
10
2016
--

With SQL Server 2016, Microsoft focuses on speed, security and luring customers away from Oracle

A few days ago, Microsoft shocked us when it announced that it would soon bring its SQL Server database to Linux. It’ll take until 2017 before SQL Server will be available on Linux, though. Until then, the company’s database focus remains squarely on the upcoming release of SQL Server 2016, which is currently available as a release candidate and which will become generally… Read More

Jul
07
2015
--

YC Alum PipelineDB Releases Open Source Streaming SQL Database

PipelineDB, a Y Combinator Winter 2014 graduate, announced the availability of the open source version of its streaming SQL database product today. A commercial version is expected later this year.
The product is an open-source database that runs SQL queries continuously in streams, incrementally storing the results in tables, company co-founder Derek Nelson explained. “[The… Read More
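To give a flavor of the model (a sketch in the spirit of PipelineDB’s published examples; the stream and view names are hypothetical): a continuous view is declared once over a stream, events are written to the stream with ordinary inserts, and the incrementally updated result reads back like a regular table:

CREATE CONTINUOUS VIEW page_views AS
    SELECT url::text, COUNT(*) AS total
    FROM click_stream
    GROUP BY url;

INSERT INTO click_stream (url) VALUES ('/home');

SELECT url, total FROM page_views;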

Jun
25
2015
--

Oracle license revenue and the MySQL ecosystem

Oracle was in the news recently with the story of its license revenue declining as much as 17% in the most recent quarter. Some publications blame this on the transition to the cloud, but others, such as Bloomberg and TechRepublic, look deeper and see open source software as responsible for the bulk of it.

Things are especially interesting in the MySQL ecosystem, as Oracle owns both its traditional “Enterprise” Oracle database and MySQL – a more modern open source database.

At Percona we see the same story repeating among many of our enterprise customers:

  1. MySQL proves itself. This generally happens one of two ways. One is for an enterprise using traditional enterprise databases, such as Oracle or DB2, to acquire a company which has been built on MySQL. After the dust settles, the CFO or CIO discovers that the acquired company has been successfully running business-critical operations with MySQL and spending hundreds of thousands of dollars on database support instead of tens of millions. At this point it’s been shown that it can be done, so there’s no reason not to continue.

The other way is for MySQL to rise through the ranks in an organization. Typically it starts with some small MySQL use, such as running a bug tracking application in the IT department. Then it moves to MySQL being used with Drupal to power the main corporate website and an e-commerce function with Magento or something similar. Over time, MySQL proves itself and is trusted to handle more and more “core” enterprise databases that are absolutely critical for the business.

Interestingly enough, contrary to what some people have said, MySQL ownership by Oracle helps it to gain trust with many enterprise accounts. Enterprises may not like Oracle’s license and maintenance fees, but they like Oracle’s quality engineering, attention to security and predictable releases.

  2. New applications are built using MySQL. As the enterprise is ready to embrace MySQL, it is added to the approved database list and internal teams can now use it to develop applications. In many cases the mandate goes even further with MySQL than with other open source technologies, as it is given preference, and teams need to really justify to management when they want to use Oracle or other proprietary database technologies. There are some cases where that may be warranted, but in most cases MySQL is good enough.

  3. Moving existing applications from Oracle to MySQL. Depending on the organization and its applications, this can happen a couple of different ways. One is that equivalent applications are built from scratch on the new open source technology stack and the old application is retired. The other is that only the database is migrated from Oracle to MySQL. Moving the database from Oracle to MySQL might be easy, or it might be close to a full application rewrite. For example, we see Java applications that use the database as a simple data store through an ORM framework, and these can be moved to MySQL easily; on the other hand, applications built with extensive use of advanced stored procedures and Oracle-specific SQL extensions are much harder to move.

The wave of moving to open source database technologies will continue and we’re not alone in thinking that – Gartner believes that by 2018, 70% of new in-house applications will be built on open source database systems.

What are we currently seeing in the MySQL ecosystem? First, many customers tell us that they are looking at hefty price increases for MySQL support subscriptions. Some customers who signed five-year agreements with Sun (at the time it was acquired by Oracle) and are exploring renewal now see price increases of as much as 5x for a comparable environment. This is very understandable considering the pressures Oracle faces in the market right now.

The issues, however, go deeper than price. Many customers are not comfortable trusting Oracle to give them the best possible advice for moving from the expensive Oracle database to the much less expensive Oracle-owned MySQL. The conflict of interest is obvious: Oracle’s highest financial reward comes from proving that applications can’t be moved to MySQL or any other open source database.

If you’re choosing MySQL, Oracle is financially interested in having you use the Enterprise Edition, which brings back many of the vendor lock-in issues enterprises are trying to avoid by moving to open source databases. Customers worry that Oracle will ensure enterprise-only features end up in use in their applications, making it difficult to avoid renewing at escalating prices.

So what do our customers see in Percona which makes them prefer our support and other services to those of Oracle?

  • We are a great partner if you’re considering moving from the Oracle database to MySQL, as we have both years of experience and no conflict of interest.
  • Percona Server, Percona XtraDB Cluster, Percona XtraBackup and our other software for the MySQL ecosystem are 100% open source, which means we’re not trying to lock you into an “enterprise version” as we work together. Furthermore, many of the features that are only available in MySQL Enterprise Edition, including audit, backup and authentication, are available in the fully open source Percona Server.
  • We are focused on solutions for your business, not pushing Percona-branded technology. If you choose to use Percona Server, great! If you are using MySQL, MariaDB, Amazon RDS, etc., that’s great too.

With the continuing trend of moving to open source database management systems, the cost pressure on people running proprietary databases will continue to increase, and the only real solution is to accelerate the move to the open source stack. As you do that, you’re better off moving to completely open source technology, such as what is available from Percona, to avoid vendor lock-in. And if you’re looking for a partner to help you assess the migration strategy and execute the move successfully, check for conflicts of interest and ensure that your interests and your provider’s are completely aligned.

