Jun 15, 2020

VESoft raises $8M to meet China’s growing need for graph databases

Sherman Ye founded VESoft in 2018 when he saw growing demand for graph databases in China. Predecessors like Neo4j and TigerGraph had already been growing aggressively in the West for a few years, while China was just getting to know the technology, which uses graph structures to store data sets and map the relationships between them, for applications such as social media analysis, e-commerce recommendations and financial risk management.

VESoft is ready for further growth after closing an $8 million funding round led by Redpoint China Ventures, an investment firm launched by Silicon Valley-based Redpoint Ventures in 2005. Existing investor Matrix Partners China also participated in the Series pre-A round. The new capital will allow the startup to develop products and expand to markets in North America, Europe and other parts of Asia.

The 30-person team comprises former employees of Alibaba, Facebook, Huawei and IBM. It’s based in Hangzhou, a scenic city known for its rich history and for housing Alibaba and its financial affiliate Ant Financial, where Ye previously worked as a senior engineer after a four-year stint with Facebook in California. From 2017 to 2018, the entrepreneur noticed that Ant Financial’s customers were increasingly interested in adopting graph databases as an alternative to relational databases, the model that has been dominant since the 1980s and normally organizes data into tables.

“While relational databases are capable of achieving many functions carried out by graph databases… they deteriorate in performance as the quantity of data grows,” Ye told TechCrunch during an interview. “We didn’t use to have so much data.”

The information explosion is one reason Chinese companies are turning to graph databases, which can handle millions of transactions to discover patterns within scattered data. The technology’s rise is also a response to new forms of online businesses that depend more on relationships.

“Take recommendations for example. The old model recommends content based purely on user profiles, but the problem of relying on personal browsing history is it fails to recommend new things. That was fine for a long time as the Chinese [internet] market was big enough to accommodate many players. But as the industry becomes saturated and crowded… companies need to ponder how to retain existing users, lengthen their time spent, and win users from rivals.”

The key lies in serving people content and products they find appealing. Graph databases come in handy, suggested Ye, when services try to predict users’ interest or behavior as the model uncovers what their friends or people within their social circles like. “That’s a lot more effective than feeding them what’s trending.”
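To make that concrete, here is a minimal, self-contained sketch (plain Python, not VESoft’s API; all data made up) of recommending items a user’s friends like but the user hasn’t seen yet, the kind of one-hop traversal a graph database evaluates natively:

    # Conceptual sketch of a friends-based recommendation. A graph database
    # runs this as a traversal over "friend" and "likes" edges; here the
    # graph is just two dictionaries of made-up data.
    friends = {"alice": ["bob", "carol"], "bob": ["alice", "dave"]}
    likes = {
        "alice": {"sneakers"},
        "bob": {"camera", "sneakers"},
        "carol": {"novel"},
        "dave": {"drone"},
    }

    def recommend(user):
        # Union of what the user's friends like, minus what the user already likes.
        candidates = set()
        for friend in friends.get(user, []):
            candidates |= likes.get(friend, set())
        return candidates - likes.get(user, set())

    print(recommend("alice"))  # {'camera', 'novel'}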

[Image: Neo4j compares relational and graph databases]

The company has made its software open source, which the founder believes can help cultivate a community of graph database users and educate the market in China. It also allows VESoft to reach more engineers in the English-speaking world, who are well acquainted with open-source culture.

“There is no such thing as being ‘international’ or ‘domestic’ for a technology-driven company. There are no boundaries between countries in the open-source world,” reckoned Ye.

When it comes to generating income, the startup plans to launch a paid version for enterprises, which will come with customized plug-ins and hosted services.

Nebula Graph, the brand name of VESoft’s database product, now serves 20 enterprise clients across social media, e-commerce and finance, including big names like food delivery giant Meituan, popular social commerce app Xiaohongshu and e-commerce leader JD.com. A number of overseas companies are also trialing Nebula.

The time is ripe for enterprise-facing startups with a technological moat in China, as the consumer market has been carved up by incumbents like Tencent and Alibaba. That makes fundraising relatively easy for VESoft. The founder is confident that Chinese companies are rapidly catching up with their Western counterparts in the space, because the gargantuan amount of data and the myriad ways it is used in the country “will propel the technology forward.”

May 12, 2020

Microsoft partners with Redis Labs to improve its Azure Cache for Redis

For a few years now, Microsoft has offered Azure Cache for Redis, a fully managed caching solution built on top of the open-source Redis project. Today, it is expanding this service by adding Redis Enterprise, Redis Labs’ commercial offering, to its platform. It’s doing so in partnership with Redis Labs, and while Microsoft will offer some basic support for the service, Redis Labs will handle most of the software support itself.

Julia Liuson, Microsoft’s corporate VP of its developer tools division, told me that the company wants to be seen as a partner to open-source companies like Redis Labs, which was among the first to change its license to prevent cloud vendors from commercializing and repackaging its free code without contributing back to the community. Last year, Redis Labs partnered with Google Cloud to bring its own fully managed service to Google’s platform, so maybe it’s no surprise that we are now seeing Microsoft make a similar move.

Liuson tells me that with this new tier for Azure Cache for Redis, users will get a single bill and native Azure management, as well as the option to deploy natively on SSD flash storage. The native Azure integration should also make it easier for developers on Azure to integrate Redis Enterprise into their applications.

It’s also worth noting that Microsoft will support Redis Labs’ own Redis modules, including RediSearch, a Redis-powered search engine, as well as RedisBloom and RedisTimeSeries, which provide support for new data types in Redis.
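As a rough illustration, here is how a developer might exercise RediSearch from Python with the redis-py client; the index name, field names and document are made up, and the command syntax shown is for recent RediSearch releases:

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Create a full-text index over hash documents (RediSearch's FT.CREATE).
    r.execute_command(
        "FT.CREATE", "idx:articles", "ON", "HASH", "PREFIX", "1", "article:",
        "SCHEMA", "title", "TEXT", "body", "TEXT",
    )

    # Store a document, then search it by keyword.
    r.hset("article:1", mapping={"title": "Azure Cache for Redis",
                                 "body": "Microsoft partners with Redis Labs"})
    print(r.execute_command("FT.SEARCH", "idx:articles", "partners"))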

“For years, developers have utilized the speed and throughput of Redis to produce unbeatable responsiveness and scale in their applications,” says Liuson. “We’ve seen tremendous adoption of Azure Cache for Redis, our managed solution built on open source Redis, as Azure customers have leveraged Redis performance as a distributed cache, session store, and message broker. The incorporation of the Redis Labs Redis Enterprise technology extends the range of use cases in which developers can utilize Redis, while providing enhanced operational resiliency and security.”

Mar 3, 2020

Datastax acquires The Last Pickle

Data management company Datastax, one of the largest contributors to the Apache Cassandra project, today announced that it has acquired The Last Pickle (and no, I don’t know what’s up with that name either), a New Zealand-based Cassandra consulting and services firm that’s behind a number of popular open-source tools for the distributed NoSQL database.

As Datastax Chief Strategy Officer Sam Ramji, who you may remember from his recent tenure at Apigee, the Cloud Foundry Foundation, Google and Autodesk, told me, The Last Pickle is one of the premier Apache Cassandra consulting and services companies. The team there has been building Cassandra-based open-source solutions for the likes of Spotify, T-Mobile and AT&T since it was founded back in 2012. And while The Last Pickle is based in New Zealand, the company has engineers all over the world who do the heavy lifting and help these companies successfully implement the Cassandra database technology.

It’s worth mentioning that Last Pickle CEO Aaron Morton first discovered Cassandra when he worked for WETA Digital on the special effects for Avatar, where the team used Cassandra to allow the VFX artists to store their data.

“There’s two parts to what they do,” Ramji explained. “One is the very visible consulting, which has led them to become world experts in the operation of Cassandra. So as we automate Cassandra and as we improve the operability of the project with enterprises, their embodied wisdom about how to operate and scale Apache Cassandra is as good as it gets — the best in the world.” And The Last Pickle’s experience in building systems with tens of thousands of nodes — and the challenges that its customers face — is something Datastax can then offer to its customers as well.

And Datastax, of course, also plans to productize The Last Pickle’s open-source tools like the automated repair tool Reaper and the Medusa backup and restore system.

As both Ramji and Datastax VP of Engineering Josh McKenzie stressed, Cassandra has seen a lot of commercial development in recent years, with the likes of AWS now offering a managed Cassandra service, for example, but there wasn’t all that much hype around the project anymore. They argue that’s a good thing: now that it is over ten years old, Cassandra has been battle-hardened. For the last ten years, Ramji argues, the industry tried to figure out what the de facto standard for scale-out computing should be. By 2019, it became clear that Kubernetes was the answer.

“This next decade is about what is the de facto standard for scale-out data? We think that’s got certain affordances, certain structural needs, and we think that the decades that Cassandra has spent getting hardened put it in a position to be [that standard] for data for that wave.”

McKenzie also noted that Cassandra provides users with a number of built-in features, like support for multiple data centers and geo-replication, rolling updates and live scaling, as well as wide support across programming languages, all of which give it advantages over competing databases.

“It’s easy to forget how much Cassandra gives you for free just based on its architecture,” he said. “Losing the power in an entire datacenter, upgrading the version of the database, hardware failing every day? No problem. The cluster is 100 percent always still up and available. The tooling and expertise of The Last Pickle really help bring all this distributed and resilient power into the hands of the masses.”

The two companies did not disclose the price of the acquisition.

Sep 11, 2019

ScyllaDB takes on Amazon with new DynamoDB migration tool

There are a lot of open source databases out there, and ScyllaDB, a NoSQL variety, is looking to differentiate itself by attracting none other than Amazon users. Today, it announced a DynamoDB migration tool to help Amazon customers move to its product.

It’s a bold move, but Scylla, which has a free open source product along with paid versions, has always had a penchant for going after bigger players. It has had a tool to help move Cassandra users to ScyllaDB for some time.

CEO Dor Laor says DynamoDB customers can now also migrate existing code with little modification. “If you’re using DynamoDB today, you will still be using the same drivers and the same client code. In fact, you don’t need to modify your client code one bit. You just need to redirect access to a different IP address running Scylla,” Laor told TechCrunch.
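In practice the switch can be as small as changing the client’s endpoint. Here’s a sketch with boto3, assuming a Scylla cluster that exposes its DynamoDB-compatible API at a placeholder address, with a “users” table assumed to already exist on the Scylla side:

    import boto3

    # Point the standard DynamoDB client at a Scylla node instead of AWS.
    # The endpoint, region and credentials below are placeholders.
    dynamodb = boto3.resource(
        "dynamodb",
        endpoint_url="http://scylla.example.com:8000",
        region_name="none",
        aws_access_key_id="none",
        aws_secret_access_key="none",
    )

    # The application code is unchanged DynamoDB client code.
    table = dynamodb.Table("users")
    table.put_item(Item={"id": "42", "name": "Ada"})
    print(table.get_item(Key={"id": "42"})["Item"])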

He says that the reason customers would want to switch to Scylla is because it offers a faster and cheaper experience by utilizing the hardware more efficiently. That means companies can run the same workloads on fewer machines, and do it faster, which ultimately should translate to lower costs.

The company also announced a $25 million Series C extension led by Eight Roads Ventures. Existing investors Bessemer Venture Partners, Magma Venture Partners, Qualcomm Ventures and TLV Partners also participated. Scylla has raised a total of $60 million, according to the company.

The startup has been around for six years, and customers include Comcast, GE, IBM and Samsung. Laor says that Comcast went from running Cassandra on 400 machines to running the same workloads with Scylla on just 60.

Laor is playing the long game in the database market, and it’s not about taking on Cassandra, DynamoDB or any other individual product. “Our main goal is to be the default NoSQL database where if someone has big data, real-time workloads, they’ll think about us first, and we will become the default.”

Jun 18, 2019

MongoDB gets a data lake, new security features and more

MongoDB is hosting its developer conference today and, unsurprisingly, the company has quite a few announcements to make. Some are straightforward, like the launch of MongoDB 4.2 with some important new security features, while others, like the launch of the company’s Atlas Data Lake, point the company beyond its core database product.

“Our new offerings radically expand the ways developers can use MongoDB to better work with data,” said Dev Ittycheria, the CEO and president of MongoDB. “We strive to help developers be more productive and remove infrastructure headaches — with additional features along with adjunct capabilities like full-text search and data lake. IDC predicts that by 2025 global data will reach 175 Zettabytes and 49% of it will reside in the public cloud. It’s our mission to give developers better ways to work with data wherever it resides, including in public and private clouds.”

The highlight of today’s set of announcements is probably the launch of MongoDB Atlas Data Lake. Atlas Data Lake allows users to query data on AWS S3 using the MongoDB Query Language, regardless of the data’s format, including JSON, BSON, CSV, TSV, Parquet and Avro. To get started, users only need to point the service at their existing S3 buckets; they don’t have to manage servers or other infrastructure. Support for Data Lake on Google Cloud Storage and Azure Storage is in the works and will launch in the future.
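Because Data Lake speaks the MongoDB wire protocol, querying S3-backed files looks like any other MongoDB query. A minimal sketch with pymongo, where the connection string, database and collection names are placeholders standing in for a namespace mapped to an S3 bucket:

    from pymongo import MongoClient

    # Connect to a (hypothetical) Atlas Data Lake endpoint. The virtual
    # "lake.events" namespace below would be mapped to files in S3.
    client = MongoClient("mongodb://user:pass@datalake.example.mongodb.net/")
    events = client["lake"]["events"]

    # Standard MongoDB Query Language against data that lives in S3.
    for doc in events.find({"status": "error"}).limit(5):
        print(doc)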

Also new is Full-Text Search, which gives users access to advanced text search features based on the open-source Apache Lucene 8.

In addition, MongoDB is also now starting to bring together Realm, the mobile database product it acquired earlier this year, and the rest of its product lineup. Using the Realm brand, Mongo is merging its serverless platform, MongoDB Stitch, and Realm’s mobile database and synchronization platform. Realm’s synchronization protocol will now connect to MongoDB Atlas’ cloud database, while Realm Sync will allow developers to bring this data to their applications. 

“By combining Realm’s wildly popular mobile database and synchronization platform with the strengths of Stitch, we will eliminate a lot of work for developers by making it natural and easy to work with data at every layer of the stack, and to seamlessly move data between devices at the edge to the core backend,”  explained Eliot Horowitz, CTO and co-founder of MongoDB.

As for the latest release of MongoDB, the highlight is a set of new security features. With this release, Mongo is implementing client-side Field Level Encryption. Traditionally, database security has relied on server-side trust, which typically leaves the data accessible to administrators, even if they don’t have client access. If an attacker breaches the server, that’s almost automatically a catastrophic event.

With this new security model, Mongo is shifting access to the client and to the local drivers. It provides multiple encryption options; to make use of this, developers will use a new “encrypt” JSON schema attribute.

This ensures that all application code can generally run unmodified, and even admins won’t get access to the encrypted data, whether in the database itself or in its logs and backups, unless they obtain client access rights themselves. Because the logic resides in the drivers, the encryption is also handled entirely separately from the actual database.
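A rough sketch of what such a schema fragment might look like, written as a Python dict; the field name, algorithm string and key ID are illustrative placeholders rather than a tested configuration:

    # JSON schema fragment marking a field for client-side encryption.
    # The driver encrypts "ssn" before it ever leaves the application.
    patient_schema = {
        "bsonType": "object",
        "properties": {
            "ssn": {
                "encrypt": {
                    "bsonType": "string",
                    "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic",
                    "keyId": ["<data-key-uuid>"],  # placeholder key reference
                }
            }
        },
    }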

Other new features in MongoDB 4.2 include support for distributed transactions and the ability to manage MongoDB deployments from a single Kubernetes control plane.
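For the transactions piece, a minimal pymongo sketch; it assumes a replica set running locally and uses placeholder account documents:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    accounts = client.bank.accounts

    # Move funds between two documents atomically in one transaction.
    with client.start_session() as session:
        with session.start_transaction():
            accounts.update_one({"_id": "a"}, {"$inc": {"balance": -100}},
                                session=session)
            accounts.update_one({"_id": "b"}, {"$inc": {"balance": 100}},
                                session=session)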

May 2, 2019

Couchbase’s mobile database gets built-in ML and enhanced synchronization features

Couchbase, the company behind the eponymous NoSQL database, announced a major update to its mobile database today that brings some machine learning smarts, as well as improved synchronization features and enhanced stats and logging support, to the software.

“We’ve led the innovation [in] data management at the edge since the release of our mobile database five years ago,” Couchbase’s VP of Engineering Wayne Carter told me. “And we’re excited that others are doing that now. We feel that it’s very, very important for businesses to be able to utilize these emerging technologies that do sit on the edge to drive their businesses forward, making their employees more effective and their customer experience better.”

The latter part is what drove a lot of today’s updates, Carter noted. He also believes that the database is the right place to do some machine learning. So with this release, the company is adding predictive queries to its mobile database. This new API allows mobile apps to take pre-trained machine learning models and run predictive queries against the data that is stored locally. This would allow a retailer to create a tool that can use a phone’s camera to figure out what part a customer is looking for.

To support these predictive queries, Couchbase mobile is also getting support for predictive indexes. “Predictive indexes allow you to create an index on prediction, enabling correlation of real-time predictions with application data in milliseconds,” Carter said. In many ways, that’s also the unique value proposition for bringing machine learning into the database. “What you really need to do is you need to utilize the unique values of a database to be able to deliver the answer to those real-time questions within milliseconds,” explained Carter.
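Couchbase exposes this through its mobile SDKs (Swift, Java, C#). As a conceptual, pure-Python sketch of the flow rather than the actual API: run a pre-trained model over locally stored documents and filter on its output, which is what a predictive query does natively, with a predictive index caching the predictions:

    # Conceptual sketch of a predictive query; the data and the "model"
    # are stand-ins, not Couchbase's API.
    def part_classifier(photo_bytes):
        # Pretend pre-trained model: classifies a part from a photo.
        return "hex-bolt" if b"hex" in photo_bytes else "unknown"

    docs = [
        {"sku": "B-17", "price": 0.30, "photo": b"hex head close-up"},
        {"sku": "W-02", "price": 0.10, "photo": b"round washer"},
    ]

    # A predictive index would precompute and store these predictions;
    # here the prediction is correlated with the document data inline.
    matches = [d for d in docs if part_classifier(d["photo"]) == "hex-bolt"]
    print(matches)  # [{'sku': 'B-17', 'price': 0.3, ...}]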

The other major new feature in this release is delta synchronization, which allows businesses to push far smaller updates to the databases on their employees’ mobile devices. That’s because they only have to receive the information that changed instead of a full updated database. Carter says this was a highly requested feature, but until now, the company always had to prioritize work on other components of Couchbase.

This is an especially useful feature for the company’s retail customers, a vertical where it has been quite successful. These users need to keep their catalogs up to date, and quite a few of them supply their employees with mobile devices to help shoppers. Rumor has it that Apple, too, is a Couchbase user.

The update also includes a few new features that will be more of interest to operators, including advanced stats reporting and enhanced logging support.

Apr 9, 2019

Google Cloud challenges AWS with new open-source integrations

Google today announced that it has partnered with a number of top open-source data management and analytics companies to integrate their products into its Google Cloud Platform and offer them as managed services operated by its partners. The partners here are Confluent, DataStax, Elastic, InfluxData, MongoDB, Neo4j and Redis Labs.

The idea here, Google says, is to provide users with a seamless experience and the ability to easily leverage these open-source technologies in Google’s cloud. But there is a lot more at play, even though Google never quite says so: the move is clearly meant to contrast Google’s approach to open-source ecosystems with Amazon’s. It’s no secret that Amazon’s AWS cloud computing platform has a reputation for taking some of the best open-source projects, forking them and packaging them up under its own brand, often without giving back to the original project. There are some signs that this is changing, but a number of companies have recently changed their open-source licenses to explicitly prevent this from happening.

That’s where things get interesting, because those companies include Confluent, Elastic, MongoDB, Neo4j and Redis Labs, all of which are partnering with Google on this new project. It’s worth noting, though, that InfluxData is not taking this new licensing approach, and that while DataStax uses lots of open-source technologies, its focus is very much on its enterprise edition.

“As you are aware, there has been a lot of debate in the industry about the best way of delivering these open-source technologies as services in the cloud,” Manvinder Singh, the head of infrastructure partnerships at Google Cloud, said in a press briefing. “Given Google’s DNA and the belief that we have in the open-source model, which is demonstrated by projects like Kubernetes, TensorFlow, Go and so forth, we believe the right way to solve this is to work closely together with companies that have invested their resources in developing these open-source technologies.”

So while AWS takes these projects and then makes them its own, Google has decided to partner with these companies. While Google and its partners declined to comment on the financial arrangements behind these deals, chances are we’re talking about some degree of profit-sharing here.

“Each of the major cloud players is trying to differentiate what it brings to the table for customers, and while we have a strong partnership with Microsoft and Amazon, it’s nice to see that Google has chosen to deepen its partnership with Atlas instead of launching an imitation service,” Sahir Azam, the senior VP of Cloud Products at MongoDB told me. “MongoDB and GCP have been working closely together for years, dating back to the development of Atlas on GCP in early 2017. Over the past two years running Atlas on GCP, our joint teams have developed a strong working relationship and support model for supporting our customers’ mission critical applications.”

As for the actual functionality, the core principle here is that Google will deeply integrate these services into its Cloud Console, similar to what Microsoft did with Databricks on Azure, for example. These will be managed services, and Google Cloud will handle the invoicing; the charges will count toward a user’s Google Cloud spending commitments. Support will also run through Google, so users can use a single service to manage and log tickets across all of these services.

Redis Labs CEO and co-founder Ofer Bengal echoed this. “Through this partnership, Redis Labs and Google Cloud are bringing these innovations to enterprise customers, while giving them the choice of where to run their workloads in the cloud,” he said. “Customers now have the flexibility to develop applications with Redis Enterprise using the fully integrated managed services on GCP. This will include the ability to manage Redis Enterprise from the GCP console, provisioning, billing, support, and other deep integrations with GCP.”

Jan 31, 2019

Google’s Cloud Firestore NoSQL database hits general availability

Google today announced that Cloud Firestore, its serverless NoSQL document database for mobile, web and IoT apps, is now generally available. In addition, Google is also introducing a few new features and bringing the service to 10 new regions.

With this launch, Google is giving developers the option to run their databases in a single region. During the beta, developers had to use multi-region instances, and, while that obviously has some advantages with regard to resilience, it’s also more expensive and not every app needs to run in multiple regions.

“Some people don’t need the added reliability and durability of a multi-region application,” Google product manager Dan McGrath told me. “So for them, having a more cost-effective regional instance is very attractive, as well as data locality and being able to place a Cloud Firestore database as close as possible to their user base.”

The new regional instance pricing is up to 50 percent cheaper than the current multi-region instance prices. Which option you pick does influence the SLA Google gives you, though. While regional instances are still replicated across multiple zones inside the region, all of the data stays within a limited geographic area. Hence, Google promises 99.999 percent availability for multi-region instances and 99.99 percent availability for regional instances.
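That extra nine is roughly a factor of ten in the allowed downtime budget, as a quick back-of-the-envelope calculation shows:

    # Yearly downtime budget implied by an availability SLA.
    MINUTES_PER_YEAR = 365 * 24 * 60

    for sla in (0.9999, 0.99999):
        downtime = (1 - sla) * MINUTES_PER_YEAR
        print(f"{sla:.3%} availability allows ~{downtime:.1f} minutes of downtime per year")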

And talking about regions, Cloud Firestore is now available in 10 new regions around the world. Firestore offered a single location when it launched and added two more during the beta. With this, Firestore is now available in 13 locations (including the North America and Europe multi-region offerings). McGrath tells me Google is still deciding on the next phase of locations, but he stressed that the current set provides pretty good coverage across the globe.

Also new in this release is deeper integration with Stackdriver, the Google Cloud monitoring service, which can now monitor read, write and delete operations in near-real time. McGrath also noted that Google plans to add the ability to query documents across collections and increment database values without needing a transaction.

It’s worth noting that while Cloud Firestore falls under the Google Firebase brand, which typically focuses on mobile developers, Firestore offers all of the usual client-side libraries for Compute Engine or Kubernetes Engine applications, too.
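For a server-side application, using Firestore from Python looks roughly like this; the project ID and document contents are placeholders, and application-default credentials are assumed:

    from google.cloud import firestore

    # Assumes application-default credentials; the project ID is a placeholder.
    db = firestore.Client(project="my-project")

    # Write a document, then read it back.
    db.collection("users").document("ada").set({"name": "Ada", "born": 1815})
    snapshot = db.collection("users").document("ada").get()
    print(snapshot.to_dict())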

“If you’re looking for a more traditional NoSQL document database, then Cloud Firestore gives you a great solution that has all the benefits of not needing to manage the database at all,” McGrath said. “And then, through the Firebase SDK, you can use it as a more comprehensive back-end as a service that takes care of things like authentication for you.”

One of the advantages of Firestore is its extensive offline support, which makes it ideal for mobile developers as well as IoT solutions. Maybe it’s no surprise, then, that Google is positioning it as a tool for both Google Cloud and Firebase users.

Jan 24, 2019

Microsoft acquires Citus Data

Microsoft today announced that it has acquired Citus Data, a company focused on making PostgreSQL databases faster and more scalable. Citus’ open-source PostgreSQL extension essentially turns PostgreSQL into a distributed database and, while there has been a lot of hype around the NoSQL movement and document stores, relational databases — and especially PostgreSQL — are still a growing market, in part because of tools from companies like Citus that overcome some of their earlier limitations.
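The extension’s core operation is declarative: you tell Citus which column to shard a table on, and it spreads the data across worker nodes. A sketch using psycopg2 against a hypothetical Citus coordinator, where the connection details are placeholders and create_distributed_table is the Citus function:

    import psycopg2

    # Placeholder connection details for a Citus coordinator node.
    conn = psycopg2.connect("host=coordinator.example.com dbname=app user=app")
    cur = conn.cursor()

    # Ordinary PostgreSQL DDL...
    cur.execute("CREATE TABLE events (tenant_id bigint, payload jsonb)")
    # ...then one call shards the table across the cluster by tenant_id.
    cur.execute("SELECT create_distributed_table('events', 'tenant_id')")
    conn.commit()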

Unsurprisingly, Microsoft plans to work with the Citus Data team to “accelerate the delivery of key, enterprise-ready features from Azure to PostgreSQL and enable critical PostgreSQL workloads to run on Azure with confidence.” The Citus co-founders echo this in their own statement, noting that “as part of Microsoft, we will stay focused on building an amazing database on top of PostgreSQL that gives our users the game-changing scale, performance, and resilience they need. We will continue to drive innovation in this space.”

PostgreSQL is obviously an open-source tool, and while the fact that Microsoft is now a major open-source contributor doesn’t come as a surprise anymore, it’s worth noting that the company stresses that it will continue to work with the PostgreSQL community. In an email, a Microsoft spokesperson also noted that “the acquisition is a proof point in the company’s commitment to open source and accelerating Azure PostgreSQL performance and scale.”

Current Citus customers include the likes of real-time analytics service Chartbeat, email security service Agari and PushOwl, though the company notes that it also counts a number of Fortune 100 companies among its users (they tend to stay anonymous). The company offers a database as a service, an on-premises enterprise version and a free open-source edition. For the time being, it seems like that’s not changing, though over time I would suspect that Microsoft will transition users of the hosted service to Azure.

The price of the acquisition was not disclosed. Citus Data, which was founded in 2010 and graduated from the Y Combinator program, previously raised more than $13 million from the likes of Khosla Ventures, SV Angel and Data Collective.

Sep 24, 2018

Microsoft updates its planet-scale Cosmos DB database service

Cosmos DB is undoubtedly one of the most interesting products in Microsoft’s Azure portfolio. It’s a fully managed, globally distributed multi-model database that offers throughput guarantees, a number of different consistency models and high read and write availability guarantees. Now that’s a mouthful, but basically, it means that developers can build a truly global product, write database updates to Cosmos DB and rest assured that every other user across the world will see those updates within 20 milliseconds or so. And to write their applications, they can pretend that Cosmos DB is a SQL- or MongoDB-compatible database, for example.

Cosmos DB officially launched in May 2017, though in many ways it’s an evolution of Microsoft’s existing DocumentDB product, which was far less flexible. Today, a lot of Microsoft’s own products run on Cosmos DB, including the Azure Portal itself, as well as Skype, Office 365 and Xbox.

Today, Microsoft is extending Cosmos DB by moving its multi-master replication feature into general availability, as well as adding support for the Cassandra API, giving developers yet another option for bringing existing products, in this case those written for Cassandra, to Cosmos DB.

Microsoft now also promises 99.999 percent read and write availability. Previously, its read availability promise was 99.99 percent. And while that may not seem like a big difference, it does show that after more than a year of operating Cosmos DB with customers, Microsoft now feels more confident that it’s a highly stable system. In addition, Microsoft is also updating its write latency SLA and now promises less than 10 milliseconds at the 99th percentile.

“If you have write-heavy workloads, spanning multiple geos, and you need this near real-time ingest of your data, this becomes extremely attractive for IoT, web, mobile gaming scenarios,” Microsoft Cosmos DB architect and product manager Rimma Nehme told me. She also stressed that she believes Microsoft’s SLA definitions are far more stringent than those of its competitors.

The highlight of the update, though, is multi-master replication. “We believe that we’re really the first operational database out there in the marketplace that runs on such a scale and will enable globally scalable multi-master available to the customers,” Nehme said. “The underlying protocols were designed to be multi-master from the very beginning.”

Why is this such a big deal? With this, developers can designate every region they run Cosmos DB in as a master in its own right, making for a far more scalable system in terms of being able to write updates to the database. There’s no need to first write to a single master node, which may be far away, and then have that node push the update to every other region. Instead, applications can write to the nearest region, and Cosmos DB handles everything from there. If there are conflicts, the user can decide how those should be resolved based on their own needs.
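In the Python SDK of that era, opting in looked roughly like the sketch below; the endpoint, key and region names are placeholders, and the exact option names may differ across SDK versions:

    import azure.cosmos.cosmos_client as cosmos_client
    import azure.cosmos.documents as documents

    # Opt in to multi-master writes and prefer the nearest regions.
    policy = documents.ConnectionPolicy()
    policy.UseMultipleWriteLocations = True
    policy.PreferredLocations = ["West US 2", "Southeast Asia"]

    # Endpoint and key are placeholders for a real Cosmos DB account.
    client = cosmos_client.CosmosClient(
        "https://myaccount.documents.azure.com:443/",
        {"masterKey": "<key>"},
        connection_policy=policy,
    )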

Nehme noted that all of this still plays well with Cosmos DB’s existing set of consistency models. If you don’t spend your days thinking about database consistency models, then this may sound arcane, but there’s a whole area of computer science that focuses on little else but how to best handle a scenario where two users virtually simultaneously try to change the same cell in a distributed database.

Unlike other databases, Cosmos DB allows for a variety of consistency models, ranging from strong to eventual, with three intermediary models. And it actually turns out that most Cosmos DB users opt for one of those intermediary models.

Interestingly, when I talked to Leslie Lamport, the Turing Award winner who developed some of the fundamental concepts behind these consistency models (and the popular LaTeX document preparation system), he wasn’t all that sure that developers are making the right choice. “I don’t know whether they really understand the consequences or whether their customers are going to be in for some surprises,” he told me. “If they’re smart, they are getting just the amount of consistency that they need. If they’re not smart, it means they’re trying to gain some efficiency and their users might not be happy about that.” He noted that when you give up strong consistency, it’s often hard to understand what exactly is happening.

But strong consistency comes with drawbacks of its own, chief among them higher latency. “For strong consistency there are a certain number of roundtrip message delays that you can’t avoid,” Lamport noted.

The Cosmos DB team isn’t just building on some of the fundamental work Lamport did around databases; it’s also making extensive use of TLA+, the formal specification language Lamport developed in the late 1990s. Microsoft, Amazon and others are now training their engineers to use TLA+ to describe their algorithms mathematically before they implement them in whatever language they prefer.

“Because [Cosmos DB is] a massively complicated system, there is no way to ensure the correctness of it because we are humans, and trying to hold all of these failure conditions and the complexity in any one person’s — one engineer’s — head, is impossible,” Microsoft Technical Fellow Dharma Shukla noted. “TLA+ is huge in terms of getting the design done correctly, specified and validated using the TLA+ tools even before a single line of code is written. You cover all of those hundreds of thousands of edge cases that can potentially lead to data loss or availability loss, or race conditions that you had never thought about, but that two or three years after you have deployed the code can lead to some data corruption for customers. That would be disastrous.”

“Programming languages have a very precise goal, which is to be able to write code. And the thing that I’ve been saying over and over again is that programming is more than just coding,” Lamport added. “It’s not just coding, that’s the easy part of programming. The hard part of programming is getting the algorithms right.”

Lamport also noted that he deliberately chose to make TLA+ look like mathematics, not like another programming language. “It really forces people to think above the code level,” he said, adding that engineers often tell him it changes the way they think.

As for those companies that don’t use TLA+ or a similar methodology, Lamport says he’s worried. “I’m really comforted that [Microsoft] is using TLA+ because I don’t see how anyone could do it without using that kind of mathematical thinking — and I worry about the other systems that we wind up using, built by other organizations — I worry about how reliable they are.”

