Dec 13, 2018

They scaled YouTube — now they’ll shard everyone with PlanetScale

When the former CTOs of YouTube, Facebook and Dropbox seed fund a database startup, you know there’s something special going on under the hood. Jiten Vaidya and Sugu Sougoumarane saved YouTube from a scalability nightmare by inventing and open-sourcing Vitess, a brilliant relational data storage system. But in the decade since working there, the pair have been inundated with requests from tech companies desperate for help building the operational scaffolding needed to actually integrate Vitess.

So today the pair are revealing their new startup PlanetScale that makes it easy to build multi-cloud databases that handle enormous amounts of information without locking customers into Amazon, Google or Microsoft’s infrastructure. Battle-tested at YouTube, the technology could allow startups to fret less about their backend and focus more on their unique value proposition. “Now they don’t have to reinvent the wheel,” Vaidya tells me. “A lot of companies facing this scaling problem end up solving it badly in-house and now there’s a way to solve that problem by using us to help.”

PlanetScale quietly raised a $3 million seed round in April, led by SignalFire and joined by a who’s who of engineering luminaries. They include YouTube co-founder and CTO Steve Chen, Quora CEO and former Facebook CTO Adam D’Angelo, former Dropbox CTO Aditya Agarwal, PayPal and Affirm co-founder Max Levchin, MuleSoft co-founder and CTO Ross Mason, Google director of engineering Parisa Tabriz and Facebook’s first female engineer and South Park Commons founder Ruchi Sanghvi. If anyone could foresee the need for Vitess implementation services, it’s these leaders, who’ve dealt with scaling headaches at tech’s top companies.

But how can a scrappy startup challenge the tech juggernauts for cloud supremacy? First, by actually working with them. The PlanetScale beta that’s now launching lets companies spin up Vitess clusters on its database-as-a-service, on their own infrastructure through a licensing deal, or on AWS, with Google Cloud and Microsoft Azure coming shortly. Once these integrations with the tech giants are established, PlanetScale clients can use it as an interface for a multi-cloud setup where they could keep the master copies of their data on AWS US-West with replicas on Google Cloud in Ireland and elsewhere. That protects companies from becoming dependent on one provider and then getting stuck with price hikes or service problems.

PlanetScale also promises to uphold the principles that undergirded Vitess. “It’s our value that we will keep everything in the query pack completely open source so none of our customers ever have to worry about lock-in,” Vaidya says.

PlanetScale co-founders (from left): Jiten Vaidya and Sugu Sougoumarane

Battle-tested, YouTube-approved

Vaidya and Sougoumarane met 25 years ago while at the Indian Institute of Technology Bombay. Back in 1993 they worked at pioneering database company Informix together before it flamed out. Sougoumarane was eventually hired by Elon Musk as an early engineer for X.com before it got acquired by PayPal, and then left for YouTube. Vaidya was working at Google and the pair were reunited when it bought YouTube and Sougoumarane pulled him onto the team.

“YouTube was growing really quickly and the relationship database they were using with MySQL was sort of falling apart at the seams,” Vaidya recalls. Adding more CPU and memory to the database infra wasn’t cutting it, so the team created Vitess. This horizontal-scaling sharding middleware for MySQL let users segment their database to reduce memory usage while still being able to rapidly run operations. YouTube has smoothly ridden that infrastructure to 1.8 billion users ever since.
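To make the idea of sharding concrete, here is a minimal Python sketch of how a routing layer might map a row's key to one of several MySQL shards. It is not Vitess code: the shard names, the `shard_for` and `route` helpers, and the hash-modulo scheme are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to MySQL instances.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(user_id: int, shards=SHARDS) -> str:
    """Pick a shard from a stable hash of the sharding key."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def route(rows):
    """Group rows by the shard that would store them."""
    placement = {name: [] for name in SHARDS}
    for row in rows:
        placement[shard_for(row["user_id"])].append(row)
    return placement


# Illustrative data: 1,000 rows spread across four shards.
rows = [{"user_id": i, "video": f"v{i}"} for i in range(1000)]
placement = route(rows)
for name, stored in placement.items():
    print(name, len(stored))
```

Each shard ends up holding only a fraction of the rows, which is the property that lets a single query touch a smaller database. A plain modulo scheme like this one makes adding shards expensive, which is one reason production systems tend to route on key ranges instead.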

“Sugu and Mike Solomon invented and made Vitess open source right from the beginning since 2010 because they knew the scaling problem wasn’t just for YouTube, and they’ll be at other companies five or 10 years later trying to solve the same problem,” Vaidya explains. That proved true, and now top apps like Square and HubSpot run entirely on Vitess, with Slack now 30 percent onboard.

Vaidya left YouTube in 2012 and became the lead engineer at Endorse, which got acquired by Dropbox, where he worked for four years. But in the meantime, the engineering community strayed toward MongoDB-style non-relational databases, which Vaidya considers inferior. He sees indexing issues and says that if the system hiccups during an operation, data can become inconsistent, a big problem for banking and commerce apps. “We think horizontally scaled relationship databases are more elegant and are something enterprises really need.”

Database legends reunite

Fed up with the engineering heresy, a year ago Vaidya committed to creating PlanetScale. It’s composed of four core offerings: professional training in Vitess, on-demand support for open-source Vitess users, Vitess database-as-a-service on PlanetScale’s servers and software licensing for clients that want to run Vitess on premises or through other cloud providers. It lets companies re-shard their databases on the fly to relocate user data to comply with regulations like GDPR, safely migrate from other systems without major codebase changes, make on-demand changes and run on Kubernetes.
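The re-sharding idea can be illustrated with a small Python sketch. It assumes rows are routed by key range rather than by modulo, so splitting one shard in two only relocates the rows in that range. The shard names, the 0 to 255 keyspace and the helper functions are hypothetical and are not taken from Vitess or PlanetScale.

```python
import hashlib


def keyspace_id(user_id: int) -> int:
    """Map a key to a stable number in [0, 256)."""
    return hashlib.md5(str(user_id).encode("utf-8")).digest()[0]


def find_shard(kid: int, ranges: dict) -> str:
    """Return the shard whose half-open range [low, high) covers kid."""
    for name, (low, high) in ranges.items():
        if low <= kid < high:
            return name
    raise ValueError(f"no shard covers keyspace id {kid}")


# Before: two shards. After: the "low" shard is split in two.
before = {"low": (0, 128), "high": (128, 256)}
after = {"low-a": (0, 64), "low-b": (64, 128), "high": (128, 256)}

users = range(2000)
moved = [
    u for u in users
    if find_shard(keyspace_id(u), before) != find_shard(keyspace_id(u), after)
]
untouched = [u for u in users if find_shard(keyspace_id(u), before) == "high"]
print(len(moved), "rows relocate;", len(untouched), "stay where they are")
```

Only the rows that lived on the split shard need to be copied, while everything on the other shard keeps serving traffic untouched.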


PlanetScale’s customers now include Indonesian e-commerce giant Bukalapak, and it’s helping Booking.com, GitHub and New Relic migrate to open-source Vitess. Growth is suddenly ramping up due to inbound inquiries. Last month, around when Square Cash became the No. 1 app, its engineering team published a blog post extolling the virtues of Vitess. Now everyone’s seeking help with Vitess sharding, and PlanetScale is waiting with open arms. “Jiten and Sugu are legends and know firsthand what companies require to be successful in this booming data landscape,” says Ilya Kirnos, founding partner and CTO of SignalFire.

The big cloud providers are trying to adapt to the relational database trend, with Google’s Cloud Spanner and Cloud SQL, and Amazon’s RDS and Aurora. Their huge networks and marketing war chests could pose a threat. But Vaidya insists that while it might be easy to get data into these systems, it can be a pain to get it out. PlanetScale is designed to give customers freedom of optionality through its multi-cloud functionality so their eggs aren’t all in one basket.

Finding product market fit is tough enough. Trying to suddenly scale a popular app while also dealing with all the other challenges of growing a company can drive founders crazy. But if it’s good enough for YouTube, startups can trust PlanetScale to make databases one less thing they have to worry about.

Nov 28, 2018

AWS tries to lure Windows users with Amazon FSx for Windows File Server

Amazon has had storage options for Linux file servers for some time, but it recognizes that a number of companies still use Windows file servers, and it is not content to cede that market to Microsoft. Today the company announced Amazon FSx for Windows File Server to provide a fully compatible Windows option.

“You get a native Windows file system backed by fully-managed Windows file servers, accessible via the widely adopted SMB (Server Message Block) protocol. Built on SSD storage, Amazon FSx for Windows File Server delivers the throughput, IOPS, and consistent sub-millisecond performance that you (and your Windows applications) expect,” AWS’s Jeff Barr wrote in a blog post introducing the new feature.

That means if you use this service, you have a first-class Windows system with all of the compatibility with Windows services that you would expect, such as Active Directory and Windows Explorer.

AWS CEO Andy Jassy introduced the new feature today at AWS re:Invent, the company’s customer conference going on in Las Vegas this week. He said that even though Windows File Server usage is diminishing as more IT pros turn to Linux, there are still a fair number of customers who want a Windows-compatible system, and AWS wanted to provide a service for them to move their Windows files to the cloud.

Of course, it doesn’t hurt that it provides a path for Microsoft customers to use AWS instead of turning to Azure for these workloads. Companies undertaking a multi-cloud strategy should like having a fully compatible option.


Oct 25, 2018

Dropbox expands Paper into planning tool with timelines

Dropbox has been building out Paper, its document-driven collaboration tool, since it was first announced in 2015, slowly but surely layering on more functionality. Today, it added a timeline feature, pushing beyond collaboration into a lightweight project planning tool.

Dropbox has been hearing from customers that Paper lacked a way to plan projects. “That pain—the pain of coordinating all those moving pieces—is one we’re taking on today with our new timelines feature in Dropbox Paper,” the company wrote in a blog post announcing the new feature.

As you would expect with such a tool, it enables you to build a timeline with milestones, but being built into Paper, you can assign team members to each milestone and add notes with additional information including links to related documents.

You can also embed a to-do list for the person assigned to a task right in the timeline to help them complete the given task, giving a single point of access for all the people assigned to a project.


“Features like to-dos, @mentions, and due dates give team members easy ways to coordinate projects with each other. Timelines take these capabilities one step further, letting any team member create a clean visual representation of what’s happening when—and who’s responsible,” Dropbox wrote in the blog post announcement.

Dropbox has recognized it cannot live as simply a content storage tool. It needs to expand beyond that into collaboration and coordination around that content, and that’s what Dropbox Paper has been about. By adding timelines, the company is looking to expand that capability even further.

Alan Lepofsky, who covers the “future of work” for Constellation Research sees Paper as part of the changing face of collaboration tools. “I refer to the new breed of content creation tools as digital canvases. These apps simplify the user experience of integrating content from multiple sources. They are evolving the word-processor paradigm,” Lepofsky told TechCrunch.

It’s probably not going to replace a project manager’s full-blown planning tools any time soon, but it at least has the potential to be a useful adjunct for the Paper arsenal, allowing customers to continue to find ways to extract value from the content they store in Dropbox.

Oct 10, 2018

Google introduces dual-region storage buckets to simplify data redundancy

Google is playing catch-up in the cloud, and as such it wants to provide flexibility to differentiate itself from AWS and Microsoft. Today, the company announced a couple of new options to help separate it from the cloud storage pack.

Storage may seem stodgy, but it’s a primary building block for many cloud applications. Before you can build an application you need the data that will drive it, and that’s where the storage component comes into play.

One of the issues companies have as they move data to the cloud is making sure it stays close to the application when it’s needed to reduce latency. Customers also require redundancy in the event of a catastrophic failure, but still need access with low latency. The latter has been a hard problem to solve until today when Google introduced a new dual-regional storage option.

As Google described it in the blog post announcing the new feature, “With this new option, you write to a single dual-regional bucket without having to manually copy data between primary and secondary locations. No replication tool is needed to do this and there are no network charges associated with replicating the data, which means less overhead for you storage administrators out there. In the event of a region failure, we transparently handle the failover and ensure continuity for your users and applications accessing data in Cloud Storage.”

This allows companies to have redundancy with low latency, while controlling where their data lives without having to manually move it should the need arise.

Knowing what you’re paying

Companies don’t always require instant access to data, and Google (and other cloud vendors) offer a variety of storage options, making it cheaper to store and retrieve archived data. As of today, Google is offering a clear way to determine costs, based on customer storage choice types. While it might not seem revolutionary to let customers know what they are paying, Dominic Preuss, Google’s director of product management, says it hasn’t always been a simple matter to calculate these kinds of costs in the cloud. Google decided to simplify it by clearly outlining the costs for medium (Nearline) and long-term (Coldline) storage across multiple regions.

As Google describes it, “With multi-regional Nearline and Coldline storage, you can access your data with millisecond latency, it’s distributed redundantly across a multi-region (U.S., EU or Asia), and you pay archival prices. This is helpful when you have data that won’t be accessed very often, but still needs to be protected with geographically dispersed copies, like media archives or regulated content. It also simplifies management.”

Under the new plan, you can select the type of storage you need, the kind of regional coverage you want and you can see exactly what you are paying.

Google Cloud storage pricing options. Chart: Google

Each of these new storage services has been designed to provide additional options for Google Cloud customers, giving them more transparency around pricing and flexibility and control over storage types, regions and the way they deal with redundancy across data stores.

Oct 10, 2018

Egnyte hauls in $75M investment led by Goldman Sachs

Egnyte launched in 2007 just two years after Box, but unlike its enterprise counterpart, which went all-cloud and raised hundreds of millions of dollars, Egnyte saw a different path with a slow and steady growth strategy and a hybrid niche, recognizing that companies were going to keep some content in the cloud and some on prem. Up until today it had raised a rather modest $62.5 million, and hadn’t taken a dime since 2013, but that all changed when the company announced a whopping $75 million investment.

The entire round came from a single investor, Goldman Sachs’ Private Capital Investing arm, a part of Goldman’s Special Situations group. Holger Staude, vice president of Goldman Sachs Private Capital Investing, will join Egnyte’s board under the terms of the deal. He says Goldman liked what it saw, a steady company poised for bigger growth with the right influx of capital. In fact, the company has had more than eight straight quarters of growth and has been cash flow positive since Q4 of 2016.

“We were impressed by the strong management team and the company’s fiscal discipline, having grown their top line rapidly without requiring significant outside capital for the past several years. They have created a strong business model that we believe can be replicated with success at a much larger scale,” Staude explained.

Company CEO Vineet Jain helped start the company as a way to store and share files in a business context, but over the years, he has built that into a platform that includes security and governance components. Jain also saw a market poised for growth with companies moving increasing amounts of data to the cloud. He felt the time was right to take on more significant outside investment. He said his first step was to build a list of investors, but Goldman stood out.

“Goldman had reached out to us before we even started the fundraising process. There was inbound interest. They were more aggressive compared to others. Given there was prior conversations, the path to closing was shorter,” he said.

He wouldn’t discuss a specific valuation, but did say they have grown 6x since the 2013 round and he got what he described as “a decent valuation.” As for an IPO, he predicted this would be the final round before the company eventually goes public. “This is our last fund raise. At this level of funding, we have more than enough funding to support a growth trajectory to IPO,” he said.

Philosophically, Jain has always believed that it wasn’t necessary to hit the gas until he felt the market was really there. “I started off from a point of view to say, keep building a phenomenal product. Keep focusing on a post sales experience, which is phenomenal to the end user. Everything else will happen. So this is where we are,” he said.

Jain indicated the round isn’t about taking on money for money’s sake. He believes that this is going to fuel a huge growth stage for the company. He doesn’t plan to focus these new resources strictly on the sales and marketing department, as you might expect. He wants to scale every department in the company including engineering, post-sales and customer success.

Today the company has 450 employees and more than 14,000 customers across a range of sizes and sectors including Nasdaq, Thoma Bravo, AppDynamics and Red Bull. The deal closed at the end of last month.

Sep 27, 2018

Dropbox overhauls internal search to improve speed and accuracy

Over the last several months, Dropbox has been undertaking an overhaul of its internal search engine for the first time since 2015. Today, the company announced that the new version, dubbed Nautilus, is ready for the world. The latest search tool takes advantage of a new architecture powered by machine learning to help pinpoint the exact piece of content a user is looking for.

While an individual user may have a much smaller body of documents to search across than the World Wide Web, the paradox of enterprise search says that the fewer documents you have, the harder it is to locate the correct one. Yet Dropbox faces a host of additional challenges when it comes to search. It has more than 500 million users and hundreds of billions of documents, making finding the correct piece for a particular user even more difficult. The company had to take all of this into consideration when it was rebuilding its internal search engine.

One way for the search team to attack a problem of this scale was to bring machine learning to bear on it, but it required more than an underlying level of intelligence to make this work. It also required completely rethinking the entire search tool from an architectural level.

That meant separating two main pieces of the system, indexing and serving. The indexing piece is crucial of course in any search engine. A system of this size and scope needs a fast indexing engine to cover the number of documents in a whirl of changing content. This is the piece that’s hidden behind the scenes. The serving side of the equation is what end users see when they query the search engine, and the system generates a set of results.

Nautilus Architecture Diagram: Dropbox

Dropbox described the indexing system in a blog post announcing the new search engine: “The role of the indexing pipeline is to process file and user activity, extract content and metadata out of it, and create a search index.” They added that the easiest way to index a corpus of documents would be to just keep checking and iterating, but that couldn’t keep up with a system this large and complex, especially one that is focused on a unique set of content for each user (or group of users in the business tool).

They account for that in a couple of ways. They create offline builds every few days, but they also watch as users interact with their content and try to learn from that. As that happens, Dropbox creates what it calls “index mutations,” which they merge with the running indexes from the offline builds to help provide ever more accurate results.
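As a rough illustration of that two-track approach, the Python sketch below builds a tiny inverted index offline and then layers a list of mutations on top of it. The function names, the `(operation, doc_id, text)` mutation format and the sample documents are assumptions made for this example, not details from Dropbox's post.

```python
def build_offline_index(documents):
    """Full rebuild: map each token to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index


def apply_mutations(index, mutations):
    """Return a new index with mutations applied in order.

    Each mutation is a hypothetical (operation, doc_id, text) tuple,
    where operation is "upsert" or "delete".
    """
    merged = {token: set(ids) for token, ids in index.items()}
    for operation, doc_id, text in mutations:
        # Drop any earlier postings for this document first.
        for ids in merged.values():
            ids.discard(doc_id)
        if operation == "upsert":
            for token in text.lower().split():
                merged.setdefault(token, set()).add(doc_id)
    # Remove tokens that no longer point at any document.
    return {token: ids for token, ids in merged.items() if ids}


docs = {"a": "quarterly budget", "b": "launch plan"}
base = build_offline_index(docs)

# Changes observed since the last offline build.
mutations = [("upsert", "c", "budget review"), ("delete", "b", "")]
live = apply_mutations(base, mutations)
print(sorted(live["budget"]))
```

The offline build stays untouched, so it can be swapped out wholesale the next time a full rebuild finishes, while the merged view reflects recent edits in the meantime.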

The indexing process has to take into account the textual content assuming it’s a document, but it also has to look at the underlying metadata as a clue to the content. They use this information to feed a retrieval engine, whose job is to find as many documents as it can, as fast as it can and worry about accuracy later.

It has to make sure it checks all of the repositories. For instance, Dropbox Paper is a separate repository, so the answer could be found there. It also has to take into account the access-level security, only displaying content that the person querying has the right to access.
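The access check can be pictured as a filter that runs over the retrieved candidates before anything is shown. The snippet below is a minimal sketch of that idea in Python. The `visible_results` helper, the access-list structure and the file names are hypothetical.

```python
def visible_results(candidates, acl, user):
    """Keep only the candidates the user may read, preserving their order."""
    return [doc_id for doc_id in candidates if user in acl.get(doc_id, set())]


# Hypothetical candidates pulled from more than one repository.
candidates = ["spec.docx", "salaries.xlsx", "roadmap.paper"]
acl = {
    "spec.docx": {"ana", "ben"},
    "salaries.xlsx": {"ben"},
    "roadmap.paper": {"ana"},
}

print(visible_results(candidates, acl, "ana"))
```

Documents with no access entry are treated as hidden, so a missing record fails closed rather than leaking a result.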

Once it has a set of possible results, it uses machine learning to pinpoint the correct content. “The ranking engine is powered by a [machine learning] model that outputs a score for each document based on a variety of signals. Some signals measure the relevance of the document to the query (e.g., BM25), while others measure the relevance of the document to the user at the current moment in time,” they explained in the blog post.
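BM25, the one signal the post names, is a standard text-relevance formula. The Python sketch below computes it for a handful of made-up documents. It shows only that single signal, not Dropbox's ranking model, and the `k1` and `b` values are common defaults assumed for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: how many documents contain each term.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a higher weight; the +1 keeps the value positive.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1
            )
            # Length normalization dampens long documents.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


docs = {
    "budget": "quarterly budget review for the marketing budget",
    "plan": "launch plan and timeline",
    "notes": "meeting notes about the launch",
}
scores = bm25_scores("marketing budget", docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In a system like the one described, a score of this kind would be one input among several, combined with signals about the user and the moment rather than used alone.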

After the system has a list of potential candidates, it ranks them and displays the results for the end user in the search interface, but a lot of work goes into that from the moment the user types the query until it displays a set of potential files. This new system is designed to make that process as fast and accurate as possible.

Sep 24, 2018

Microsoft Azure gets new high-performance storage options

Microsoft Azure is getting a number of new storage options today that mostly focus on use cases where disk performance matters.

The first of these is Azure Ultra SSD Managed Disks, which are now in public preview. Microsoft says that these drives will offer “sub-millisecond latency,” which unsurprisingly makes them ideal for workloads where latency matters.

Earlier this year, Microsoft launched its Premium and Standard SSD Managed Disks offerings for Azure into preview. These ‘ultra’ SSDs represent the next tier up from the Premium SSDs with even lower latency and higher throughput. They’ll offer 160,000 IOPS with less than a millisecond of read/write latency. These disks will come in sizes ranging from 4GB to 64TB.

And talking about Standard SSD Managed Disks, this service is now generally available after only three months in preview. To top things off, all of Azure’s storage tiers (Premium and Standard SSD, as well as Standard HDD) now offer 8, 16 and 32 TB storage capacity.

Also new today is Azure Premium Files, which is now in preview. This, too, is an SSD-based service. Azure Files itself isn’t new, though. It offers users access to cloud storage using the standard SMB protocol. This new premium offering promises higher throughput and lower latency for these kinds of SMB operations.


Sep 24, 2018

Microsoft wants to put your data in a box

AWS has its Snowball (and Snowmobile truck), Google Cloud has its data transfer appliance and Microsoft has its Azure Data Box. All of these are physical appliances that allow enterprises to ship lots of data to the cloud by uploading it into these machines and then shipping them to the cloud. Microsoft’s Azure Data Box launched into preview about a year ago and today, the company is announcing a number of updates and adding a few new boxes, too.

First of all, the standard 50-pound, 100-terabyte Data Box is now generally available. If you’ve got a lot of data to transfer to the cloud — or maybe collect a lot of offline data — then FedEx will happily pick this one up and Microsoft will upload the data to Azure and charge you for your storage allotment.

If you’ve got a lot more data, though, then Microsoft now also offers the Azure Data Box Heavy. This new box, which is now in preview, can hold up to one petabyte of data. Microsoft did not say how heavy the Data Box Heavy is, though.
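A quick back-of-the-envelope calculation shows why shipping a box can beat the network. The Python sketch below estimates how long the same volumes would take to upload. The 1 Gbps link speed, the decimal terabytes and the assumption of a fully saturated link are illustrative choices, not figures from Microsoft.

```python
def upload_days(terabytes, link_gbps, efficiency=1.0):
    """Days needed to push a volume over a link, ignoring protocol overhead.

    Uses decimal units: 1 TB = 10**12 bytes, 1 Gbps = 10**9 bits per second.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86400


# 100 TB Data Box versus a 1 PB Data Box Heavy over an assumed 1 Gbps link.
print(f"100 TB: {upload_days(100, 1):.1f} days")
print(f"1 PB:   {upload_days(1000, 1):.1f} days")
```

Even with a dedicated link running flat out, the larger transfer takes roughly three months, which is the gap these appliances are built to close.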

Also new is the Azure Data Box Edge, which is now also in preview. In many ways, this is the most interesting of the additions since it goes well beyond transporting data. As the name implies, Data Box Edge is meant for edge deployments where a company collects data. What makes this version stand out is that it’s basically a small data center rack that lets you process data as it comes in. It even includes an FPGA to run AI algorithms at the edge.

Using this box, enterprises can collect the data, transform and analyze it on the box, and then send it to Azure over the network (and not in a truck). Using this, users can cut back on bandwidth cost and don’t have to send all of their data to the cloud for processing.

Also part of the same Data Box family is the Data Box Gateway. This is a virtual appliance, however, that runs on Hyper-V and VMware and lets users create a data transfer gateway for importing data into Azure. That’s not quite as interesting as a hardware appliance but useful nonetheless.


Sep 03, 2018

Dropbox drops some enhancements to Paper collaboration layer

When you’re primarily a storage company with enterprise aspirations, as Dropbox is, you need a layer to help people use the content in your system beyond simple file sharing. That’s why Dropbox created Paper, to act as that missing collaboration layer. The company announced some enhancements to Paper to keep people working in its collaboration tool without having to switch programs.

“Paper is Dropbox’s collaborative workspace for teams. It includes features where users can work together, assign owners to tasks with due dates and embed rich content like video, sound, photos from Youtube, SoundCloud, Pinterest and others,” a Dropbox spokesperson told TechCrunch.

With today’s enhancements you can paste a number of elements into Paper and get live previews. For starters, they are letting you link to a Dropbox folder in Paper, where you can view the files inside the folder, even navigating any sub-folders. When the documents in the folder change, Paper updates the preview automatically because the folder is actually a live link to the Dropbox folder. This one seems like a table stakes feature for a company like Dropbox.


In addition, Paper now supports Airtable, a kind of souped-up spreadsheet. With the new enhancement, you just grab an Airtable embed code and drop it into Paper. From there, you can see a preview in whatever Airtable view you’ve saved for the table.

Finally, Paper now supports Lucidchart. As with Airtable and folders, you simply paste the link and you can see a live preview inside Paper. If the original chart changes, updates are reflected automatically in the Paper preview.

By now, it’s clear that workers want to maintain focus and not be constantly switching between programs. It’s why Box created the recently announced Activity Stream and Recommended Apps. It’s why Slack has become so popular inside enterprises. These tools provide a way to share content from different enterprise apps without having to open a bunch of tabs or separate apps.

Dropbox Paper is also about giving workers a central place to do their work where they can pull live content previews from different apps without having to work in a bunch of content silos. Dropbox is trying to push that idea along for its enterprise customers with today’s enhancements.

Aug 29, 2018

Storage provider Cloudian raises $94M

Cloudian, a company that specializes in helping businesses store petabytes of data, today announced that it has raised a $94 million Series E funding round. Investors in this round, which is one of the largest we have seen for a storage vendor, include Digital Alpha, Fidelity Eight Roads, Goldman Sachs, INCJ, JPIC (Japan Post Investment Corporation), NTT DOCOMO Ventures and WS Investments. This round includes a $25 million investment from Digital Alpha, which was first announced earlier this year.

With this, the seven-year-old company has now raised a total of $174 million.

As the company told me, it now has about 160 employees and 240 enterprise customers. Cloudian has found its sweet spot in managing the large video archives of entertainment companies, but its customers also include healthcare companies, automobile manufacturers and Formula One teams.

What’s important to stress here is that Cloudian’s focus is on on-premises storage, not cloud storage, though it does offer support for multi-cloud data management, as well. “Data tends to be most effectively used close to where it is created and close to where it’s being used,” Cloudian VP of worldwide sales Jon Ash told me. “That’s because of latency, because of network traffic. You can almost always get better performance, better control over your data if it is being stored close to where it’s being used.” He also noted that it’s often costly and complex to move that data elsewhere, especially when you’re talking about the large amounts of information that Cloudian’s customers need to manage.

Unsurprisingly, companies that have this much data now want to use it for machine learning, too, so Cloudian is starting to get into this space, as well. As Cloudian CEO and co-founder Michael Tso also told me, companies are now aware that the data they pull in, whether from IoT sensors, cameras or medical imaging devices, will only become more valuable over time as they try to train their models. If they decide to throw the data away, they run the risk of having nothing with which to train their models.

Cloudian plans to use the new funding to expand its global sales and marketing efforts and increase its engineering team. “We have to invest in engineering and our core technology, as well,” Tso noted. “We have to innovate in new areas like AI.”

As Ash also stressed, Cloudian’s business is really data management — not just storage. “Data is coming from everywhere and it’s going everywhere,” he said. “The old-school storage platforms that were siloed just don’t work anywhere.”
