Apr
10
2019
--

Google launches its coldest storage service yet

At its Cloud Next conference, Google today launched a new archival cold storage service. This new service, which doesn’t seem to have a fancy name, will complement the company’s existing Nearline and Coldline services for storing vast amounts of infrequently used data at an affordable low cost.

The new archive class takes this one step further, though. It’s cheap, with prices starting at $0.0012 per gigabyte per month, which works out to about $1.23 per terabyte per month.

The new service will become available later this year.

What makes Google’s cold storage different from the likes of AWS S3 Glacier, for example, is that the data is immediately available, with millisecond latency. Glacier and similar services typically make you wait a significant amount of time before the data can be used. Indeed, in a thinly veiled swipe at AWS, Google directors of product management Dominic Preuss and Dave Nettleton note that “unlike tape and other glacially slow equivalents, we have taken an approach that eliminates the need for a separate retrieval process and provides immediate, low-latency access to your content.”

To put that into context, a gigabyte stored in AWS Glacier will set you back $0.004 per month. AWS offers another option, though: AWS Glacier Deep Archive. This service recently went live, at a cost of $0.00099 per gigabyte per month, though with significantly longer retrieval times.

Google’s new object storage service uses the same APIs as Google’s other storage classes, and Google promises that the data is always redundantly stored across availability zones, with eleven 9’s of annual durability.
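To illustrate what “same APIs” means in practice, here is a minimal, hypothetical sketch using the existing google-cloud-storage Python client; the "ARCHIVE" storage-class name and the bucket name are assumptions, since Google hadn’t published an official name for the new class at the time:

```python
# Hypothetical sketch: creating a bucket whose default storage class is the new
# archive tier, using the same client already used for Nearline and Coldline.
# The class name "ARCHIVE" and the bucket name are assumptions.
from google.cloud import storage

client = storage.Client()

bucket = client.bucket("example-archive-bucket")   # hypothetical bucket name
bucket.storage_class = "ARCHIVE"                   # same field used for "NEARLINE"/"COLDLINE"
bucket = client.create_bucket(bucket, location="US")

# Reads go through the ordinary blob API -- no separate retrieval job is needed.
blob = bucket.blob("backups/2019-04-10.tar.gz")
blob.upload_from_filename("2019-04-10.tar.gz")
print(blob.download_as_bytes()[:16])
```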

In a press conference ahead of today’s official announcement, Preuss noted that this service is mostly a replacement for on-premises tape backups. But as many enterprises now try to keep as much data as they can, for example to later train their machine learning models, the amount of fresh data that needs to be stored for the long term continues to increase rapidly, too.

With low latency and the promise of high availability, there obviously has to be a drawback here, otherwise Google wouldn’t (and couldn’t) offer this service at this price. “Just like when you’re going from our standard [storage] class to Nearline or Coldline, there’s a committed amount of time that you have to remain in that class,” Preuss explained. “So basically, to get a lower price you are committing to keep the data in the Google Cloud Storage bucket for a period of time.”

Correction: a previous version of the post said that AWS Glacier Deep Archive wasn’t available yet when it actually went live two weeks ago. We changed the post to reflect this. 

Mar
20
2019
--

Portworx raises $27M Series C for its cloud-native data management platform

As enterprises adopt cloud-native technologies like containers to build their applications, the next question they often have to ask themselves is how they adapt their data storage and management practices to this new reality, too. One of the companies in this business is the four-year-old Portworx, which has managed to attract customers like Lufthansa Systems, GE Digital and HPE with its cloud-native storage and data-management platform for the Kubernetes container orchestration platform.

Portworx today announced that it has raised a $27 million Series C funding round led by Sapphire Ventures and the ventures arm of Abu Dhabi’s Mubadala Investment Company. Existing investors Mayfield Fund and GE Ventures also participated, as well as new investors Cisco, HPE and NetApp, which clearly have a strategic interest in bringing Portworx’s storage offering to their own customers and in partnering with the company.

Portworx’s tools make it easier for developers to migrate data, create backups and recover them after an issue. The service supports most popular databases, including Cassandra, Redis and MySQL, but also other storage services. Essentially, it creates a storage layer for database containers or other stateful containers that your apps can then access, no matter where they run or where the data resides.
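As a rough illustration of how an application team would consume such a storage layer, here is a hedged sketch that requests a Portworx-backed volume through the standard Kubernetes API; the StorageClass name is hypothetical and would in practice be defined by a cluster’s Portworx installation:

```python
# Minimal sketch: requesting a Portworx-backed volume through the standard
# Kubernetes API. The StorageClass name "portworx-db-sc" is hypothetical.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="cassandra-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="portworx-db-sc",
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

# The claim can then be mounted by a stateful container (e.g. a Cassandra pod),
# regardless of which node or cloud the underlying replicas live on.
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```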

“As the cloud-native stack matures, Portworx’s leadership in the data layer is really what is highlighted by our funding,” Portworx CEO and co-founder Murli Thirumale told me. “We clearly have a significant number of customers, there is a lot of customer growth, our partner network is growing. What you are seeing is that within that cloud-native ecosystem, we have the maximum number of production deployments and that momentum is something we’re continuing to fuel and fund with this round.”

Thirumale said the company expanded its customer base by over 100 percent last year and increased its total bookings by 376 percent year over year. That’s obviously the kind of growth that investors want to see. He noted, though, that the company wasn’t under any pressure to raise at this point. “We were seeing such strong growth momentum that we knew we need the money to fuel the growth.” That means expanding the company’s sales force, especially internationally, as well as its support team to help new customers manage their data lifecycle.

In addition to today’s funding round, Portworx also announced the latest version of its flagship Portworx Enterprise platform, which now includes new data security and disaster recovery functions. These include improved role-based access controls that go beyond what Kubernetes traditionally offers (and that integrate with existing enterprise systems). The new disaster recovery tools let enterprises make incremental backups to data centers in different geographic locations. Perhaps more importantly, Portworx now also lets users automatically save data to two nearby data center zones as updates happen. That’s meant to enable use cases where even minimal data loss would be unacceptable in the case of an outage. With this, a company could automatically replicate data from a database that sits in Azure Germany Central to AWS Europe Frankfurt, for example.

Jan
24
2019
--

Humio raises $9M Series A for its real-time log analysis platform

Humio, a startup that provides a real-time log analysis platform for on-premises and cloud infrastructures, today announced that it has raised a $9 million Series A round led by Accel. It previously raised its seed round from WestHill and Trifork.

The company, which has offices in San Francisco, the U.K. and Denmark, tells me that it saw a 13x increase in its annual revenue in 2018. Current customers include Bloomberg, Microsoft and Netlify.

“We are experiencing a fundamental shift in how companies build, manage and run their systems,” said Humio CEO Geeta Schmidt. “This shift is driven by the urgency to adopt cloud-based and microservice-driven application architectures for faster development cycles, and dealing with sophisticated security threats. These customer requirements demand a next-generation logging solution that can provide live system observability and efficiently store the massive amounts of log data they are generating.”

To offer them this solution, Humio raised this round with an eye toward fulfilling the demand for its service, expanding its research and development teams and moving into more markets across the globe.

As Schmidt also noted, many organizations are rather frustrated by the log management and analytics solutions they currently have in place. “Common frustrations we hear are that legacy tools are too slow — on ingestion, searches and visualizations — with complex and costly licensing models,” she said. “Ops teams want to focus on operations — not building, running and maintaining their log management platform.”

To build this next-generation analysis tool, Humio built its own time-series database engine to ingest the data, with open-source technologies like Scala, Elm and Kafka in the backend. As data enters the pipeline, it’s pushed through live searches and then stored for later queries. As Humio VP of Engineering Christian Hvitved tells me, though, running ad-hoc queries is the exception; most users only do so when they encounter bugs or a DDoS attack.
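As a purely conceptual sketch (not Humio’s actual code), the push model can be pictured like this: incoming events flow through registered live searches as they are ingested and are then appended to storage for later ad-hoc queries:

```python
# Conceptual sketch only (not Humio's implementation): events are pushed through
# registered "live searches" on ingest, then appended to storage so that ad-hoc
# queries remain possible later.
from typing import Callable, Dict, List

Event = Dict[str, str]

live_searches: Dict[str, Callable[[Event], bool]] = {
    "errors": lambda e: e.get("level") == "ERROR",
    "slow_requests": lambda e: float(e.get("duration_ms", 0)) > 500,
}

stored_events: List[Event] = []               # stand-in for the time-series store

def ingest(event: Event) -> None:
    for name, predicate in live_searches.items():
        if predicate(event):
            print(f"[live:{name}] {event}")   # would feed a live dashboard or alert
    stored_events.append(event)               # retained for later ad-hoc queries

ingest({"level": "ERROR", "msg": "disk full", "duration_ms": "12"})
ingest({"level": "INFO", "msg": "ok", "duration_ms": "812"})
```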

The query language used for the live filters is also pretty straightforward. That was a conscious decision, Hvitved said. “If it’s too hard, then users don’t ask the question,” he said. “We’re inspired by the Unix philosophy of using pipes, so in Humio, larger searches are built by combining smaller searches with pipes. This is very familiar to developers and operations people since it is how they are used to using their terminal.”

Humio charges its customers based on how much data they want to ingest and for how long they want to store it. Pricing starts at $200 per month for 30 days of data retention and 2 GB of ingested data.

Dec
13
2018
--

They scaled YouTube — now they’ll shard everyone with PlanetScale

When the former CTOs of YouTube, Facebook and Dropbox seed fund a database startup, you know there’s something special going on under the hood. Jiten Vaidya and Sugu Sougoumarane saved YouTube from a scalability nightmare by inventing and open-sourcing Vitess, a brilliant relational data storage system. But in the decade since working there, the pair have been inundated with requests from tech companies desperate for help building the operational scaffolding needed to actually integrate Vitess.

So today the pair are revealing their new startup PlanetScale, which makes it easy to build multi-cloud databases that handle enormous amounts of information without locking customers into Amazon, Google or Microsoft’s infrastructure. Battle-tested at YouTube, the technology could allow startups to fret less about their backend and focus more on their unique value proposition. “Now they don’t have to reinvent the wheel,” Vaidya tells me. “A lot of companies facing this scaling problem end up solving it badly in-house and now there’s a way to solve that problem by using us to help.”

PlanetScale quietly raised a $3 million seed round in April, led by SignalFire and joined by a who’s who of engineering luminaries. They include YouTube co-founder and CTO Steve Chen, Quora CEO and former Facebook CTO Adam D’Angelo, former Dropbox CTO Aditya Agarwal, PayPal and Affirm co-founder Max Levchin, MuleSoft co-founder and CTO Ross Mason, Google director of engineering Parisa Tabriz and Facebook’s first female engineer and South Park Commons founder Ruchi Sanghvi. If anyone could foresee the need for Vitess implementation services, it’s these leaders, who’ve dealt with scaling headaches at tech’s top companies.

But how can a scrappy startup challenge the tech juggernauts for cloud supremacy? First, by actually working with them. The PlanetScale beta that’s now launching lets companies spin up Vitess clusters on its database-as-a-service, on their own infrastructure through a licensing deal, or on AWS, with Google Cloud and Microsoft Azure support coming shortly. Once these integrations with the tech giants are established, PlanetScale clients can use it as an interface for a multi-cloud setup where they could keep their master copies of data on AWS US-West with replicas on Google Cloud in Ireland and elsewhere. That protects companies from becoming dependent on one provider and then getting stuck with price hikes or service problems.

PlanetScale also promises to uphold the principles that undergirded Vitess. “It’s our value that we will keep everything in the query path completely open source so none of our customers ever have to worry about lock-in,” Vaidya says.

PlanetScale co-founders (from left): Jiten Vaidya and Sugu Sougoumarane

Battle-tested, YouTube-approved

He and Sougoumarane met 25 years ago while at Indian Institute of Technology Bombay. Back in 1993 they worked at pioneering database company Informix together before it flamed out. Sougoumarane was eventually hired by Elon Musk as an early engineer for X.com before it got acquired by PayPal, and then left for YouTube. Vaidya was working at Google and the pair were reunited when it bought YouTube and Sougoumarane pulled him on to the team.

“YouTube was growing really quickly and the relationship database they were using with MySQL was sort of falling apart at the seams,” Vaidya recalls. Adding more CPU and memory to the database infra wasn’t cutting it, so the team created Vitess. The horizontal scaling sharding middleware for MySQL let users segment their database to reduce memory usage while still being able to rapidly run operations. YouTube has smoothly ridden that infrastructure to 1.8 billion users ever since.
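Because Vitess presents itself to applications as ordinary MySQL, a client simply connects to its vtgate proxy over the MySQL wire protocol and lets the middleware route queries to the right shard. The sketch below assumes hypothetical connection details; none of these values come from the article itself:

```python
# Illustrative sketch: an application talks to Vitess's vtgate proxy over the
# ordinary MySQL protocol, and the sharding stays invisible. Host, port, user
# and keyspace below are assumptions.
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(
    host="vtgate.example.internal",  # vtgate proxy, not an individual MySQL shard
    port=3306,
    user="app",
    password="secret",
    database="commerce",             # a Vitess keyspace, exposed like a database
)

cursor = conn.cursor()
# vtgate routes this query to the right shard(s) based on the keyspace's
# sharding configuration.
cursor.execute("SELECT id, email FROM users WHERE id = %s", (42,))
print(cursor.fetchone())
conn.close()
```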

“Sugu and Mike Solomon invented and made Vitess open source right from the beginning since 2010 because they knew the scaling problem wasn’t just for YouTube, and they’ll be at other companies five or 10 years later trying to solve the same problem,” Vaidya explains. That proved true, and now top apps like Square and HubSpot run entirely on Vitess, with Slack now 30 percent onboard.

Vaidya left YouTube in 2012 and became the lead engineer at Endorse, which got acquired by Dropbox, where he worked for four years. But in the meantime, the engineering community strayed toward MongoDB-style non-relational databases, which Vaidya considers inferior. He sees indexing issues and says that if the system hiccups during an operation, data can become inconsistent — a big problem for banking and commerce apps. “We think horizontally scaled relational databases are more elegant and are something enterprises really need.”

Database legends reunite

Fed up with the engineering heresy, a year ago Vaidya committed to creating PlanetScale. It’s composed of four core offerings: professional training in Vitess, on-demand support for open-source Vitess users, Vitess database-as-a-service on PlanetScale’s servers and software licensing for clients that want to run Vitess on premises or through other cloud providers. It lets companies re-shard their databases on the fly to relocate user data to comply with regulations like GDPR, safely migrate from other systems without major codebase changes, make on-demand changes and run on Kubernetes.

The PlanetScale team

PlanetScale’s customers now include Indonesian e-commerce giant Bukalapak, and it’s helping Booking.com, GitHub and New Relic migrate to open-source Vitess. Growth is suddenly ramping up due to inbound inquiries. Last month, around the time Square Cash became the No. 1 app, its engineering team published a blog post extolling the virtues of Vitess. Now everyone’s seeking help with Vitess sharding, and PlanetScale is waiting with open arms. “Jiten and Sugu are legends and know firsthand what companies require to be successful in this booming data landscape,” says Ilya Kirnos, founding partner and CTO of SignalFire.

The big cloud providers are trying to adapt to the relational database trend with Google’s Cloud Spanner and Cloud SQL, and Amazon’s RDS and Aurora. Their huge networks and marketing war chests could pose a threat. But Vaidya insists that while it might be easy to get data into these systems, it can be a pain to get it out. PlanetScale is designed to give customers optionality through its multi-cloud functionality so their eggs aren’t all in one basket.

Finding product market fit is tough enough. Trying to suddenly scale a popular app while also dealing with all the other challenges of growing a company can drive founders crazy. But if it’s good enough for YouTube, startups can trust PlanetScale to make databases one less thing they have to worry about.

Nov
28
2018
--

AWS tries to lure Windows users with Amazon FSx for Windows File Server

Amazon has had storage options for Linux file servers for some time, but it recognizes that a number of companies still use Windows file servers, and it is not content to cede that market to Microsoft. Today the company announced Amazon FSx for Windows File Server to provide a fully compatible Windows option.

“You get a native Windows file system backed by fully-managed Windows file servers, accessible via the widely adopted SMB (Server Message Block) protocol. Built on SSD storage, Amazon FSx for Windows File Server delivers the throughput, IOPS, and consistent sub-millisecond performance that you (and your Windows applications) expect,” AWS’s Jeff Barr wrote in a blog post introducing the new feature.

That means if you use this service, you have a first-class Windows system with all of the compatibility with Windows services that you would expect, such as Active Directory and Windows Explorer.
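For a sense of what provisioning looks like, here is a hedged boto3 sketch that creates an FSx for Windows file system; the subnet, security group and directory IDs are placeholders, and the sizing values are illustrative only:

```python
# Hedged sketch: provisioning an FSx for Windows File Server file system with
# boto3. IDs below are placeholders; the resulting share is then mounted from
# Windows clients over SMB (e.g. as a UNC path).
import boto3

fsx = boto3.client("fsx", region_name="us-east-1")

response = fsx.create_file_system(
    FileSystemType="WINDOWS",
    StorageCapacity=300,                      # GiB, illustrative value
    SubnetIds=["subnet-0123456789abcdef0"],   # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],
    WindowsConfiguration={
        "ActiveDirectoryId": "d-1234567890",  # placeholder managed AD directory
        "ThroughputCapacity": 8,              # MB/s
    },
)
print(response["FileSystem"]["FileSystemId"])
```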

AWS CEO Andy Jassy introduced the new feature today at AWS re:Invent, the company’s customer conference going on in Las Vegas this week. He said that even though Windows File Server usage is diminishing as more IT pros turn to Linux, there are still a fair number of customers who want a Windows-compatible system and they wanted to provide a service for them to move their Windows files to the cloud.

Of course, it doesn’t hurt that it provides a path for Microsoft customers to use AWS instead of turning to Azure for these workloads. Companies undertaking a multi-cloud strategy should like having a fully compatible option.


Oct
25
2018
--

Dropbox expands Paper into planning tool with timelines

Dropbox has been building out Paper, its document-driven collaboration tool, since it was first announced in 2015, slowly but surely layering on more functionality. Today, it added a timeline feature, pushing beyond collaboration into a lightweight project planning tool.

Dropbox has been hearing from customers that they needed a way to plan projects in Paper, something that had been lacking. “That pain—the pain of coordinating all those moving pieces—is one we’re taking on today with our new timelines feature in Dropbox Paper,” the company wrote in a blog post announcing the new feature.

As you would expect with such a tool, it enables you to build a timeline with milestones, but because it’s built into Paper, you can also assign team members to each milestone and add notes with additional information, including links to related documents.

You can also embed a to-do list for the person assigned to a task right in the timeline to help them complete the given task, giving a single point of access for everyone assigned to a project.

Gif: Dropbox

“Features like to-dos, @mentions, and due dates give team members easy ways to coordinate projects with each other. Timelines take these capabilities one step further, letting any team member create a clean visual representation of what’s happening when—and who’s responsible,” Dropbox wrote in the blog post announcement.

Dropbox has recognized it cannot live as simply a content storage tool. It needs to expand beyond that into collaboration and coordination around that content, and that’s what Dropbox Paper has been about. By adding timelines, the company is looking to expand that capability even further.

Alan Lepofsky, who covers the “future of work” for Constellation Research, sees Paper as part of the changing face of collaboration tools. “I refer to the new breed of content creation tools as digital canvases. These apps simplify the user experience of integrating content from multiple sources. They are evolving the word-processor paradigm,” Lepofsky told TechCrunch.

It’s probably not going to replace a project manager’s full-blown planning tools any time soon, but it at least has the potential to be a useful addition to the Paper arsenal, allowing customers to continue to find ways to extract value from the content they store in Dropbox.

Oct
10
2018
--

Google introduces dual-region storage buckets to simplify data redundancy

Google is playing catch-up in the cloud, and as such it wants to provide flexibility to differentiate itself from AWS and Microsoft. Today, the company announced a couple of new options to help separate it from the cloud storage pack.

Storage may seem stodgy, but it’s a primary building block for many cloud applications. Before you can build an application you need the data that will drive it, and that’s where the storage component comes into play.

One of the issues companies have as they move data to the cloud is making sure it stays close to the application when it’s needed, to reduce latency. Customers also require redundancy in the event of a catastrophic failure, yet still need low-latency access. Combining the two has been a hard problem to solve, until today, when Google introduced a new dual-region storage option.

As Google described it in the blog post announcing the new feature, “With this new option, you write to a single dual-regional bucket without having to manually copy data between primary and secondary locations. No replication tool is needed to do this and there are no network charges associated with replicating the data, which means less overhead for you storage administrators out there. In the event of a region failure, we transparently handle the failover and ensure continuity for your users and applications accessing data in Cloud Storage.”
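In code, that transparency means a dual-region bucket is used exactly like any other bucket; only its location differs. Below is a hedged sketch with the Python client, using "NAM4" as an assumed example of a paired-region location code and a placeholder bucket name:

```python
# Hedged sketch: writing to a dual-region bucket looks exactly like writing to
# any other Cloud Storage bucket; the redundancy comes from the bucket's
# location. "NAM4" is an example dual-region code; the bucket name is a placeholder.
from google.cloud import storage

client = storage.Client()

bucket = client.bucket("example-dual-region-bucket")
bucket = client.create_bucket(bucket, location="NAM4")  # paired-region location

# A single write; replication across the two regions is handled transparently,
# with no separate replication tooling to run.
bucket.blob("telemetry/2018-10-10.json").upload_from_string('{"ok": true}')
```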

This allows companies to have redundancy with low latency, while controlling where their data is stored, without having to move it manually should the need arise.

Knowing what you’re paying

Companies don’t always require instant access to data, and Google (and other cloud vendors) offer a variety of storage options that make it cheaper to store and retrieve archived data. As of today, Google is offering a clearer way to determine those costs, based on the storage type a customer chooses. While it might not seem revolutionary to let customers know what they are paying, Dominic Preuss, Google’s director of product management, says it hasn’t always been a simple matter to calculate these kinds of costs in the cloud. Google decided to simplify it by clearly outlining the costs for medium-term (Nearline) and long-term (Coldline) storage across multiple regions.

As Google describes it, “With multi-regional Nearline and Coldline storage, you can access your data with millisecond latency, it’s distributed redundantly across a multi-region (U.S., EU or Asia), and you pay archival prices. This is helpful when you have data that won’t be accessed very often, but still needs to be protected with geographically dispersed copies, like media archives or regulated content. It also simplifies management.”

Under the new plan, you can select the type of storage you need and the kind of regional coverage you want, and see exactly what you are paying.

Google Cloud storage pricing options. Chart: Google

Each of these new storage services has been designed to provide additional options for Google Cloud customers, giving them more transparency around pricing and flexibility and control over storage types, regions and the way they deal with redundancy across data stores.

Oct
10
2018
--

Egnyte hauls in $75M investment led by Goldman Sachs

Egnyte launched in 2007 just two years after Box, but unlike its enterprise counterpart, which went all-cloud and raised hundreds of millions of dollars, Egnyte saw a different path with a slow and steady growth strategy and a hybrid niche, recognizing that companies were going to keep some content in the cloud and some on prem. Up until today it had raised a rather modest $62.5 million, and hadn’t taken a dime since 2013, but that all changed when the company announced a whopping $75 million investment.

The entire round came from a single investor, Goldman Sachs’ Private Capital Investing arm, part of Goldman’s Special Situations group. Holger Staude, vice president of Goldman Sachs Private Capital Investing, will join Egnyte’s board under the terms of the deal. He says Goldman liked what it saw: a steady company poised for bigger growth with the right influx of capital. In fact, the company has had more than eight straight quarters of growth and has been cash flow positive since Q4 2016.

“We were impressed by the strong management team and the company’s fiscal discipline, having grown their top line rapidly without requiring significant outside capital for the past several years. They have created a strong business model that we believe can be replicated with success at a much larger scale,” Staude explained.

Company CEO Vineet Jain helped start the company as a way to store and share files in a business context, but over the years he has built that into a platform that includes security and governance components. Jain also saw a market poised for growth as companies move increasing amounts of data to the cloud, and he felt the time was right to take on more significant outside investment. His first step, he said, was to build a list of investors, but Goldman shone through.

“Goldman had reached out to us before we even started the fundraising process. There was inbound interest. They were more aggressive compared to others. Given there was prior conversations, the path to closing was shorter,” he said.

He wouldn’t discuss a specific valuation, but did say they have grown 6x since the 2013 round and he got what he described as “a decent valuation.” As for an IPO, he predicted this would be the final round before the company eventually goes public. “This is our last fund raise. At this level of funding, we have more than enough funding to support a growth trajectory to IPO,” he said.

Philosophically, Jain has always believed that it wasn’t necessary to hit the gas until he felt the market was really there. “I started off from a point of view to say, keep building a phenomenal product. Keep focusing on a post sales experience, which is phenomenal to the end user. Everything else will happen. So this is where we are,” he said.

Jain indicated the round isn’t about taking on money for money’s sake. He believes it is going to fuel a huge growth stage for the company. He doesn’t plan to focus these new resources strictly on the sales and marketing department, as you might expect. He wants to scale every department in the company, including engineering, post-sales and customer success.

Today the company has 450 employees and more than 14,000 customers across a range of sizes and sectors including Nasdaq, Thoma Bravo, AppDynamics and Red Bull. The deal closed at the end of last month.

Sep
27
2018
--

Dropbox overhauls internal search to improve speed and accuracy

Over the last several months, Dropbox has been undertaking an overhaul of its internal search engine for the first time since 2015. Today, the company announced that the new version, dubbed Nautilus, is ready for the world. The latest search tool takes advantage of a new architecture powered by machine learning to help pinpoint the exact piece of content a user is looking for.

While an individual user may have a much smaller body of documents to search across than the World Wide Web, the paradox of enterprise search says that the fewer documents you have, the harder it is to locate the correct one. Yet Dropbox faces a host of additional challenges when it comes to search. It has more than 500 million users and hundreds of billions of documents, making finding the correct piece of content for a particular user even more difficult. The company had to take all of this into consideration when it was rebuilding its internal search engine.

One way for the search team to attack a problem of this scale was to bring machine learning to bear on it, but it required more than an underlying level of intelligence to make this work. It also required completely rethinking the entire search tool at an architectural level.

That meant separating the two main pieces of the system: indexing and serving. The indexing piece is, of course, crucial in any search engine. A system of this size and scope needs a fast indexing engine to keep up with a vast and constantly changing body of content. This is the piece that’s hidden behind the scenes. The serving side of the equation is what end users see when they query the search engine and the system generates a set of results.

Nautilus Architecture Diagram: Dropbox

Dropbox described the indexing system in a blog post announcing the new search engine: “The role of the indexing pipeline is to process file and user activity, extract content and metadata out of it, and create a search index.” They added that the easiest way to index a corpus of documents would be to just keep checking and iterating, but that couldn’t keep up with a system this large and complex, especially one that is focused on a unique set of content for each user (or group of users in the business tool).

They account for that in a couple of ways. They create offline builds every few days, but they also watch as users interact with their content and try to learn from that. As that happens, Dropbox creates what it calls “index mutations,” which it merges with the running indexes from the offline builds to help provide ever more accurate results.
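Conceptually (and this is only an illustrative sketch, not Dropbox’s implementation), merging index mutations with an offline build can be pictured as replaying a small mutation log on top of a periodically rebuilt inverted index:

```python
# Illustrative sketch only (not Dropbox's code): an offline-built inverted index
# plus a log of "index mutations" recorded as users add, edit or delete files,
# replayed at query time so results stay fresh between offline rebuilds.
offline_index = {                  # rebuilt every few days
    "invoice": {"doc1", "doc7"},
    "budget": {"doc3"},
}
mutations = [                      # produced as users interact with their content
    ("add", "invoice", "doc9"),
    ("delete", "invoice", "doc1"),
]

def current_postings(term: str) -> set:
    postings = set(offline_index.get(term, set()))
    for op, t, doc in mutations:   # replay the mutation log on top of the build
        if t != term:
            continue
        if op == "add":
            postings.add(doc)
        else:
            postings.discard(doc)
    return postings

print(current_postings("invoice"))  # {'doc7', 'doc9'}
```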

The indexing process has to take into account the textual content, assuming it’s a document, but it also has to look at the underlying metadata as a clue to the content. This information feeds a retrieval engine, whose job is to find as many candidate documents as it can, as fast as it can, and worry about accuracy later.

It has to make sure it checks all of the repositories. For instance, Dropbox Paper is a separate repository, so the answer could be found there. It also has to take into account the access-level security, only displaying content that the person querying has the right to access.

Once it has a set of possible results, it uses machine learning to pinpoint the correct content. “The ranking engine is powered by a [machine learning] model that outputs a score for each document based on a variety of signals. Some signals measure the relevance of the document to the query (e.g., BM25), while others measure the relevance of the document to the user at the current moment in time,” they explained in the blog post.
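BM25 is a standard textbook relevance signal, so a minimal sketch of it may help; this is the classic formula only, not Dropbox’s actual ranking model, and the sample documents are made up:

```python
# Minimal BM25 sketch, as one example of a query-relevance signal; this is the
# textbook formula, not Dropbox's ranking model.
import math

def bm25(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one document (a list of terms) against a query over a small corpus."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)                 # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc_terms.count(term)                               # term frequency
        denom = tf + k1 * (1.0 - b + b * len(doc_terms) / avg_len)
        score += idf * (tf * (k1 + 1.0)) / denom
    return score

docs = [["tax", "report", "2018"], ["holiday", "photos"], ["tax", "receipts"]]
print(bm25(["tax", "report"], docs[0], docs))
```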

After the system has a list of potential candidates, it ranks them and displays the results for the end user in the search interface, but a lot of work goes into that from the moment the user types the query until it displays a set of potential files. This new system is designed to make that process as fast and accurate as possible.

Sep
24
2018
--

Microsoft Azure gets new high-performance storage options

Microsoft Azure is getting a number of new storage options today that mostly focus on use cases where disk performance matters.

The first of these is Azure Ultra SSD Managed Disks, which are now in public preview. Microsoft says that these drives will offer “sub-millisecond latency,” which unsurprisingly makes them ideal for workloads where latency matters.

Earlier this year, Microsoft launched its Premium and Standard SSD Managed Disks offerings for Azure into preview. These ‘ultra’ SSDs represent the next tier up from the Premium SSDs, with even lower latency and higher throughput. They’ll offer 160,000 IOPS with less than a millisecond of read/write latency. These disks will come in sizes ranging from 4GB to 64TB.
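As a hedged sketch of what provisioning such a disk might look like with the Azure Python SDK (the subscription and resource-group values are placeholders, and method names can vary slightly between SDK versions):

```python
# Hedged sketch: creating an Ultra SSD managed disk with the Azure Python SDK.
# Subscription, resource group and sizing values are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = compute.disks.begin_create_or_update(
    "my-resource-group",
    "ultra-data-disk",
    {
        "location": "eastus2",
        "zones": ["1"],                       # ultra disks are zonal
        "sku": {"name": "UltraSSD_LRS"},
        "disk_size_gb": 1024,
        "creation_data": {"create_option": "Empty"},
    },
)
print(poller.result().provisioning_state)
```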

Speaking of Standard SSD Managed Disks: this service is now generally available after only three months in preview. To top things off, all of Azure’s storage tiers (Premium and Standard SSD, as well as Standard HDD) now offer 8, 16 and 32 TB storage capacities.

Also new today is Azure Premium Files, which is now in preview. This, too, is an SSD-based service. Azure Files itself isn’t new, though; it offers users access to cloud storage using the standard SMB protocol. This new premium offering promises higher throughput and lower latency for these kinds of SMB operations.
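For context, here is a hedged sketch of creating and using an Azure Files share from Python; the connection string and share name are placeholders, and the premium (SSD-backed) tier itself is chosen at the storage-account level rather than in this data-plane code:

```python
# Hedged sketch: Azure Files shares are reachable over SMB, but they can also be
# managed and accessed through the data-plane SDK shown here. Connection string
# and share name are placeholders.
from azure.storage.fileshare import ShareClient

share = ShareClient.from_connection_string(
    conn_str="<storage-account-connection-string>",
    share_name="team-share",
)
share.create_share()

file_client = share.get_file_client("q3.txt")
file_client.upload_file(b"quarterly numbers")
print(file_client.download_file().readall())
```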

