Jun
22
2020
--

Hasura launches managed cloud service for its open-source GraphQL API platform

Hasura is an open-source engine that can connect to PostgreSQL databases and microservices across hybrid- and multi-cloud environments and then automatically build a GraphQL API backend for them, making it easier for developers to then build their own data-driven applications on top of this unified API . For a while now, the San Francisco-based startup has offered a paid version (Hasura Pro) with enterprise-ready reliability and security tools, in addition to its free open-source version. Today, the company launched Hasura Cloud, which takes the existing Pro version, adds a number of cloud-specific features like dynamic caching, auto-scaling and consumption-based pricing, and brings those together in a fully managed service.

Image Credits: Hasura

At its core, Hasura’s service promises businesses the ability to bring together data from their various siloed databases and allow their developers to extract value from them through its GraphQL APIs. While GraphQL is still relatively new, the Facebook-incubated technology has quickly become extremely popular among many development teams.

Before founding the company and launching it in 2018, Hasura CEO and co-founder Tanmai Gopal worked for a consulting firm — and like with so many founders, that’s where he got the inspiration for the service.

“One of the key things that we noticed was that in the entire landscape, computing is becoming better, there are better frameworks, it is easier to deploy code, databases are becoming better and they kind of work everywhere,” he said. “But this kind of piece in the middle that is still a bottleneck and that there isn’t really a good solution for is this data access piece.” Almost by default, most companies host data in various SaaS services and databases — and now they were trying to figure out how to develop apps based on this for both internal and external consumers, noted Gopal. “This data distribution problem was this bottleneck where everybody would just spend massive amounts of time and money. And we invented a way of kind of automating that,” he explained.

The choice of GraphQL was also pretty straightforward, especially because GraphQL services are an easy way for developers to consume data (even though, as Gopal noted, it’s not always fun to build the GraphQL service itself). One thing that’s unusual and worth noting about the core Hasura engine itself is that it is written in Haskell, which is a rather unusual choice.

Image Credits: Hasura

The team tells me that Hasura is now nearing 50 million downloads for its free version and the company is seeing large and small users from across various industries relying on its products, which is probably no surprise, given that the company is trying to solve a pretty universal problem around data access and consumption.

Over the last few quarters, the team worked on launching its cloud service. “We’ve been thinking of the cloud in a very different way,” Gopal said. “It’s not your usual, take the open-source solution and host it, like a MongoDB Atlas or Confluent. What we’ve done is we’ve said, we’re going to re-engineer the open-source solution to be entirely multi-tenant and be completely pay-per pricing.”

Given this philosophy, it’s no surprise that Hasura’s pricing is purely based on how much data a user moves through the service. “It’s much closer to our value proposition,” Hasura co-founder and COO Rajoshi Ghosh said. “The value proposition is about data access. The big part of it is the fact that you’re getting this data from your databases. But the very interesting part is that this data can actually come from anywhere. This data could be in your third-party services, part of your data could be living in Stripe and it could be living in Salesforce, and it could be living in other services. […] We’re the data access infrastructure in that sense. And this pricing also — from a mental model perspective — makes it much clearer that that’s the value that we’re adding.”

Now, there are obviously plenty of other data-centric API services on the market, but Gopal argues that Hasura has an advantage because of its advanced caching for dynamic data, for example.

May
12
2020
--

Microsoft partners with Redis Labs to improve its Azure Cache for Redis

For a few years now, Microsoft has offered Azure Cache for Redis, a fully managed caching solution built on top of the open-source Redis project. Today, it is expanding this service by adding Redis Enterprise, Redis Lab’s commercial offering, to its platform. It’s doing so in partnership with Redis Labs and while Microsoft will offer some basic support for the service, Redis Labs will handle most of the software support itself.

Julia Liuson, Microsoft’s corporate VP of its developer tools division, told me that the company wants to be seen as a partner to open-source companies like Redis Labs, which was among the first companies to change its license to prevent cloud vendors from commercializing and repackaging their free code without contributing back to the community. Last year, Redis Labs partnered with Google Cloud to bring its own fully managed service to its platform and so maybe it’s no surprise that we are now seeing Microsoft make a similar move.

Liuson tells me that with this new tier for Azure Cache for Redis, users will get a single bill and native Azure management, as well as the option to deploy natively on SSD flash storage. The native Azure integration should also make it easier for developers on Azure to integrate Redis Enterprise into their applications.

It’s also worth noting that Microsoft will support Redis Labs’ own Redis modules, including RediSearch, a Redis-powered search engine, as well as RedisBloom and RedisTimeSeries, which provide support for new datatypes in Redis.

“For years, developers have utilized the speed and throughput of Redis to produce unbeatable responsiveness and scale in their applications,” says Liuson. “We’ve seen tremendous adoption of Azure Cache for Redis, our managed solution built on open source Redis, as Azure customers have leveraged Redis performance as a distributed cache, session store, and message broker. The incorporation of the Redis Labs Redis Enterprise technology extends the range of use cases in which developers can utilize Redis, while providing enhanced operational resiliency and security.”

Mar
28
2019
--

Vizion.ai launches its managed Elasticsearch service

Setting up Elasticsearch, the open-source system that many companies large and small use to power their distributed search and analytics engines, isn’t the hardest thing. What is very hard, though, is to provision the right amount of resources to run the service, especially when your users’ demand comes in spikes, without overpaying for unused capacity. Vizion.ai’s new Elasticsearch Service does away with all of this by essentially offering Elasticsearch as a service and only charging its customers for the infrastructure they use.

Vizion.ai’s service automatically scales up and down as needed. It’s a managed service and delivered as a SaaS platform that can support deployments on both private and public clouds, with full API compatibility with the standard Elastic stack that typically includes tools like Kibana for visualizing data, Beats for sending data to the service and Logstash for transforming the incoming data and setting up data pipelines. Users can easily create several stacks for testing and development, too, for example.

Vizion.ai GM and VP Geoff Tudor

“When you go into the AWS Elasticsearch service, you’re going to be looking at dozens or hundreds of permutations for trying to build your own cluster,” Vision.ai’s VP and GM Geoff Tudor told me. “Which instance size? How many instances? Do I want geographical redundancy? What’s my networking? What’s my security? And if you choose wrong, then that’s going to impact the overall performance. […] We do balancing dynamically behind that infrastructure layer.” To do this, the service looks at the utilization patterns of a given user and then allocates resources to optimize for the specific use case.

What VVizion.ai hasdone here is take some of the work from its parent company Panzura, a multi-cloud storage service for enterprises that has plenty of patents around data caching, and applied it to this new Elasticsearch service.

There are obviously other companies that offer commercial Elasticsearch platforms already. Tudor acknowledges this, but argues that his company’s platform is different. With other products, he argues, you have to decide on the size of your block storage for your metadata upfront, for example, and you typically want SSDs for better performance, which can quickly get expensive. Thanks to Panzura’s IP, Vizion.ai is able to bring down the cost by caching recent data on SSDs and keeping the rest in cheaper object storage pools.

He also noted that the company is positioning the overall Vizion.ai service, with the Elasticsearch service as one of the earliest components, as a platform for running AI and ML workloads. Support for TensorFlow, PredictionIO (which plays nicely with Elasticsearch) and other tools is also in the works. “We want to make this an easy serverless ML/AI consumption in a multi-cloud fashion, where not only can you leverage the compute, but you can also have your storage of record at a very cost-effective price point.”

Sep
19
2018
--

Google’s Cloud Memorystore for Redis is now generally available

After five months in public beta, Google today announced that its Cloud Memorystore for Redis, its fully managed in-memory data store, is now generally available.

The service, which is fully compatible with the Redis protocol, promises to offer sub-millisecond responses for applications that need to use in-memory caching. And because of its compatibility with Redis, developers should be able to easily migrate their applications to this service without making any code changes.

Cloud Memorystore offers two service tiers — a basic one for simple caching and a standard tier for users who need a highly available Redis instance. For the standard tier, Google offers a 99.9 percent availability SLA.

Since it first launched in beta, Google added a few additional capabilities to the service. You can now see your metrics in Stackdriver, for example. Google also added custom IAM roles and improved logging.

As for pricing, Google charges per GB-hour, depending on the service level and capacity you use. You can find the full pricing list here.

Dec
01
2017
--

This Week in Data with Colin Charles 17: AWS Re:Invent, a New Book on MySQL Cluster and Another Call Out for Percona Live 2018

Colin Charles

Colin Charles Open Source Database evangelist for PerconaJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

The CFP for Percona Live Santa Clara 2018 closes December 22, 2017: please consider submitting as soon as possible. We want to make an early announcement of talks, so we’ll definitely do a first pass even before the CFP date closes. Keep in mind the expanded view of what we are after: it’s more than just MySQL and MongoDB. And don’t forget that with one day less, there will be intense competition to fit all the content in.

A new book on MySQL Cluster is out: Pro MySQL NDB Cluster by Jesper Wisborg Krogh and Mikiya Okuno. At 690 pages, it is a weighty tome, and something I fully plan on reading, considering I haven’t played with NDBCLUSTER for quite some time.

Did you know that since MySQL 5.7.17, connection control plugins are included? They help DBAs introduce an increasing delay in server response to clients after a certain number of consecutive failed connection attempts. Read more at the connection control plugins.

While there are a tonne of announcements coming out from the Amazon re:Invent 2017 event, I highly recommend also reading Some data of interest as AWS reinvent 2017 ramps up by James Governor. Telemetry data from sumologic’s 1,500 largest customers suggest that NoSQL database usage has overtaken relational database workloads! Read The State of Modern Applications in the Cloud. Page 8 tells us that MySQL is the #1 database on AWS (I don’t see MariaDB Server being mentioned which is odd; did they lump it in together?), and MySQL, Redis & MongoDB account for 40% of database adoption on AWS. In other news, Andy Jassy also mentions that less than 1.5 months after hitting 40,000 database migrations, they’ve gone past 45,000 over the Thanksgiving holiday last week. Have you started using AWS Database Migration Service?

Releases

Link List

Upcoming appearances

  • ACMUG 2017 gathering – Beijing, China, December 9-10 2017 – it was very exciting being there in 2016, I can only imagine it’s going to be bigger and better in 2017, since it is now two days long!

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

Apr
29
2014
--

ScaleArc: Real-world application testing with WordPress (benchmark test)

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by me and ScaleArc co-founder and chief architect, Uday Sawant.

The goal of this benchmark was to identify ScaleArc’s overhead using a real-world application – the world’s most popular (according to wikipedia) content management system and blog engine: WordPress.

The tests also sought to identify the benefit of caching for this type of workload. The caching parameters represent more real-life circumstances than we applied in the sysbench performance tests – the goal here was not just to saturate the cache. For this reason, we created an artificial WordPress blog with generated data.

The size of the database was roughly 4G. For this particular test, we saw that using ScaleArc introduces very little overhead and caching increased the throughput 3.5 times at peak capacity. In terms of response times, response times on queries for which we had a cache hit decreased substantially. For example, a 5-second main page load became less than 1 second when we had cache hits on certain queries. It’s a bit hard to talk about response time here in general, because WordPress itself has different requests that are associated with different costs (computationally) and which have different response times.

Test description

The pre-generated test database contained the following:

  • 100 users
  • 25 categories
  • 100.000 posts (stories)
  • 300.000 comments (3 per post)

One iteration of the load contained the following:

  • Homepage retrieval
  • 10 story (post) page retrieval
  • 3 category page retrieval
  • Log in as a random user
  • That random user posted a new story and commented on an existing post

We think that the usage pattern is close to reality – most people just visit blogs, but some write posts and comments. For the test, we used WordPress version 3.8.1. We wrote a simple shell script that could do these iterations using multiple processes. Some of this testing pattern, however, is not realistic. Some posts will always have many more comments than others, and some posts won’t have any comments at all. This test doesn’t take that nuance into account, but that doesn’t change the big picture. Choosing a random post to comment on will give us a uniform comment distribution.

We measured 3 scenarios:

  • Direct connection to the database (direct_wp).
  • Connection through ScaleArc without caching.
  • Connection through ScaleArc with caching enabled.

When caching is enabled, queries belonging to comments were cached for 5 minutes, queries belonging to the home page were cached for 15 minutes, and queries belonging to stories (posts) were cached for 30 minutes.

We varied the number of parallel iterations. Each test ran for an hour.

Results for direct database connection

Threads: 1, Iterations: 180, Time[sec]: 3605
   Threads: 2, Iterations: 356, Time[sec]: 3616
   Threads: 4, Iterations: 780, Time[sec]: 3618
   Threads: 8, Iterations: 1408, Time[sec]: 3614
   Threads: 16, Iterations: 2144, Time[sec]: 3619
   Threads: 32, Iterations: 2432, Time[sec]: 3646
   Threads: 64, Iterations: 2368, Time[sec]: 3635
   Threads: 128, Iterations: 2432, Time[sec]: 3722

The result above is the summary output of the script we used. The data shows we reach peak capacity at 32 concurrent threads.

Results for connecting through ScaleArc

Threads: 1, Iterations: 171, Time[sec]: 3604
   Threads: 2, Iterations: 342, Time[sec]: 3606
   Threads: 4, Iterations: 740, Time[sec]: 3619
   Threads: 8, Iterations: 1304, Time[sec]: 3609
   Threads: 16, Iterations: 2048, Time[sec]: 3625
   Threads: 32, Iterations: 2336, Time[sec]: 3638
   Threads: 64, Iterations: 2304, Time[sec]: 3678
   Threads: 128, Iterations: 2304, Time[sec]: 3675

The results are almost identical. Because a typical query in this example is quite expensive, the overhead of ScaleArc here is barely measurable.

Results for connecting through ScaleArc with caching enabled

Threads: 1, Iterations: 437, Time[sec]: 3601
   Threads: 2, Iterations: 886, Time[sec]: 3604
   Threads: 4, Iterations: 1788, Time[sec]: 3605
   Threads: 8, Iterations: 3336, Time[sec]: 3600
   Threads: 16, Iterations: 6880, Time[sec]: 3606
   Threads: 32, Iterations: 8832, Time[sec]: 3600
   Threads: 64, Iterations: 9024, Time[sec]: 3614
   Threads: 128, Iterations: 8576, Time[sec]: 3630

Caching improved response time even for a single thread. At 32 threads, we see more than 3.5x improvement in throughput. Caching is a great help here for the same reason the overhead is barely measurable: the queries are more expensive in general, so more resources are spared when they are not run.

Throughput
From the web server’s access log, we created a per-second throughput graph. We are talking about requests per second here. Please note that the variance is relatively high, because the requests are not identical – retrieving the main page is a different request and has a different cost then retrieving a story page.

throughput

The red and blue dots are literally plotted on top of each other – the green bar is always on top of them. The green ones have a greater variance because even though we had caching enabled during the test, we used more realistic TTLs in this cache, so cached items did actually expire during the test. When the cache was expired, requests took longer, so the throughput was lower. When the cache was populated, requests took a shorter amount of time, so the throughput was higher.

CPU utilization

cpu_util

CPU utilization characteristics are pretty much the same on the left and right sides (direct connection on the left and ScaleArc without caching on the right). In the middle, we can see that the web server’s CPU gets completely utilized sooner with caching. Because data comes faster from the cache, it serves more requests, which costs more resources computationally. On the other hand, the database server’s CPU utilization is significantly lower when caching is used. The bar is on the top on the left and right sides – in the middle, we have bars both at the top and at the bottom. The test is utilizing the database server’s CPU completely only when we hit cache misses.

Because ScaleArc serves the cache hits, and these requests are not hitting the database, the database is not used at all when requests are served from the cache. In the case of tests with caching on, the bottleneck became the web server, which is a component that is a lot easier to scale than the database.

There are two more key points to take away here. First, regardless of whether caching is turned on or off, this workload is not too much for ScaleArc. Second, the client we ran the measurement scripts on was not the bottleneck.

Conclusion
The goal of these benchmarks was to show that ScaleArc has very little overhead and that caching can be beneficial for a real-world application, which has a “read mostly” workload with relatively expensive reads (expensive means that network round trip is not a significant contributor in the read’s response time). A blog is exactly that type – typically, more people are visiting than commenting. The test showed that ScaleArc is capable of supporting this scenario well, delivering 3.5x throughput at peak capacity. It’s worth mentioning that if this system needs to be scaled, more web servers could be added as well as more read slaves. Those read slaves can take up read queries by a WordPress plugin which allows this, or by ScaleArc’s read-write splitting facility (it threats autocommit selects as reads), in the later case, the caching benefit is present for the slaves as well.

The post ScaleArc: Real-world application testing with WordPress (benchmark test) appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com