Sep
20
2021
--

Fivetran hauls in $565M on $5.6B valuation, acquires competitor HVR for $700M

Fivetran, the data connectivity startup, had a big day today. For starters it announced a $565 million investment on a $5.6 billion valuation, but it didn’t stop there. It also announced its second acquisition this year, snagging HVR, a data integration competitor that had raised more than $50 million, for $700 million in cash and stock.

The company last raised a $100 million Series C on a $1.2 billion valuation, increasing the valuation by nearly 5x. As with that Series C, Andreessen Horowitz was back leading the round, with participation from other double dippers General Catalyst, CEAS Investments, Matrix Partners and other unnamed firms or individuals. New investors ICONIQ Capital, D1 Capital Partners and YC Continuity also came along for the ride. The company reports it has now raised $730 million.

The HVR acquisition represents a hefty investment for the startup, grabbing a company for a price that is almost equal to all the money it has raised to date, but it provides a way to expand its market quickly by buying a competitor. Earlier this year Fivetran acquired Teleport Data as it continues to add functionality and customers via acquisition.

“The acquisition — a cash and stock deal valued at $700 million — strengthens Fivetran’s market position as one of the data integration leaders for all industries and all customer types,” the company said in a statement.

While that may smack of corporate marketing-speak, there is some truth to it, as pulling data from multiple sources, sometimes in siloed legacy systems, is a huge challenge for companies, and both Fivetran and HVR have developed tools to provide the pipes to connect various data sources and put it to work across a business.

Data is central to a number of modern enterprise practices, including customer experience management, which takes advantage of customer data to deliver customized experiences based on what you know about them, and data is the main fuel for machine learning models, which use it to understand and learn how a process works. Fivetran and HVR provide the nuts and bolts infrastructure to move the data around to where it’s needed, connecting to various applications like Salesforce, Box or Airtable, databases like PostgreSQL or data repositories like Snowflake or Databricks.

Whether bigger is better remains to be seen, but Fivetran is betting that it will be in this case as it makes its way along the startup journey. The transaction has been approved by both companies’ boards. The deal is still subject to standard regulatory approval, but Fivetran is expecting it to close in October.

Sep
17
2021
--

Zoom looks beyond video conferencing as triple-digit 2020 growth begins to slow

It’s been a heady 12-18 months for Zoom, the decade-old company that experienced monster 2020 growth and more recently, a mega acquisition with the $14.7 billion Five9 deal in July. That addition is part of a broader strategy the company has been undertaking the last couple of years to move beyond its core video conferencing market into adjacencies like phone, meeting management and messaging, among other things. Here’s a closer look at how the plan is unfolding.

As the pandemic took hold in March 2020, everyone from businesses to schools to doctors and places of worship moved online. As they did, Zoom video conferencing became central to this cultural shift and the revenue began pouring in, ushering in a period of sustained triple-digit growth for the company that only recently abated.

Sep
17
2021
--

Migration of a MySQL Database to a Kubernetes Cluster Using Asynchronous Replication


Nowadays, more and more companies are thinking about migrating their infrastructure to Kubernetes, and databases are no exception. A number of k8s operators have been created to simplify different types of deployments and to handle routine day-to-day tasks such as taking backups, renewing certificates, and so on. If a few years ago nobody even wanted to hear about running databases in Kubernetes, everything has changed now.

At Percona, we have created feature-rich k8s operators for Percona Server for MongoDB, PostgreSQL, and MySQL databases. Today we will talk about cross-site replication, a new feature that was added in the latest release of Percona Distribution for MySQL Operator. This feature is based on the asynchronous replication connection failover mechanism.
Cross-site replication involves configuring one Percona XtraDB Cluster, or one or several MySQL servers, as the Source, and another Percona XtraDB Cluster (PXC) as a replica, to allow asynchronous replication between them. If the operator has several sources listed in the custom resource (CR), it will automatically handle a connection failure of the source DB.
Cross-site replication is supported only for MySQL 8.0.23 and later, but you can read about migrating earlier MySQL versions in this blog post.

The Goal

Migrate a MySQL database deployed on-prem or in the cloud to a cluster managed by the Percona Distribution for MySQL Operator, using asynchronous replication. This approach helps you reduce downtime and avoid data loss for your application.

So, we have the following setup:

Migration of MySQL database to Kubernetes cluster using asynchronous replication

The following components are used:

1. A MySQL 8.0.23 database (in my case, Percona Server for MySQL) deployed in DigitalOcean (DO) as the Source, with Percona XtraBackup for the backup. In my test deployment, I use only one server as the Source to simplify the setup. Depending on your database topology, you can use several servers and rely on the asynchronous connection failover mechanism on the operator’s end.

2. Google Kubernetes Engine (GKE) cluster where Percona Distribution for MySQL Operator is deployed with PXC cluster (as a target).

3. AWS S3 bucket is used to save the backup from MySQL DB and then to restore the PXC cluster in k8s.

The following steps should be done to perform the migration procedure:

1. Make the MySQL database backup using Percona XtraBackup and upload it to the S3 bucket using xbcloud.

2. Perform the restore of the MySQL database from the S3 bucket into the PXC cluster which is deployed in k8s.

3. Configure asynchronous replication between MySQL server and PXC cluster managed by k8s operator.

As a result, we have asynchronous replication between MySQL server and PXC cluster in k8s which is in read-only mode.

Migration

Configure the target PXC cluster managed by k8s operator:

1. Deploy Percona Distribution for MySQL Operator on Kubernetes (I have used GKE 1.20).

# clone the git repository
git clone -b v1.9.0 https://github.com/percona/percona-xtradb-cluster-operator
cd percona-xtradb-cluster-operator

# deploy the operator
kubectl apply -f deploy/bundle.yaml

2. Create PXC cluster using the default custom resource manifest (CR).

# create the my-cluster-secrets secret (do not use the default passwords for production systems)
kubectl apply -f deploy/secrets.yaml

# create the cluster; by default it will be PXC 8.0.23
kubectl apply -f deploy/cr.yaml

3. Create the secret with credentials for the AWS S3 bucket which will be used for access to the S3 bucket during the restoration procedure.

# create an S3-secret.yaml file with the following content, using your real credentials (base64-encoded) instead of XXXXXX

apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-secret
type: Opaque
data:
  AWS_ACCESS_KEY_ID: XXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXX

# create secret
kubectl apply -f S3-secret.yaml

Configure the Source MySQL Server

1. Install Percona Server for MySQL 8.0.23 and Percona XtraBackup for the backup. Refer to the Installing Percona Server for MySQL and Installing Percona XtraBackup chapters in the documentation for installation instructions.


NOTE:
You need to add the following options to my.cnf to enable GTID support; otherwise, replication will not work, because GTID-based replication is used by the PXC cluster by default.

[mysqld]
enforce_gtid_consistency=ON
gtid_mode=ON
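
After restarting mysqld, you can quickly confirm that the settings took effect (a minimal sanity check; adjust credentials to your environment):

# verify that GTID support is active on the source
mysql -uroot -p -e "SELECT @@gtid_mode, @@enforce_gtid_consistency;"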

2. Create all the users that will be used by the k8s operator. The passwords should be the same as in deploy/secrets.yaml. Also, please note that the password for the root user should match the one in the deploy/secrets.yaml file used for the k8s secret. In my case, I used our default passwords from deploy/secrets.yaml.

CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitory' WITH MAX_USER_CONNECTIONS 100;
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'monitor'@'%';
GRANT SERVICE_CONNECTION_ADMIN ON *.* TO 'monitor'@'%';

CREATE USER 'operator'@'%' IDENTIFIED BY 'operatoradmin';
GRANT ALL ON *.* TO 'operator'@'%' WITH GRANT OPTION;

CREATE USER 'xtrabackup'@'%' IDENTIFIED BY 'backup_password';
GRANT ALL ON *.* TO 'xtrabackup'@'%';

CREATE USER 'replication'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* to 'replication'@'%';
FLUSH PRIVILEGES;

3. Make a backup of the MySQL database using the XtraBackup tool and upload it to the S3 bucket.

# export aws credentials
export AWS_ACCESS_KEY_ID=XXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXX

# make the backup
xtrabackup --backup --stream=xbstream --target-dir=/tmp/backups/ --extra-lsndir=/tmp/backups/ --password=root_password | xbcloud put --storage=s3 --parallel=10 --md5 --s3-bucket="mysql-testing-bucket" "db-test-1"

Now, everything is ready to perform the restore of the backup on the target database. So, let’s get back to our k8s cluster.

Configure the Asynchronous Replication to the Target PXC Cluster

If you have a completely clean source database (without any data), you can skip the steps related to backing up and restoring the database. Otherwise, do the following:

1. Restore the backup from the S3 bucket using the following manifest:

# create a restore.yml file with the following content

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: restore1
spec:
  pxcCluster: cluster1
  backupSource:
    destination: s3://mysql-testing-bucket/db-test-1
    s3:
      credentialsSecret: aws-s3-secret
      region: us-east-1

# trigger the restoration procedure
kubectl apply -f restore.yml
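
Before moving on, it can be useful to confirm that the restore actually finished. Here is a minimal check, assuming the restore object name restore1 and cluster name cluster1 from the manifests above (output columns may vary slightly between operator versions):

# the restore object should eventually report a succeeded state
kubectl get perconaxtradbclusterrestore restore1

# the PXC pods should be back to Running/Ready before configuring replication
kubectl get pods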

As a result, you will have a PXC cluster with data from the source DB. Now everything is ready to configure the replication.

2. Edit the custom resource manifest deploy/cr.yaml to configure the spec.pxc.replicationChannels section.

spec:
  ...
  pxc:
    ...
    replicationChannels:
    - name: ps_to_pxc1
      isSource: false
      sourcesList:
        - host: <source_ip>
          port: 3306
          weight: 100

# apply CR
kubectl apply -f deploy/cr.yaml


Verify the Replication 

In order to check the replication, you need to connect to the PXC node and run the following command:

kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "show replica status \G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: <ip-of-source-db>
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000004
          Read_Master_Log_Pos: 529
               Relay_Log_File: cluster1-pxc-0-relay-bin-ps_to_pxc1.000002
                Relay_Log_Pos: 738
        Relay_Master_Log_File: binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 529
              Relay_Log_Space: 969
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9741945e-148d-11ec-89e9-5ee1a3cf433f
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 3
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
            Executed_Gtid_Set: 93f1e7bf-1495-11ec-80b2-06e6016a7c3d:1,
9647dc03-1495-11ec-a385-7e3b2511dacb:1-7,
9741945e-148d-11ec-89e9-5ee1a3cf433f:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: ps_to_pxc1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:

Also, you can verify the replication by checking that the data is changing.
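
For example, here is a minimal smoke test; the repl_check database and table below are made-up names used only for illustration:

# on the source MySQL server: create a throwaway table and insert a row
mysql -uroot -p -e "CREATE DATABASE IF NOT EXISTS repl_check; CREATE TABLE IF NOT EXISTS repl_check.t (id INT PRIMARY KEY); INSERT INTO repl_check.t VALUES (1);"

# on the PXC replica in k8s: the row should appear within a few seconds
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SELECT * FROM repl_check.t;"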

Promote the PXC Cluster as a Primary

As soon as you are ready to stop the replication and promote the PXC cluster in k8s to be the primary DB (i.e., your application has been reconfigured and is ready to work with the new DB), you need to perform the following simple actions:

1. Edit deploy/cr.yaml and comment out the replicationChannels section:

spec:
  ...
  pxc:
    ...
    #replicationChannels:
    #- name: ps_to_pxc1
    #  isSource: false
    #  sourcesList:
    #    - host: <source_ip>
    #      port: 3306
    #      weight: 100

2. Stop mysqld service on the source server to be sure that no new data is written.

 systemctl stop mysqld

3. Apply a new CR for k8s operator.

# apply CR
kubectl apply -f deploy/cr.yaml

As a result, replication is stopped and the read-only mode is disabled for the PXC cluster.
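
You can confirm the promotion with a quick check that the replication channel is gone and the node accepts writes (same credentials as in the verification step above):

# replica status should now be empty
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "show replica status \G"

# read_only should be OFF (0) so the application can write to the PXC cluster
kubectl exec -it cluster1-pxc-0 -c pxc -- mysql -uroot -proot_password -e "SELECT @@read_only, @@super_read_only;"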

Conclusion

Technologies are changing so fast that a migration to a k8s cluster, which seems very complex at first sight, turns out to be neither that difficult nor that time-consuming. But keep in mind that significant changes are involved: you are migrating the database to a PXC cluster, which has its own peculiarities, and, of course, to Kubernetes itself. If you are ready, you can start the journey to Kubernetes right now.

The Percona team is ready to guide you during this journey. If you have any questions,  please raise the topic in the community forum.

The Percona Kubernetes Operators automate the creation, alteration, or deletion of members in your Percona Distribution for MySQL, MongoDB, or PostgreSQL environment.

Learn More About Percona Kubernetes Operators

Sep
16
2021
--

Mirantis launches cloud-native data center-as-a-service software

Mirantis has been around the block, starting way back as an OpenStack startup, but a few years ago the company began to embrace cloud-native development technologies like containers, microservices and Kubernetes. Today, it announced Mirantis Flow, a fully managed open source set of services designed to help companies manage a cloud-native data center environment, whether your infrastructure lives on-prem or in a public cloud.

“We’re about delivering to customers an open source-based cloud-to-cloud experience in the data center, on the edge, and interoperable with public clouds,” Adrian Ionel, CEO and co-founder at Mirantis explained.

He points out that the biggest companies in the world, the hyperscalers like Facebook, Netflix and Apple, have all figured out how to manage in a hybrid cloud-native world, but most companies lack the resources of these large organizations. Mirantis Flow is aimed at putting these same types of capabilities that the big companies have inside these more modest organizations.

While the large infrastructure cloud vendors like Amazon, Microsoft and Google have been designed to help with this very problem, Ionel says that these tend to be less open and more proprietary. That can lead to lock-in, which today’s large organizations are looking desperately to avoid.

“[The large infrastructure vendors] will lock you into their stack and their APIs. They’re not based on open source standards or technology, so you are locked in your single source, and most large enterprises today are pursuing a multi-cloud strategy. They want infrastructure flexibility,” he said. He added, “The idea here is to provide a completely open and flexible zero lock-in alternative to the [big infrastructure providers, but with the] same cloud experience and same pace of innovation.”

They do this by putting together a stack of open source solutions in a single service. “We provide virtualization on top as part of the same fabric. We also provide software-defined networking, software-defined storage and CI/CD technology with DevOps as a service on top of it, which enables companies to automate the entire software development pipeline,” he said.

As the company describes the service in a blog post published today, it includes “Mirantis Container Cloud, Mirantis OpenStack and Mirantis Kubernetes Engine, all workloads are available for migration to cloud native infrastructure, whether they are traditional virtual machine workloads or containerized workloads.”

For companies worried about migrating their VMware virtual machines to this solution, Ionel says early customers have already been able to move these VMs to the Mirantis solution. “This is a very, very simple conversion of the virtual machine from VMware standard to an open standard, and there is no reason why any application and any workload should not run on this infrastructure — and we’ve seen it over and over again in many many customers. So we don’t see any bottlenecks whatsoever for people to move right away,” he said.

It’s important to note that this solution does not include hardware. It’s about bringing your own hardware infrastructure, either physical or as a service, or using a Mirantis partner like Equinix. The service is available now for $15,000 per month or $180,000 annually, which includes: 1,000 core/vCPU licenses for access to all products in the Mirantis software suite plus support for 20 virtual machine (VM) migrations or application onboarding and unlimited 24×7 support. The company does not charge any additional fees for control plane and management software licenses.

Sep
16
2021
--

Confluent CEO Jay Kreps is coming to TC Sessions: SaaS for a fireside chat

As companies process ever-increasing amounts of data, moving it in real time is a huge challenge for organizations. Confluent is a streaming data platform built on top of the open source Apache Kafka project that’s been designed to process massive numbers of events. To discuss this, and more, Confluent CEO and co-founder Jay Kreps will be joining us at TC Sessions: SaaS on Oct 27th for a fireside chat.

Data is a big part of the story we are telling at the SaaS event, as it has such a critical role in every business. Kreps has said in the past that data streams are at the core of every business, from sales to orders to customer experiences. As he wrote in a company blog post announcing the company’s $250 million Series E in April 2020, Confluent is working to process all of this data in real time — and that was a big reason why investors were willing to pour so much money into the company.

“The reason is simple: though new data technologies come and go, event streaming is emerging as a major new category that is on a path to be as important and foundational in the architecture of a modern digital company as databases have been,” Kreps wrote at the time.

The company’s streaming data platform takes a multi-faceted approach to streaming and builds on the open source Kafka project. While anyone can download and use Kafka, as with many open source projects, companies may lack the resources or expertise to deal with the raw open source code. Many a startup has been built on open source to help simplify whatever the project does, and Confluent and Kafka are no different.

Kreps told us in 2017 that companies using Kafka as a core technology include Netflix, Uber, Cisco and Goldman Sachs. But those companies have the resources to manage complex software like this. Mere mortal companies can pay Confluent to access a managed cloud version or they can manage it themselves and install it in the cloud infrastructure provider of choice.

The project was actually born at LinkedIn in 2011 when their engineers were tasked with building a tool to process the enormous number of events flowing through the platform. The company eventually open sourced the technology it had created and Apache Kafka was born.

Confluent launched in 2014 and raised over $450 million along the way. In its last private round in April 2020, the company scored a $4.5 billion valuation on a $250 million investment. As of today, it has a market cap of over $17 billion.

In addition to our discussion with Kreps, the conference will also include Google’s Javier Soltero, Amplitude’s Olivia Rose, as well as investors Kobie Fuller and Casey Aylward, among others. We hope you’ll join us. It’s going to be a thought-provoking lineup.

Buy your pass now to save up to $100 when you book by October 1. We can’t wait to see you in October!


Sep
16
2021
--

Fiberplane nabs €7.5M seed to bring Google Docs-like collaboration to incident response

Fiberplane, an Amsterdam-based early-stage startup that is building collaborative notebooks for SREs (site reliability engineers) to collaborate around an incident in a similar manner to group editing in a Google Doc, announced a €7.5 million (approximately $8.8 million USD) seed round today.

The round was co-led by Crane Venture Partners and Notion Capital, with participation from Northzone, System.One and Basecase Capital.

Micha Hernandez van Leuffen (known as Mies) is founder and CEO at Fiberplane. When his previous startup, Wercker, was sold to Oracle in 2017, Hernandez van Leuffen became part of a much larger company where he saw people struggling to deal with outages (which happen at every company).

“We were always going back and forth between metrics, logs and traces, what I always call this sort of treasure hunt, and figuring out what was the underlying root cause of an outage or downtime,” Hernandez van Leuffen told me.

He said that this experience led to a couple of key insights about incident response: First, you needed a centralized place to pull all the incident data together, and secondly that as a distributed team managing a distributed system you needed to collaborate in real time, often across different time zones.

When he left Oracle in August 2020, he began thinking about the idea of giving DevOps teams and SREs the same kind of group editing capabilities that other teams inside an organization have with tools like Google Docs or Notion and an idea for his new company began to take shape.

What he created with Fiberplane is a collaborative notebook for SREs to pull in the various data types and begin to work together to resolve the incident, while having a natural audit trail of what happened and how they resolved the issue. Different people can participate in this notebook, just as multiple people can edit a Google Doc, fulfilling that original vision.

Fiberplane incident response notebook with various types of data about the incident.

Fiberplane collaborative notebook example with multiple people involved. Image Credit: Fiberplane

He doesn’t plan to stop there though. The longer-term vision is an operational platform for SREs and DevOps teams to deal with every aspect of an outage. “This is our starting point, but we are planning to expand from there as more I would say an SRE workbench, where you’re also able to command and control your infrastructure,” he said.

Today the company has 13 employees and is growing, and as they do, they are exploring ways to make sure they are building a diverse company, looking at concrete strategies to find more diverse candidates.

“To hire diversely, we’re re-examining our top of the funnel processes. Our efforts include posting our jobs in communities of underrepresented people, running our job descriptions through a gender decoder and facilitating a larger time frame for jobs to remain open,” Elena Boroda, marketing manager at Fiberplane said.

While Hernandez van Leuffen is based in Amsterdam, the company has been hiring people in the U.K., Berlin, Copenhagen and the U.S., he said. The plan is to have Amsterdam as a central hub when offices reopen as the majority of employees are located there.

Sep
16
2021
--

Salesforce announces new MuleSoft RPA tool based on Servicetrace acquisition

When Salesforce announced it was buying German RPA vendor Servicetrace last month, it seemed that it might match up well with MuleSoft, the company the CRM giant bought in 2018 for $6.5 billion. MuleSoft, among other things, helps customers build APIs to legacy systems, while Servicetrace provides a way to add automation to legacy systems. Sure enough, the company announced today that it’s planning a new MuleSoft-Servicetrace tool called MuleSoft RPA.

The Servicetrace deal closed on September 2nd and the company isn’t wasting any time putting it to work wherever it makes sense across the organization — and the MuleSoft integration is a primary use case. John Kucera, SVP of product management at Salesforce where he leads product automation, says that MuleSoft has API management and integration tooling already, but RPA will add another dimension to those existing capabilities.

“We found that many of our customers also need to automate and integrate with disconnected systems, with PDFs, with spreadsheets, but also these legacy systems that don’t have events or API’s. And so we wanted to make sure that we can meet our customers where they are, and that we could have this end-to-end, solution to automate these capabilities,” Kucera told me.

The company will be packaging Servicetrace as a part of MuleSoft, while blending it with other parts of the Salesforce family of integration tools, as well as other parts of the platform. The MuleSoft RPA tool will live under the Einstein Automate umbrella, but MuleSoft will also sell it as a standalone service, so customers can take advantage of it, even if they aren’t using other parts of the MuleSoft platform or even the broader Salesforce platform. Einstein is the name of Salesforce’s artificial intelligence capabilities. Although RPA isn’t really AI, it can become integrated into an AI-fueled workflow like this.

The MuleSoft acquisition always seemed to be about giving Salesforce, a fully cloud company at its core, a way to access on-prem, legacy enterprise systems, allowing customers to reach data wherever it lives. Adding RPA to the mix takes that a step further, enabling companies to build connections to these systems inside their more modern Einstein Automate workflow tooling to systems that previously wouldn’t have been accessible to the Einstein Automate system.

This is often the case for many large companies, which typically use a mix of newer and often very old systems. Giving them a way to link the two and bring automation across the company could prove quite useful if it truly works as described.

The company is announcing all of these capabilities at Dreamforce, its annual customer conference taking place next week. As with many announcements at the conference, this one is designed to let customers know what’s coming, rather than something that’s available now (or at least soon). MuleSoft RPA is not expected to be ready for general availability until some time in the first half of next year.

Sep
15
2021
--

Ascend raises $5.5M to provide a BNPL option for commercial insurance

Ascend on Wednesday announced a $5.5 million seed round to further its insurance payments platform that combines financing, collections and payables.

First Round Capital led the round and was joined by Susa Ventures, FirstMark Capital, Box Group and a group of angel investors, including Coalition CEO Joshua Motta, Newfront Insurance executives Spike Lipkin and Gordon Wintrob, Vouch Insurance CEO Sam Hodges, Layr Insurance CEO Phillip Naples, Anzen Insurance CEO Max Bruner, Counterpart Insurance CEO Tanner Hackett, former Bunker Insurance CEO Chad Nitschke, SageSure executive Paul VanderMarck, Instacart co-founders Max Mullen and Brandon Leonardo and Houseparty co-founder Ben Rubin.

This is the first funding for the company, which is live in 20 states. It developed payments APIs to automate end-to-end insurance payments and to offer a buy now, pay later financing option for distribution of commissions and carrier payables, something co-founder and co-CEO Andrew Wynn said was rather unique to commercial insurance.

Wynn started the company in January 2021 with his co-founder Praveen Chekuri after working together at Instacart. They originally started Sheltr, which connected customers with trained maintenance professionals and was acquired by Hippo in 2019. While working with insurance companies they recognized how fast the insurance industry was modernizing, yet insurance sellers still struggled with customer experiences due to outdated payments processes. They started Ascend to solve that payments pain point.

The insurance industry is largely still operating on pen-and-paper — some 600 million paper checks are processed each year, Wynn said. He referred to insurance as a “spaghetti web of money movement” where payments can take up to 100 days to get to the insurance carrier from the customer as it makes its way through intermediaries. In addition, one of the only ways insurance companies can make a profit is by taking those hundreds of millions of dollars in payments and investing it.

Home and auto insurance can be broken up into payments, but the commercial side is not as customer friendly, Wynn said. Insurance is often paid in one lump sum annually, though paying tens of thousands of dollars in one payment is not something every business customer can manage. Ascend is offering point-of-sale financing to enable insurance brokers to break up those commercial payments into monthly installments.

“Insurance carriers continue to focus on annual payments because they don’t have a choice,” he added. “They want all of their money up front so they can invest it. Our platform not only reduces the friction with payments by enabling customers to pay how they want to pay, but also helps carriers sell more insurance.”

Ascend app

Startups like Ascend aiming to disrupt the insurance industry are also attracting venture capital, with recent examples including Vouch and Marshmallow, which raised close to $100 million, while Insurify raised $100 million.

Wynn sees other companies doing verticalized payment software for other industries, like healthcare insurance, which he says is a “good sign for where the market is going.” This is where Wynn believes Ascend is competing, though some incumbents are offering premium financing, but not in the digital way Ascend is.

He intends to deploy the new funds into product development, go-to-market initiatives and new hires for its locations in New York and Palo Alto. He said the raise attracted a group of angel investors in the industry, who were looking for a product like this to help them sell more insurance versus building it from scratch.

Having only been around eight months, it is a bit early for Ascend to have some growth to discuss, but Wynn said the company signed its first customer in July and six more in the past month. The customers are big digital insurance brokerages and represent, together, $2.5 billion in premiums. He also expects to get licensed to operate as a full payment processor in every state so the company can be in all 50 states by the end of the year.

The ultimate goal of the company is not to replace brokers, but to offer them the technology to be more efficient with their operations, Wynn said.

“Brokers are here to stay,” he added. “What will happen is that brokers who are tech-enabled will be able to serve customers nationally and run their business, collect payments, finance premiums and reduce backend operation friction.”

Bill Trenchard, partner at First Round Capital, met Wynn while he was still with Sheltr. He believes insurtech and fintech are following a similar story arc where disruptive companies are going to market with lower friction and better products and, being digital-first, are able to meet customers where they are.

By bringing digital payments to insurance, Ascend and others will lead the market, which is so big that there will be many opportunities for companies to be successful. The global commercial insurance market was valued at $692.33 billion in 2020 and is expected to top $1 trillion by 2028.

Like other firms, First Round looks for team, product and market when it evaluates a potential investment and Trenchard said Ascend checked off those boxes. Not only did he like how quickly the team was moving to create momentum around themselves in terms of securing early pilots with customers, but also getting well known digital-first companies on board.

“The magic is in how to automate the underwriting, how to create a data moat and be a first mover — if you can do all three, that is great,” Trenchard said. “Instant approvals and using data to do a better job than others is a key advantage and is going to change how insurance is bought and sold.”

Sep
14
2021
--

AgBiome lands $116M for safer crop protection technology

AgBiome, developing products from microbial communities, brought in a $116 million Series D round as the company prepares to pad its pipeline with new products.

The company, based in Research Triangle Park, N.C., was co-founded in 2012 by a group including co-CEOs Scott Uknes and Eric Ward, who have known each other for over 30 years. They created the Genesis discovery platform to capture diverse microbes for agricultural applications, like crop protection, and screen the strains for the best assays that would work for insect, disease and nematode control.

“The microbial world is immense,” said Uknes, who explained that there are estimated to be a trillion microbes, but only 1% have been discovered. The microbes already discovered are used by humans for things like pharmaceuticals, food and agriculture. AgBiome built its Genesis database to house over 100,000 microbes, and every genome in every microbe was sequenced into hundreds of strains.

The company randomly selects strains and looks for the best family of strains with a certain activity, like preventing fungus on strawberries, and creates the product.

AgBiome co-CEOs Scott Uknes and Eric Ward. Image Credits: AgBiome

Its first fungicide product, Howler, was launched last year and works on more than 300 crop-disease combinations. The company saw 10x sales growth in 2020, Uknes told TechCrunch. As part of farmers’ integrated pest program, they often spray fungicide applications 12 times per year in order to yield fruits and vegetables.

Due to its safer formula, Howler can be used as the last spray in the program, and its differentiator is a shorter re-entry period — farmers can spray in the morning and be able to go back out in the field in the afternoon. It also has a shorter pre-harvest time of four hours after application. Other fungicides on the market today require seven days before re-entry and pre-harvest, Uknes explained.

AgBiome aims to add a second fungicide product, Theia, in early 2022, while a third, Esendo, was submitted for Environmental Protection Agency registration. Uknes expects to have 11 products, also expanding into insecticides and herbicides, by 2025.

The oversubscribed Series D round was co-led by Blue Horizon and Novalis LifeSciences and included multiple new and existing investors. The latest investment gives AgBiome over $200 million in total funding to date. The company’s last funding round was a $65 million Series C raised in 2018.

While competitors in synthetic biology often sell their companies to someone who can manufacture their products, Uknes said AgBiome decided to manufacture and commercialize the products itself, something he is proud of his team for being able to do.

“We want to feed the world responsibly, and these products have the ability to substitute for synthetic chemicals and provide growers a way to protect their crops, especially as consumers want natural, sustainable tools,” he added.

The company has grown to over 100 employees and will use the new funding to accelerate production of its two new products, building out its manufacturing capacity in North America and expanding its footprint internationally. Uknes anticipates growing its employee headcount to 300 in the next five years.

AgBiome anticipates rolling up some smaller companies that have a product in production to expand its pipeline in addition to its organic growth. As a result, Uknes said he was particular about the kind of investment partners that would work best toward that goal.

Przemek Obloj, managing partner at Blue Horizon, was introduced to the company by existing investors. His firm has an impact fund focused on the future of food and began investing in alternative proteins in 2016 before expanding that to delivery systems in agriculture technology, he said.

Obloj said AgBiome is operating in a $60 billion market where the problems include products that put toxic chemicals into the ground that end up in water systems. While the solution would be to not do that, not doing that would mean produce doesn’t grow as well, he added.

The change in technology in agriculture is enabling Uknes and Ward to do something that wasn’t possible 10 years ago because there was not enough compute or storage power to discover and sequence microbes.

“We don’t want to pollute the Earth, but we have to find a way to feed 9 billion people by 2050,” Obloj said. “With AgBiome, there is an alternative way to protect crops than by polluting the Earth or having health risks.”

Sep
13
2021
--

MySQL/ZFS in the Cloud, Leveraging Ephemeral Storage


Here’s a second post focusing on the performance of MySQL on ZFS in cloud environments. In the first post, MySQL/ZFS Performance Update, we compared the performance of ZFS and ext4. This time we’ll look at the benefits of using ephemeral storage devices. These devices, called ephemeral in AWS, local in Google Cloud, and temporary in Azure, are provided directly by the virtualization host. They are not network-attached and are not IO throttled, at least compared to regular storage. Not only can they handle a high number of IOPS, but their IO latency is also very low. For simplicity, we’ll call these devices local ephemeral. They can be quite large: Azure Lsv2, Google Cloud n2, and AWS i3 instance types offer TBs of fast NVMe local ephemeral storage.

The main drawback of local ephemeral devices is the loss of all data if the VM is terminated. For that reason, the usage of local ephemeral devices with databases like MySQL is limited. Typical use cases are temporary reporting servers and Percona XtraDB Cluster (PXC)/Galera cluster nodes. PXC is a bit of a special case here: the well-polished and automated full state transfer of Galera overcomes the issue of having to reload the dataset when a cluster node is recycled. Thanks to data compression, much more data can be stored on an ephemeral device; in fact, our TPCC dataset fits on the 75GB of temporary storage when compressed. Under such circumstances, the TPCC performance is stellar, as shown below.


TPCC results using ZFS on an ephemeral device

On the local ephemeral device, the TPCC transaction rate is much higher, hovering close to 200 per minute. The ZFS results on the regular SSD Premium are included as a reference. The transaction rate during the last hour was around 50 per minute. Essentially, with the use of the local ephemeral device, the load goes from IO-bound to CPU-bound.

Of course, it is not always possible to only use ephemeral devices. We’ll now explore a use case for an ephemeral device, as a caching device for the filesystem, using the ZFS L2ARC.

What is the ZFS L2ARC?

Like all filesystems, ZFS has a memory cache, called the ARC, to avoid disk IOPS when retrieving frequently used pieces of data. The ZFS ARC has a few additional tricks up its sleeve. First, when data compression is used on the filesystem, the compressed form is stored in the ARC. This allows more data to be cached. The second ZFS trick is the ability to connect the ARC LRU eviction to a fast storage device, the L2ARC. L2 stands for “Level 2”, a bit like the leveled caches of CPUs.

Essentially, the ZFS ARC is a level 1 cache, and records evicted from it can be inserted into a level 2 cache, the L2ARC. For the L2ARC to be efficient, the device used must have a low latency and be able to perform a high number of IOPs. Those are characteristics of cloud ephemeral devices.

Configuration for the L2ARC

The ZFS L2ARC has many tunables, and many of these were inherited from the recent past, when flash devices were much slower for writes than for reads. So, let’s start at the beginning; here is how we add an L2ARC to the ZFS pool bench using the local ephemeral device /dev/sdb:

# zpool add bench cache /dev/sdb

Then, the cache device appears in the zpool:

# zpool status
       pool: bench
      state: ONLINE
     config:
         NAME      STATE  READ WRITE CKSUM
         bench     ONLINE        0      0      0
           sdc     ONLINE        0      0      0
         cache
           sdb     ONLINE        0      0      0

Once the L2ARC is created, if we want data in it, we must start storing data in the ARC with:

# zfs set primarycache=all bench/data

This is all that is needed to get data flowing to the L2ARC, but the default parameters controlling the L2ARC have conservative values and it can be quite slow to warm up the L2ARC. In order to improve the L2ARC performance, I modified the following kernel module parameters:

l2arc_headroom=4
l2arc_write_boost=134217728
l2arc_write_max=67108864
zfs_arc_max=4294967296

Essentially, I am boosting the ingestion rate of the L2ARC. I am also slightly increasing the size of the ARC, because the pointers to the L2ARC data are kept in the ARC. If you don’t use a large enough ARC, you won’t be able to add data to the L2ARC. That ceiling frustrated me a few times until I realized that the entry l2_hdr_size in /proc/spl/kstat/zfs/arcstats reports the space these pointers consume in the metadata section of the ARC. The ARC must be large enough to accommodate the L2ARC pointers.
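
For reference, here is how I would apply such values on ZFS on Linux; parameters can be changed at runtime through /sys/module/zfs/parameters and made persistent with a modprobe configuration file (exact paths may vary with your distribution):

# apply the tunables at runtime
echo 4          > /sys/module/zfs/parameters/l2arc_headroom
echo 134217728  > /sys/module/zfs/parameters/l2arc_write_boost
echo 67108864   > /sys/module/zfs/parameters/l2arc_write_max
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max

# make them persistent across reboots
cat <<EOF > /etc/modprobe.d/zfs.conf
options zfs l2arc_headroom=4 l2arc_write_boost=134217728 l2arc_write_max=67108864 zfs_arc_max=4294967296
EOF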

L2ARC Impacts on TPCC Results

So, what happens to the TPCC transaction rate when we add an L2ARC? Since the dataset is copied over before every run, the L2ARC is fully warm at the beginning of a run. The figure below shows the ZFS results with and without an L2ARC in front of Azure SSD premium storage.


TPCC performance on ZFS with a L2ARC

The difference is almost incredible. Since the whole compressed dataset fits into the L2ARC, the behavior is somewhat similar to the direct use of the local ephemeral device. Actually, since the write load is now sent to the SSD premium storage, the performance is even higher. However, after 4000s, the performance starts to degrade.

From what I found, this is caused by the thread feeding the L2ARC (l2arc_feed). As pages are updated by the TPCC workload, they are eventually flushed at a high rate to the storage. The L2ARC feed thread has to scan the ARC LRU to find suitable records before they are evicted. It then writes them to the local ephemeral device and updates the pointers in the ARC. Even if the write latency of the local ephemeral device is low, it is significant, and it greatly limits the amount of work a single feed thread can do. Ideally, ZFS should be able to use more than a single L2ARC feed thread.
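
To observe this behavior, the L2ARC counters in arcstats are convenient; a few of the relevant fields can be pulled out like this (field names as exposed by ZFS on Linux):

# L2ARC size, header overhead in the ARC, and hit/miss counters
grep -E '^l2_(size|asize|hdr_size|hits|misses)' /proc/spl/kstat/zfs/arcstats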

In the event you end up in such a situation with a degraded L2ARC, you can refresh it when the write load goes down. Just run the following command when activity is low:

# tar c /var/lib/mysql/data > /dev/null

It is important to keep in mind that a read-intensive or a moderately write-intensive workload will not degrade as much over time as the TPCC benchmark used here. Essentially, if a replica with one or even a few (2 or 3) replication threads can keep up with the write load, the ZFS L2ARC feed thread will also be able to keep up.

Comparison with bcache

The ZFS L2ARC is not the only option to use a local ephemeral device as a read cache; there are other options like bcache and flashcache. Since bcache is now part of the Linux kernel, we’ll focus on it.

bcache is used as an ext4 read cache extension. Its content is uncompressed, unlike the L2ARC. The dataset is much larger than the size of the local ephemeral device, so the impact is expected to be smaller.
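
For context, here is roughly how such a bcache device is assembled with bcache-tools; the device names are placeholders and this is only a sketch of the approach, not the exact benchmark configuration:

# /dev/sdc is the backing disk, /dev/sdb the local ephemeral cache device
make-bcache -B /dev/sdc -C /dev/sdb

# format the resulting bcache device with ext4 and mount it for MySQL
mkfs.ext4 /dev/bcache0
mount /dev/bcache0 /var/lib/mysql

# the default writethrough cache mode keeps bcache acting as a read cache
cat /sys/block/bcache0/bcache/cache_mode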

Comparison of the TPCC transaction rate between bcache and L2ARC

As we can see in the above figure, that is exactly what we observe. The transaction rate with bcache is lower than with the L2ARC because less data is cached. Over the 2-hour period, the L2ARC yielded more than twice as many transactions as bcache. However, bcache is not without merit: it did help ext4 increase its performance by about 43%.

How to Recreate L2ARC if Missing

By nature, local ephemeral devices are… ephemeral. When a virtual machine is restarted, it could end up on a different host. In such a case, the L2ARC data on the local ephemeral device is lost. Since it is only a read cache, it doesn’t prevent ZFS from starting but you get a pool status similar to this:

# zpool status
  pool: bench
 state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
    	the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://zfsonlinux.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        bench         ONLINE       0     0     0
          sdc         ONLINE       0     0     0
        cache
          /dev/sdb    UNAVAIL      0     0     0  cannot open

In such a case, the L2ARC can easily be fixed with:

# zpool remove bench /dev/sdb
# zpool add bench cache /dev/sdb

These commands should be called from a startup script to ensure the L2ARC is sane after a restart.

Conclusion

In this post, we have explored the great potential of local ephemeral devices. These devices are a means to improve MySQL performance and reduce the cost of cloud hosting. Whether the device is used directly or as a caching layer, ZFS data compression and architecture allowed nearly triple the number of TPCC transactions to be executed over a 2-hour period.

There are still a few ZFS-related topics I’d like to cover in the near future. Those posts may not come in this order, but the topics are: “Comparison with InnoDB compression”, “Comparison with BTRFS”, and “ZFS tuning for MySQL”. If some of these titles raise your interest, stay tuned.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today
