3 keys to pricing early-stage SaaS products

I’ve met hundreds of founders over the years, and most, particularly early-stage founders, share one common go-to-market gripe: Pricing.

For enterprise software, traditional pricing methods like per-seat models are often easier to figure out for products that are hyperspecific, especially those used by people in essentially the same way, such as Zoom or Slack. However, it’s a different ballgame for startups that offer services or products that are more complex.

Most startups struggle with a per-seat model because their products, unlike Zoom and Slack, are used in a litany of ways. Salesforce, for example, employs regular seat licenses and admin licenses — customers can opt for lower pricing for solutions that have low-usage parts — while other products are priced based on negotiation as part of annual renewals.

You may have a strong champion in a CIO you’re selling to or a very friendly person handling procurement, but it won’t matter if the pricing can’t be easily explained and understood. Complicated or unclear pricing adds more friction.

Early pricing discussions should center around the buyer’s perspective and the value the product creates for them. It’s important for founders to think about the output and the outcome, and a number they can reasonably defend to customers moving forward. Of course, self-evaluation is hard, especially when you’re asking someone else to pay you for something you’ve created.

This process will take time, so here are three tips to smoothen the ride.

Pricing is a journey

Pricing is not a fixed exercise. The enterprise software business involves a lot of intangible aspects, and a software product’s perceived value, quality, and user experience can be highly variable.

The pricing journey is long and, despite what some founders might think, jumping headfirst into customer acquisition isn’t the first stop. Instead, step one is making sure you have a fully fledged product.

If you’re a late-seed or Series A company, you’re focused on landing those first 10-20 customers and racking up some wins to showcase in your investor and board deck. But when you grow your organization to the point where the CEO isn’t the only person selling, you’ll want to have your go-to-market position figured out.

Many startups fall into the trap of thinking: “We need to figure out what pricing looks like, so let’s ask 50 hypothetical customers how much they would pay for a solution like ours.” I don’t agree with this approach, because the product hasn’t been finalized yet. You haven’t figured out product-market fit or product messaging and you want to spend a lot of time and energy on pricing? Sure, revenue is important, but you should focus on finding the path to accruing revenue versus finding a strict pricing model.


Real-time database platform SingleStore raises $80M more, now at a $940M valuation

Organizations are swimming in data these days, and so solutions to help manage and use that data in more efficient ways will continue to see a lot of attention and business. In the latest development, SingleStore — which provides a platform to enterprises to help them integrate, monitor and query their data as a single entity, regardless of whether that data is stored in multiple repositories — is announcing another $80 million in funding, money that it will be using to continue investing in its platform, hiring more talent and overall business expansion. Sources close to the company tell us that the company’s valuation has grown to $940 million.

The round, a Series F, is being led by Insight Partners, with new investor Hewlett Packard Enterprise, and previous backers Khosla Ventures, Dell Technologies Capital, Rev IV, Glynn Capital and GV (formerly Google Ventures) also participating. The startup has to date raised $264 million, including most recently an $80 million Series E last December, just on the heels of rebranding from MemSQL.

The fact that there are three major strategic investors in this Series F — HPE, Dell and Google — may say something about the traction that SingleStore is seeing, but so too do its numbers: 300%+ increase in new customer acquisition for its cloud service and 150%+ year-over-year growth in cloud.

Raj Verma, SingleStore’s CEO, said in an interview that its cloud revenues have grown by 150% year over year and now account for some 40% of all revenues (up from 10% a year ago). New customer numbers, meanwhile, have grown by over 300%.

“The flywheel is now turning around,” Verma said. “We didn’t need this money. We’ve barely touched our Series E. But I think there has been a general sentiment among our board and management that we are now ready for the prime time. We think SingleStore is one of the best-kept secrets in the database market. Now we want to aggressively be an option for people looking for a platform for intensive data applications or if they want to consolidate databases to one from three, five or seven repositories. We are where the world is going: real-time insights.”

With database management and the need for more efficient and cost-effective tools to manage that becoming an ever-growing priority — one that definitely got a fillip in the last 18 months with COVID-19 pushing people into more remote working environments. That means SingleStore is not without competitors, with others in the same space, including Amazon, Microsoft, Snowflake, PostgreSQL, MySQL, Redis and more. Others like Firebolt are tackling the challenges of handing large, disparate data repositories from another angle. (Some of these, I should point out, are also partners: SingleStore works with data stored on AWS, Microsoft Azure, Google Cloud Platform and Red Hat, and Verma describes those who do compute work as “not database companies; they are using their database capabilities for consumption for cloud compute.”)

But the company has carved a place for itself with enterprises and has thousands now on its books, including GE, IEX Cloud, Go Guardian, Palo Alto Networks, EOG Resources and SiriusXM + Pandora.

“SingleStore’s first-of-a-kind cloud database is unmatched in speed, scale, and simplicity by anything in the market,” said Lonne Jaffe, managing director at Insight Partners, in a statement. “SingleStore’s differentiated technology allows customers to unify real-time transactions and analytics in a single database.” Vinod Khosla from Khosla Ventures added that “SingleStore is able to reduce data sprawl, run anywhere, and run faster with a single database, replacing legacy databases with the modern cloud.”


Performance of Various EBS Storage Types in AWS

EBS Storage Types in AWS

EBS Storage Types in AWSEBS storage type choices in AWS can be impacted by a lot of factors. As a consultant, I get a lot of questions about choosing the best storage type for a workload. Let me share a few examples. Is io2 better than gp2/3 if the configured iops are the same? What can I expect when upgrading gp2 to gp3?

In order to be able to answer questions like this, in this blog post, we will take a deeper look. We will compare storage devices that are “supposed to be the same”, in order to reveal the differences between these storage types. We will examine the following storage devices:

    1 TB gp2 volume (has 3000 iops by definition)
    1 TB gp3 volume, with the iops set to 3000
    1 TB io1 volume, with the iops set to 3000
    1 TB io2 volume, with the iops set to 3000

So, all the volumes are 1TB with 3000 iops, so in theory, they are the same. Also, in theory, theory and practice are the same, but in practice, they are different. Storage performance is more complex than just capacity and the number of iops, as we will see soon. Note that this test is very limited to draw conclusions like io1 is better than gp2 or anything like that in general. These devices have very different scalability characteristics (the io devices are scaling to 64k iops, while the maximum for the gp devices is 16k). Measuring the scalability of these devices and testing them in the long run and in different availability zones are out of scope for these tests. The reason I chose devices that have the same “specs” is to gain an understanding of the difference in their behavior. The tests were only run in a single availability zone (eu-west-1a).

For the tests, I used sysbench fileio, with the following prepare command.

sysbench --test=fileio \
--file-total-size=700G \
--threads=16 \
--file-num=64 \
--file-block-size=16384 \

The instances I used were r5.xlarge instances, which have up to 4750 Mbps bandwidth to EBS.

I used the following command to run the tests:

sysbench fileio \
--file-total-size=700G \
--time=1800 \
--max-requests=0 \
--threads=${th} \
--file-num=64 \
--file-io-mode=sync \
--file-test-mode=${test_mode} \
--file-extra-flags=direct \
--file-fsync-freq=0 \
--file-block-size=16384 \
--report-interval=1 \

In this command, the test mode can be rndwr (random writes only), rndrd (random reads only), and rndwr (random reads and writes mixed). The number of threads used were 1, 2, 4, 8, 16, 32, 64, and 128. All tests are using 16k io operations with direct io enabled (bypassing the filesystem cache), based on this, the peak theoretical throughput of the tests is 16k*3000 = 48 MB/s.

Random Writes

sysbench random writes

The gp2 and io1 devices reached the peak throughput for this benchmark with 4 threads and the gp3 reached it with 2 threads (but with a larger variance). The io2 device has more consistent performance overall. The peak throughput in these tests is the expected peak throughput (16k*3000 iops = 46.8MB/sec).

sysbench random mixed read/write latency

At a low thread count, gp3 has the highest variation in latency, gp2’s performance is more consistent. The latencies of io1 and io2 are more consistent, especially io2 at a higher thread count.

This means if the workload is mostly writes:

– Prefer gp3 over gp2 (better performance, less price).
– Prefer io2 if the price is worth the consistency in performance at lower thread counts.
– If the workload is multithreaded, and there are always more than 4 threads, prefer gp3 (in this case, the performance is the same, gp3 is the cheapest option).

Random Reads

sysbench random reads

The random read throughput shows a much bigger difference than writes. First of all, the performance is more inconsistent in the case of gp2 and gp3, but gp2 seems to be slightly more consistent. The io2 device has the same consistent performance even with a single thread.

sysbench random read latency

Similarly, there is a much bigger variance in latency in the case of low thread counts between the gp2 and the gp3. Even at 64 threads, the io2 device has very consistent latency characteristics.

This means if the workload is mostly reads:

– The gp2 volumes can give slightly better performance, but they are also slightly more expensive.
– Above 16 parallel threads, the devices are fairly similar, prefer gp3 because of the price.
– Prefer io2 if performance and latency are important with a low thread count (even over io1).

Random Mixed Reads/Writes

random mixed reads/writes

The mixed workload behavior is similar to the random read one, so the variance in the read performance will also show as a variance in the write performance. The more reads are added to the mix, the inconsistent the performance will become with the gp2/gp3 volumes. The io1 volume reaches peak throughput even with two threads, but with a high variance.

In the case of the mixed workload, the gp3 has the least consistent performance. This can come as an unpleasant surprise when the volumes are upgraded to gp3, and the workload has a low concurrency. This can be an issue for not loaded, but latency-sensitive applications. Otherwise, for choosing storage, the same advice applies to random reads.


The difference between these seemingly similar devices is greatest when a low number of threads are used against the device. If the io workload is parallel enough, the devices behave very similarly.

The raw data for these measurements are available on GitHub:


Cloud infrastructure market kept growing in Q2, reaching $42B

It’s often said in baseball that a prospect has a high ceiling, reflecting the tremendous potential of a young player with plenty of room to get better. The same could be said for the cloud infrastructure market, which just keeps growing, with little sign of slowing down any time soon. The market hit $42 billion in total revenue with all major vendors reporting, up $2 billion from Q1.

Synergy Research reports that the revenue grew at a speedy 39% clip, the fourth consecutive quarter that it has increased. AWS led the way per usual, but Microsoft continued growing at a rapid pace and Google also kept the momentum going.

AWS continues to defy market logic, actually increasing growth by 5% over the previous quarter at 37%, an amazing feat for a company with the market maturity of AWS. That accounted for $14.81 billion in revenue for Amazon’s cloud division, putting it close to a $60 billion run rate, good for a market leading 33% share. While that share has remained fairly steady for a number of years, the revenue continues to grow as the market pie grows ever larger.

Microsoft grew even faster at 51%, and while Microsoft cloud infrastructure data isn’t always easy to nail down, with 20% of market share according to Synergy Research, that puts it at $8.4 billion as it continues to push upward with revenue up from $7.8 billion last quarter.

Google too continued its slow and steady progress under the leadership of Thomas Kurian, leading the growth numbers with a 54% increase in cloud revenue in Q2 on revenue of $4.2 billion, good for 10% market share, the first time Google Cloud has reached double figures in Synergy’s quarterly tracking data. That’s up from $3.5 billion last quarter.

Synergy Research cloud infrastructure market share chart.

Image Credits: Synergy Research

After the Big 3, Alibaba held steady over Q1 at 6% (but will only report this week), with IBM falling a point from Q1 to 4% as Big Blue continues to struggle in pure infrastructure as it makes the transition to more of a hybrid cloud management player.

John Dinsdale, chief analyst at Synergy, says that the Big 3 are spending big to help fuel this growth. “Amazon, Microsoft and Google in aggregate are typically investing over $25 billion in capex per quarter, much of which is going towards building and equipping their fleet of over 340 hyperscale data centers,” he said in a statement.

Meanwhile, Canalys had similar numbers, but saw the overall market slightly higher at $47 billion. Their market share broke down to Amazon with 31%, Microsoft with 22% and Google with 8% of that total number.

Canalys analyst Blake Murray says that part of the reason companies are shifting workloads to the cloud is to help achieve environmental sustainability goals as the cloud vendors are working toward using more renewable energy to run their massive data centers.

“The best practices and technology utilized by these companies will filter to the rest of the industry, while customers will increasingly use cloud services to relieve some of their environmental responsibilities and meet sustainability goals,” Murray said in a statement.

Regardless of whether companies are moving to the cloud to get out of the data center business or because they hope to piggyback on the sustainability efforts of the Big 3, companies are continuing a steady march to the cloud. With some estimates of worldwide cloud usage at around 25%, the potential for continued growth remains strong, especially with many markets still untapped outside the U.S.

That bodes well for the Big 3 and for other smaller operators who can find a way to tap into slices of market share that add up to big revenue. “There remains a wealth of opportunity for smaller, more focused cloud providers, but it can be hard to look away from the eye-popping numbers coming out of the Big 3,” Dinsdale said.

In fact, it’s hard to see the ceiling for these companies any time in the foreseeable future.


The 1-2-3 for PostgreSQL Logical Replication Using an RDS Snapshot

PostgreSQL Logical Replication RDS

We have several PostgreSQL versions that support logical decoding to replicate data changes from a source database to a target database, which is a cool and very powerful tool that gives the option to replicate all the tables in a database, only one schema, a specific set of tables or even only some columns/rows, also is a helpful method for version upgrades since the target database can run on a different (minor or major) PostgreSQL version.

PostgreSQL Logical Replication

Image from:

There are some cases when the databases have been hosted in the AWS Relational Database Service (RDS) which is the fully auto-managed solution offered by Amazon Web Services, there is no secret that choosing this option for our database backend comes with a level of vendor lock-in, and even when RDS offers some build-in replica solutions such as Multi-AZ or read-replicas sometimes we can take advantage of the benefits from logical replication.

In this post I will describe the simplest and basic steps I used to implement this replica solution avoiding the initial copy data from the source database to the target, creating the target instance from an RDS snapshot. Certainly, you can take advantage of this when you work with a big/huge data set and the initial copy could lead to high timeframes or network saturation.   

NOTE: The next steps were tested and used for a specific scenario and they are not intended to be an any-size solution, rather give some insight into how this can be made and most importantly, to stimulate your own creative thinking.  

The Scenario

Service Considerations

In this exercise, I wanted to perform a version upgrade from PostgreSQL v11.9 to PostgreSQL v12.5, we can perform a direct upgrade using the build-in option RDS offers, but that requires a downtime window that can vary depending on some of the next:

  • Is Multi-AZ enabled?
  • Are the auto backups enabled?
  • How transactional is the source database?

During the direct upgrade process, RDS takes a couple of new snapshots of the source instance, firstly at the beginning of the upgrade and finally when all the modifications are done, depending on how old is the previous backup and how many changes have been made on the datafiles the pre backup could take some time. Also, if the instance is Multi-AZ the process should upgrade both instances, which adds more time for the upgrade, during most of these actions the database remains inaccessible.

The next is a basic diagram of how an RDS Multi-AZ instance looks, all the client requests are sent to the master instance, while the replica is not accessible and some tasks like the backups are executed on it.

PostgreSQL Logical Replication on RDS

Therefore, I choose logical replication as the mechanism to achieve the objective, we can aim for a quicker switch-over if we create the new instance in the desired version and just replicate all the data changes, then we need a small downtime window just to move the traffic from the original instance to the upgraded new one.


To be able to perform these actions we would need:

  • An AWS user/access that can operate the DB instances, take DB snapshots and upgrade and restore them.
  • The AWS user also should be able to describe and create DB PARAMETER GROUPS.
  • A DB user with enough privileges to create the PUBLICATION on source and SUBSCRIPTION on target also is advisable to create a dedicated replication user with the minimum permissions. 

The 1-2-3 Steps

Per the title of this post, the next is the list of steps to set up a PostgreSQL logical replication between a PostgreSQL v11.9 and a v12.5 using an RDS snapshot to initialize the target database. 

  1. Verify the PostgreSQL parameters for logical replication
  2. Create the replication user and grant all the required privileges
  3. Create the PUBLICATION
  5. Create a new RDS snapshot 
  6. Upgrade the RDS snapshot to the target version
  7. Restore the upgraded RDS snapshot
  8. Get the LSN position 
  9. Create the SUBSCRIPTION
  10. Advance the SUBSCRIPTION 
  11. Enable the SUBSCRIPTION

Source Database Side

1. Verify the PostgreSQL parameters for logical replication

We require the next PostgreSQL parameters for this exercise

demodb=> select name,setting from pg_settings where name in (
        'max_wal_senders') ;
          name          | setting
 max_replication_slots  | 10
 max_wal_senders        | 10
 max_worker_processes   | 10
 track_commit_timestamp | on
 wal_level              | logical
(5 rows)

NOTE: The parameter track_commit_timestamp can be optional since in some environments is not advisable for the related overhead, but it would help to track and resolve any conflict that may occur when the subscriptions are started.

2. Create the replication user and grant all the required privileges

demodb=> CREATE USER pgrepuser WITH password 'SECRET';
demodb=> GRANT rds_replication TO pgrepuser;
demodb=> GRANT SELECT ON ALL TABLES IN SCHEMA public TO pgrepuser;

3. Create the PUBLICATION

demodb=> CREATE PUBLICATION pglogical_rep01 FOR ALL TABLES;


demodb=> SELECT pg_create_logical_replication_slot('pglogical_rep01', 'pgoutput');
(1 row)


5. Create a new RDS snapshot 

aws rds create-db-snapshot \
    --db-instance-identifier demodb-postgres\
    --db-snapshot-identifier demodb-postgres-to-125

6. Upgrade the RDS snapshot to the target version

aws rds modify-db-snapshot \
    --db-snapshot-identifier demodb-postgres-to-125 \
    --engine-version 12.5

7. Restore the upgraded RDS snapshot 

Since we are moving from version 11.9 to 12.5 we may need to create a new DB parameter group if we are using some custom parameters. 
From the instance describe we can verify the current parameter group

aws rds describe-db-instances \
        --db-instance-identifier demodb-postgres \| 
jq '.DBInstances | map({DBInstanceIdentifier: .DBInstanceIdentifier, DBParameterGroupName: .DBParameterGroups[0].DBParameterGroupName})'
    "DBInstanceIdentifier": "demodb-postgres",
    "DBParameterGroupName": "postgres11-logicalrep"

Then we can validate the custom parameters 

aws rds describe-db-parameters \
	--db-parameter-group-name postgres11-logicalrep \
	--query "Parameters[*].[ParameterName,ParameterValue]" \
	--source user --output text 
track_commit_timestamp	1

We need to create a new parameter group in the target version

aws rds create-db-parameter-group \
	--db-parameter-group-name postgres12-logicalrep \
	--db-parameter-group-family postgres12

Finally, we need to modify the parameters we got before in the new parameter group

aws rds modify-db-parameter-group \
	--db-parameter-group-name postgres12-logicalrep \
	--parameters "ParameterName='track_commit_timestamp',ParameterValue=1,ApplyMethod=immediate"

Now we can use the new parameter group to restore the upgraded snapshot

aws rds restore-db-instance-from-db-snapshot \
	--db-instance-identifier demodb-postgres-125 \
	--db-snapshot-identifier demodb-postgres-to-125 \
	--db-parameter-group-name postgres12-logicalrep

8. Get the LSN position from the target instance log

To list all the database logs for the new DB instance

aws rds describe-db-log-files \
	--db-instance-identifier demodb-postgres-125

We should pick the latest database log

aws rds download-db-log-file-portion \
	--db-instance-identifier demodb-postgres-125 \
	--log-file-name "error/postgresql.log.2021-03-23-18"

From the retrieved log portion we need to find the value after for the log entry redo done at:

2021-03-23 18:19:58 UTC::@:[5212]:LOG:  redo done at 3E/50000D08

Target Database Side


demodb=> CREATE SUBSCRIPTION pglogical_sub01 CONNECTION ' port=5432 dbname=demodb user=pgrepuser password=SECRET' PUBLICATION pglogical_rep01
  copy_data = false,
  create_slot = false,
  enabled = false,
  connect = true,
  slot_name = 'pglogical_rep01'

10. Advance the SUBSCRIPTION 

We need to get the subscription id

demodb=> SELECT 'pg_'||oid::text AS "external_id"
FROM pg_subscription 
WHERE subname = 'pglogical_sub01';
(2 rows)

Now advance the subscription to the LSN we got in step 8

demodb=> SELECT pg_replication_origin_advance('pg_73750', '3E/50000D08') ;
(1 row)

11. Enable the SUBSCRIPTION

demodb=> ALTER SUBSCRIPTION pglogical_sub01 ENABLE;

Once we are done with all the steps the data changes should flow from the source database to the target, we can check the status at the pg_stat_replication view. 


Choosing DBaaS from cloud vendors bring some advantages and can speed up some implementations, but they come with some costs, and not all the available tools or solutions fits all the requirements, that is why always is advisable to try some different approaches and think out of the box, technology can go so far as our imagination. 


Scaling CockroachDB in the red ocean of relational databases

Most database startups avoid building relational databases, since that market is dominated by a few goliaths. Oracle, MySQL and Microsoft SQL Server have embedded themselves into the technical fabric of large- and medium-size companies going back decades. These established companies have a lot of market share and a lot of money to quash the competition.

So rather than trying to compete in the relational database market, over the past decade, many database startups focused on alternative architectures such as document-centric databases (like MongoDB), key-value stores (like Redis) and graph databases (like Neo4J). But Cockroach Labs went against conventional wisdom with CockroachDB: It intentionally competed in the relational database market with its relational database product.

While it did face an uphill battle to penetrate the market, Cockroach Labs saw a surprising benefit: It didn’t have to invent a market. All it needed to do was grab a share of a market that also happened to be growing rapidly.

Cockroach Labs has a bright future, compelling technology, a lot of money in the bank and has an experienced, technically astute executive team.

In previous parts of this EC-1, I looked at the origins of CockroachDB, presented an in-depth technical description of its product as well as an analysis of the company’s developer relations and cloud service, CockroachCloud. In this final installment, we’ll look at the future of the company, the competitive landscape within the relational database market, its ability to retain talent as it looks toward a potential IPO or acquisition, and the risks it faces.

CockroachDB’s success is not guaranteed. It has to overcome significant hurdles to secure a profitable place for itself among a set of well-established database technologies that are owned by companies with very deep pockets.

It’s not impossible, though. We’ll first look at MongoDB as an example of how a company can break through the barriers for database startups competing with incumbents.

When life gives you Mongos, make MongoDB

Dev Ittycheria, MongoDB CEO, rings the Nasdaq Stock Market Opening Bell. Image Credits: Nasdaq, Inc

MongoDB is a good example of the risks that come with trying to invent a new database market. The company started out as a purely document-centric database at a time when that approach was the exception rather than the rule.

Web developers like document-centric databases because they address a number of common use cases in their work. For example, a document-centric database works well for storing comments to a blog post or a customer’s entire order history and profile.


3 analysts weigh in: What are Andy Jassy’s top priorities as Amazon’s new CEO?

It’s not easy following a larger-than-life founder and CEO of an iconic company, but that’s what former AWS CEO Andy Jassy faces this week as he takes over for Jeff Bezos, who moves into the executive chairman role. Jassy must deal with myriad challenges as he becomes the head honcho at the No. 2 company on the Fortune 500.

How he handles these challenges will define his tenure at the helm of the online retail giant. We asked several analysts to identify the top problems he will have to address in his new role.

Ensure a smooth transition

Handling that transition smoothly and showing investors and the rest of the world that it’s business as usual at Amazon is going to be a big priority for Jassy, said Robin Ody, an analyst at Canalys. He said it’s not unlike what Satya Nadella faced when he took over as CEO at Microsoft in 2014.

Handling the transition smoothly and showing investors and the rest of the world that it’s business as usual at Amazon is going to be a big priority for Jassy.

“The biggest task is that you’re following Jeff Bezos, so his overarching issue is going to be stability and continuity. … The eyes of the world are on that succession. So managing that I think is the overall issue and would be for anyone in the same position,” Ody said.

Forrester analyst Sucharita Kodali said Jassy’s biggest job is just to keep the revenue train rolling. “I think the biggest to-do is to just continue that momentum that the company has had for the last several years. He has to make sure that they don’t lose that. If he does that, I mean, he will win,” she said.

Maintain company growth

As an online retailer, the company has thrived during COVID, generating $386 billion in revenue in 2020, up more than $100 billion over the prior year. As Jassy takes over and things return to something closer to normal, will he be able to keep the revenue pedal to the metal?


Streaming MySQL Binlogs to S3 (or Any Object Storage)

Streaming MySQL Binlogs to S3

Streaming MySQL Binlogs to S3Problem Statement

Having backups of binary logs is fairly normal these days. The more recent binary logs are copied offsite, the better RPO (Recovery Point Objective) can be achieved. I was asked multiple times recently if something could be done to “stream” the binary logs to S3 as close to real-time as possible. Unfortunately, there is no readily available solution that would do that. Here, I show what can be done and also show a proof of concept implementation, which is not suitable for production use.


In this example, the instance has two binary log files (mysql-bin.000001 and mysql-bin.000002) already closed and mysql-bin.000003 being written. A trivial solution for backing up these binary log files would be to back up just the closed ones (the one that is not written). The default size of the binary log file is 1 GB. This means with this solution we would have a 1 GB binlog not backed up in the worst-case scenario. On average, we would have 500M binary logs not backed up. These numbers can be made better by lowering the max_binlog_size parameter, but that will lead to a slew of files and frequent rotation.

Uploading to Object Storage

It is possible to upload files to S3 in chunks using multipart uploads. With this the file can be uploaded in chunks, the minimum chunk size is 5 MB. This means that a binlog can be read by another process while it’s written, and uploaded to S3 in 5 MB chunks. That’s definitely better than the 500 MB of the file copying or setting the max_binlog_size to 5M. Another thing that our backup solution could do is stream the binary logs to a remote location before it uploads them to S3.


The chunks are assembled together again on S3 when the multipart upload is finished.

The files produced by the mysqlbinlog command can be read and if we have 5M, they can be uploaded to S3. The last chunk of a multipart upload can be less than 5M. With this, it’s guaranteed that the file can be uploaded. The file is read in chunks while it’s written.

Proof of Concept Implementation

The flow of the proof of concept implementation is the following.

  • Start a mysqlbinlog process to stream the backup to a temporary directory. This is in the background.
  • Read the files in chunks in the temporary directory. It might happen that a read() call will return with less than 5MB worth of data. In order to handle this case, there is a buffer for the read call. If the buffer reaches the minimum chunk size (5M) for multipart upload, we will upload it. This means that it can happen that 4.5M is already read with several small reads to the buffer, and the next read() call will be able to read 5M. In this case, the size of that chunk will be 9.5M. This is totally fine, the chunks can be variable in size. The goal is to upload the data as soon as possible, so it’s better to do it in one request. This means that in this proof of concept implementation, the chunk sizes will be between 5M and 10M.
  • Once the end of the file is reached, the final part is uploaded regardless of size, and the file will be closed, a chunk from the next file will be read next. The final part in a multipart upload can be less than 5M. After a file is successfully uploaded to S3 in full, the file is deleted from the local temp directory. So, the local temp directory holds files that are either being uploaded, or they didn’t start to upload yet.
  • If the reader is on the last, not closed file, it will just wait for more data, when the buffer fills, it will continue to upload parts.


In this example, I have a server with two binlogs:

mysql> show binary logs;
| Log_name | File_size | Encrypted |
| mysql-bin.000001 | 105775625 | No |
| mysql-bin.000002 | 85147151 | No |
2 rows in set (0.00 sec)

The max_binlog_size is 100M for the sake of convenience.

$ binlog2s3 --binary /usr/local/bin/mysqlbinlog --hostname --port 3306 --username repl --password repl --start-file mysql-bin.000001 --tempdir /Users/pboros/tmpdir --bucket_name pboros-binlogtest
Waiting for binlog files to appear
2021-07-01 17:45:41.672730 Creating multipart uploader for mysql-bin.000001
2021-07-01 17:45:42.460344 Uploading part 1 for mysql-bin.000001 size 5242880
2021-07-01 17:45:51.465913 Uploading part 2 for mysql-bin.000001 size 5242880

The temporary directory has the binary logs:

$ ls -la
total 372896
drwxr-xr-x 4 pboros staff 128 Jul 1 17:45 .
drwxr-xr-x+ 73 pboros staff 2336 Jun 30 18:04 ..
-rw-r----- 1 pboros staff 105256799 Jul 1 17:45 mysql-bin.000001
-rw-r----- 1 pboros staff 85663391 Jul 1 17:45 mysql-bin.000002

In this case, streaming the binary logs from the beginning is much faster than uploading them to S3 (because I am streaming from a virtual machine locally, and I am uploading to S3 on a home internet connection).

Soon enough the binlog will be uploaded:

2021-07-01 17:48:23.865630 Uploading part 19 for mysql-bin.000001 size 5242880
2021-07-01 17:48:33.350739 Uploading part 20 for mysql-bin.000001 size 5242880
2021-07-01 17:48:41.708166 Uploading part 21 for mysql-bin.000001 size 399199
2021-07-01 17:48:42.160303 Finishing multipart upload for mysql-bin.000001
2021-07-01 17:48:42.407308 Creating multipart uploader for mysql-bin.000002
2021-07-01 17:48:43.521756 Uploading part 1 for mysql-bin.000002 size 5242880
2021-07-01 17:48:52.517424 Uploading part 2 for mysql-bin.000002 size 5242880

Part 17 will be bigger because it has less than a 5M buffer from the time when there were new binary logs, and when new data became available. It could read an additional 5M on top of that.

$ ls -la
total 593496
drwxr-xr-x 5 pboros staff 160 Jul 1 17:52 .
drwxr-xr-x+ 73 pboros staff 2336 Jun 30 18:04 ..
-rw-r----- 1 pboros staff 105267370 Jul 1 17:52 mysql-bin.000002
-rw-r----- 1 pboros staff 105255295 Jul 1 17:52 mysql-bin.000003
-rw-r----- 1 pboros staff 66061395 Jul 1 17:52 mysql-bin.000004

$ aws s3 ls s3://pboros-binlogtest/
2021-07-01 17:45:43 105256799 mysql-bin.000001
2021-07-01 17:48:43 105267370 mysql-bin.000002

The uploaded parts are accessible with the S3 API (and they can be assembled to binlogs):

$ aws s3api list-multipart-uploads --bucket pboros-binlogtest

The S3 bucket can have a policy to auto-delete not finished multipart uploads periodically (for example unfinished multipart uploads that are older than 7 days).

The proof of concept code is available at


The single vendor requirement ultimately doomed the DoD’s $10B JEDI cloud contract

When the Pentagon killed the JEDI cloud program yesterday, it was the end of a long and bitter road for a project that never seemed to have a chance. The question is why it didn’t work out in the end, and ultimately I think you can blame the DoD’s stubborn adherence to a single vendor requirement, a condition that never made sense to anyone, even the vendor that ostensibly won the deal.

In March 2018, the Pentagon announced a mega $10 billion, decade-long cloud contract to build the next generation of cloud infrastructure for the Department of Defense. It was dubbed JEDI, which aside from the Star Wars reference, was short for Joint Enterprise Defense Infrastructure.

The idea was a 10-year contract with a single vendor that started with an initial two-year option. If all was going well, a five-year option would kick in and finally a three-year option would close things out with earnings of $1 billion a year.

While the total value of the contract had it been completed was quite large, a billion a year for companies the size of Amazon, Oracle or Microsoft is not a ton of money in the scheme of things. It was more about the prestige of winning such a high-profile contract and what it would mean for sales bragging rights. After all, if you passed muster with the DoD, you could probably handle just about anyone’s sensitive data, right?

Regardless, the idea of a single-vendor contract went against conventional wisdom that the cloud gives you the option of working with the best-in-class vendors. Microsoft, the eventual winner of the ill-fated deal acknowledged that the single vendor approach was flawed in an interview in April 2018:

Leigh Madden, who heads up Microsoft’s defense effort, says he believes Microsoft can win such a contract, but it isn’t necessarily the best approach for the DoD. “If the DoD goes with a single award path, we are in it to win, but having said that, it’s counter to what we are seeing across the globe where 80% of customers are adopting a multicloud solution,” Madden told TechCrunch.

Perhaps it was doomed from the start because of that. Yet even before the requirements were fully known there were complaints that it would favor Amazon, the market share leader in the cloud infrastructure market. Oracle was particularly vocal, taking its complaints directly to the former president before the RFP was even published. It would later file a complaint with the Government Accountability Office and file a couple of lawsuits alleging that the entire process was unfair and designed to favor Amazon. It lost every time — and of course, Amazon wasn’t ultimately the winner.

While there was a lot of drama along the way, in April 2019 the Pentagon named two finalists, and it was probably not too surprising that they were the two cloud infrastructure market leaders: Microsoft and Amazon. Game on.

The former president interjected himself directly in the process in August that year, when he ordered the Defense Secretary to review the matter over concerns that the process favored Amazon, a complaint which to that point had been refuted several times over by the DoD, the Government Accountability Office and the courts. To further complicate matters, a book by former defense secretary Jim Mattis claimed the president told him to “screw Amazon out of the $10 billion contract.” His goal appeared to be to get back at Bezos, who also owns the Washington Post newspaper.

In spite of all these claims that the process favored Amazon, when the winner was finally announced in October 2019, late on a Friday afternoon no less, the winner was not in fact Amazon. Instead, Microsoft won the deal, or at least it seemed that way. It wouldn’t be long before Amazon would dispute the decision in court.

By the time AWS re:Invent hit a couple of months after the announcement, former AWS CEO Andy Jassy was already pushing the idea that the president had unduly influenced the process.

“I think that we ended up with a situation where there was political interference. When you have a sitting president, who has shared openly his disdain for a company, and the leader of that company, it makes it really difficult for government agencies, including the DoD, to make objective decisions without fear of reprisal,” Jassy said at that time.

Then came the litigation. In November the company indicated it would be challenging the decision to choose Microsoft charging that it was was driven by politics and not technical merit. In January 2020, Amazon filed a request with the court that the project should stop until the legal challenges were settled. In February, a federal judge agreed with Amazon and stopped the project. It would never restart.

In April the DoD completed its own internal investigation of the contract procurement process and found no wrongdoing. As I wrote at the time:

While controversy has dogged the $10-billion, decade-long JEDI contract since its earliest days, a report by the DoD’s inspector general’s office concluded today that, while there were some funky bits and potential conflicts, overall the contract procurement process was fair and legal and the president did not unduly influence the process in spite of public comments.

Last September the DoD completed a review of the selection process and it once again concluded that Microsoft was the winner, but it didn’t really matter as the litigation was still in motion and the project remained stalled.

The legal wrangling continued into this year, and yesterday the Pentagon finally pulled the plug on the project once and for all, saying it was time to move on as times have changed since 2018 when it announced its vision for JEDI.

The DoD finally came to the conclusion that a single-vendor approach wasn’t the best way to go, and not because it could never get the project off the ground, but because it makes more sense from a technology and business perspective to work with multiple vendors and not get locked into any particular one.

“JEDI was developed at a time when the Department’s needs were different and both the CSPs’ (cloud service providers) technology and our cloud conversancy was less mature. In light of new initiatives like JADC2 (the Pentagon’s initiative to build a network of connected sensors) and AI and Data Acceleration (ADA), the evolution of the cloud ecosystem within DoD, and changes in user requirements to leverage multiple cloud environments to execute mission, our landscape has advanced and a new way ahead is warranted to achieve dominance in both traditional and nontraditional warfighting domains,” said John Sherman, acting DoD chief information officer in a statement.

In other words, the DoD would benefit more from adopting a multicloud, multivendor approach like pretty much the rest of the world. That said, the department also indicated it would limit the vendor selection to Microsoft and Amazon.

“The Department intends to seek proposals from a limited number of sources, namely the Microsoft Corporation (Microsoft) and Amazon Web Services (AWS), as available market research indicates that these two vendors are the only Cloud Service Providers (CSPs) capable of meeting the Department’s requirements,” the department said in a statement.

That’s not going to sit well with Google, Oracle or IBM, but the department further indicated it would continue to monitor the market to see if other CSPs had the chops to handle their requirements in the future.

In the end, the single vendor requirement contributed greatly to an overly competitive and politically charged atmosphere that resulted in the project never coming to fruition. Now the DoD has to play technology catch-up, having lost three years to the histrionics of the entire JEDI procurement process and that could be the most lamentable part of this long, sordid technology tale.


Dispense with the chasm? No way!

Jeff Bussgang, a co-founder and general partner at Flybridge Capital, recently wrote an Extra Crunch guest post that argued it is time for a refresh when it comes to the technology adoption life cycle and the chasm. His argument went as follows:

  1. VCs in recent years have drastically underestimated the size of SAMs (serviceable addressable markets) for their startup investments because they were “trained to think only a portion of the SAM is obtainable within any reasonable window of time because of the chasm.”
  2. The chasm is no longer the barrier it once was because businesses have finally understood that software is eating the world.
  3. As a result, the early majority has joined up with the innovators and early adopters to create an expanded early market. Effectively, they have defected from the mainstream market to cross the chasm in the other direction, leaving only the late majority and the laggards on the other side.
  4. That is why we now are seeing multiple instances of very large high-growth markets that appear to have no limit to their upside. There is no chasm to cross until much later in the life cycle, and it isn’t worth much effort to cross it then.

Now, I agree with Jeff that we are seeing remarkable growth in technology adoption at levels that would have astonished investors from prior decades. In particular, I agree with him when he says:

The pandemic helped accelerate a global appreciation that digital innovation was no longer a luxury but a necessity. As such, companies could no longer wait around for new innovations to cross the chasm. Instead, everyone had to embrace change or be exposed to an existential competitive disadvantage.

But this is crossing the chasm! Pragmatic customers are being forced to adopt because they are under duress. It is not that they buy into the vision of software eating the world. It is because their very own lunches are being eaten. The pandemic created a flotilla of chasm-crossings because it unleashed a very real set of existential threats.

The key here is to understand the difference between two buying decision processes, one governed by visionaries and technology enthusiasts (the early adopters and innovators), the other by pragmatists (the early majority).

The key here is to understand the difference between two buying decision processes, one governed by visionaries and technology enthusiasts (the early adopters and innovators), the other by pragmatists (the early majority). The early group makes their decisions based on their own analyses. They do not look to others for corroborative support. Pragmatists do. Indeed, word-of-mouth endorsements are by far the most impactful input not only about what to buy and when but also from whom.

Powered by WordPress | Theme: Aeros 2.0 by