Extra Crunch roundup: Antitrust jitters, SPAC odyssey, white-hot IPOs, more

Some time ago, I gave up on the idea of finding a thread that connects each story in the weekly Extra Crunch roundup; there are no unified theories of technology news.

The stories that left the deepest impression were related to two news pegs that dominated the week — Visa and Plaid calling off their $5.3 billion acquisition agreement, and sizzling-hot IPOs for Affirm and Poshmark.

Watching Plaid and Visa sing “Let’s Call The Whole Thing Off” in harmony after the U.S. Department of Justice filed a lawsuit to block their deal wasn’t shocking. But I was surprised to find myself editing an interview Alex Wilhelm conducted with Plaid CEO Zach Perret the next day in which the executive said growing the company on its own is “once again” the correct strategy.

Full Extra Crunch articles are only available to members
Use discount code ECFriday to save 20% off a one- or two-year subscription

In an analysis for Extra Crunch, Managing Editor Danny Crichton suggested that federal regulators’ new interest in antitrust enforcement will affect valuations going forward. For example, Procter & Gamble and women’s beauty D2C brand Billie also called off their planned merger last week after the Federal Trade Commission raised objections in December.

Given the FTC’s moves last year to prevent Billie and Harry’s from being acquired, “it seems clear that U.S. antitrust authorities want broad competition for consumers in household goods,” Danny concluded, and I suspect that applies to Plaid as well.

In December, C3.ai, DoorDash and Airbnb burst into the public markets to much acclaim. This week, used clothing marketplace Poshmark saw a 140% pop in its first day of trading and consumer-financing company Affirm “priced its IPO above its raised range at $49 per share,” reported Alex.

In a post titled “A theory about the current IPO market”, he identified eight key ingredients for brewing a debut with a big first-day pop, which includes “exist in a climate of near-zero interest rates” and “keep companies private longer.” Truly, words to live by!

Come back next week for more coverage of the public markets in The Exchange, an interview with Bustle CEO Bryan Goldberg where he shares his plans for taking the company public, a comprehensive post that will unpack the regulatory hurdles facing D2C consumer brands, and much more.

If you live in the U.S., enjoy your MLK Day holiday weekend, and wherever you are: Thanks very much for reading Extra Crunch.

Walter Thompson
Senior Editor, TechCrunch


Rapid growth in 2020 reveals OKR software market’s untapped potential

After spending much of the week covering 2021’s frothy IPO market, Alex Wilhelm devoted this morning’s column to studying the OKR-focused software sector.

Measuring objectives and key results is core to every enterprise, perhaps more so these days since knowledge workers began working remotely in greater numbers last year.

A sign of the times: This week, enterprise orchestration SaaS platform Gtmhub announced that it raised a $30 million Series B.

To get a sense of how large the TAM is for OKR, Alex reached out to several companies and asked them to share new and historical growth metrics:

  • Gtmhub
  • Perdoo
  • WorkBoard
  • Ally.io
  • Koan
  • WeekDone

“Some OKR-focused startups didn’t get back to us, and some leaders wanted to share the best stuff off the record, which we grant at times for candor amongst startup executives,” he wrote.

5 consumer hardware VCs share their 2021 investment strategies

For our latest investor survey, Matt Burns interviewed five VCs who actively fund consumer electronics startups:

  • Hans Tung, managing partner, GGV Capital
  • Dayna Grayson, co-founder and general partner, Construct Capital
  • Cyril Ebersweiler, general partner, SOSV
  • Bilal Zuberi, partner, Lux Capital
  • Rob Coneybeer, managing director, Shasta Ventures

“Consumer hardware has always been a tough market to crack, but the COVID-19 crisis made it even harder,” says Matt, noting that the pandemic fueled wide interest in fitness startups like Mirror, Peloton and Tonal.

Bonus: Many VCs listed the founders, investors and companies that are taking the lead in consumer hardware innovation.

A theory about the current IPO market


If you’re looking for insight into “why everything feels so damn silly this year” in the public markets, a post Alex wrote Thursday afternoon might offer some perspective.

As someone who pays close attention to late-stage venture markets, he’s identified eight factors that are pushing debuts for unicorns like Affirm and Poshmark into the stratosphere.

TL;DR? “Lots of demand, little supply, boom goes the price.”

Poshmark prices IPO above range as public markets continue to YOLO startups

Clothing resale marketplace Poshmark closed up more than 140% on its first trading day yesterday.

In Thursday’s edition of The Exchange, Alex noted that Poshmark boosted its valuation by selling 6.6 million shares at its IPO price, scooping up $277.2 million in the process.

Poshmark’s surge in trading is good news for its employees and stockholders, but it reflects poorly on “the venture-focused money people who we suppose know what they are talking about when it comes to equity in private companies,” he says.

Will startup valuations change given rising antitrust concerns?


This week, Visa announced it would drop its planned acquisition of Plaid after the U.S. Department of Justice filed suit to block it last fall.

Last week, Procter & Gamble called off its purchase of Billie, a women’s beauty products startup — in December, the U.S. Federal Trade Commission sued to block that deal, too.

Once upon a time, the U.S. government took an arm’s-length approach to enforcing antitrust laws, but the tide has turned, says Managing Editor Danny Crichton.

Going forward, “antitrust won’t kill acquisitions in general, but it could prevent the buyers with the highest reserve prices from entering the fray.”

Dear Sophie: What’s the new minimum salary required for H-1B visa applicants?


Dear Sophie:

I’m a grad student currently working on F-1 STEM OPT. The company I work for has indicated it will sponsor me for an H-1B visa this year.

I hear the random H-1B lottery will be replaced with a new system that selects H-1B candidates based on their salaries.

How will this new process work?

— Positive in Palo Alto

Venture capitalists react to Visa-Plaid deal meltdown


After news broke that Visa’s $5.3 billion purchase of API startup Plaid fell apart, Alex Wilhelm and Ron Miller interviewed several investors to get their reactions:

  • Anshu Sharma, co-founder and CEO, SkyflowAPI
  • Amy Cheetham, principal, Costanoa Ventures
  • Sheel Mohnot, co-founder, Better Tomorrow Ventures
  • Lucas Timberlake, partner, Fintech Ventures
  • Nico Berardi, founder and general partner, ANIMO Ventures
  • Allen Miller, VC, Oak HC/FT
  • Sri Muppidi, VC, Sierra Ventures
  • Christian Lassonde, VC, Impression Ventures

Plaid CEO touts new ‘clarity’ after failed Visa acquisition


Alex Wilhelm interviewed Plaid CEO Zach Perret after the Visa acquisition was called off to learn more about his mindset and the company’s short-term plans.

Perret, who noted that the last few years have been a “roller coaster,” said the Visa deal was the right decision at the time, but going it alone is “once again” Plaid’s best way forward.

2021: A SPAC odyssey

In Tuesday’s edition of The Exchange, Alex Wilhelm took a closer look at blank-check offerings for digital asset marketplace Bakkt and personal finance platform SoFi.

To create a detailed analysis of the investor presentations for both offerings, he tried to answer two questions:

  1. Are special purpose acquisition companies a path to public markets for “potentially promising companies that lacked obvious, near-term growth stories?”
  2. Given the number of unicorns and the limited number of companies that can IPO at any given time, “maybe SPACs would help close the liquidity gap?”

Flexible VC: A new model for startups targeting profitability

12 ‘flexible VCs’ who operate where equity meets revenue share


Growth-stage startups in search of funding have a new option: “flexible VC” investors.

An amalgam of revenue-based investment and traditional VC, investors who fall into this category let entrepreneurs “access immediate risk capital while preserving exit, growth trajectory and ownership optionality.”

In a comprehensive explainer, fund managers David Teten and Jamie Finney present different investment structures so founders can get a clear sense of how flexible VC compares to other venture capital models. In a follow-up post, they share a list of a dozen active investors who offer funding via these nontraditional routes.

These 5 VCs have high hopes for cannabis in 2021


For some consumers, “cannabis has always been essential,” writes Matt Burns, but once local governments allowed dispensaries to remain open during the pandemic, it signaled a shift in the regulatory environment and investors took notice.

Matt asked five VCs about where they think the industry is heading in 2021 and what advice they’re offering their portfolio companies:


GitLab oversaw a $195 million secondary sale that values the company at $6 billion

GitLab has confirmed with TechCrunch that it oversaw a $195 million secondary sale that values the company at $6 billion. CNBC broke the story earlier today.

The company’s impressive valuation comes after its most recent 2019 Series E in which it raised $268 million on a $2.75 billion valuation — an increase of $3.25 billion in under 18 months. Company co-founder and CEO Sid Sijbrandij believes the increase is due to his company’s progress adding functionality to the platform.

“We believe the increase in valuation over the past year reflects the progress of our complete DevOps platform towards realizing a greater share of the growing, multi-billion dollar software development market,” he told TechCrunch.

While the startup has raised over $434 million, this round involved buying employee stock options, a move that allows the company’s workers to cash in some of their equity prior to going public. CNBC reported that the firms buying the stock included Alta Park, HMI Capital, OMERS Growth Equity, TCV and Verition.

The next logical step would appear to be an IPO, something the company has never shied away from. In fact, it at one point listed November 18, 2020 as a target IPO date on the company wiki. While they didn’t quite make that goal, Sijbrandij still sees the company going public at some point. He’s just not being as specific as in the past, suggesting that the company has plenty of runway left from the last funding round and can go public when the timing is right.

“We continue to believe that being a public company is an integral part of realizing our mission. As a public company, GitLab would benefit from enhanced brand awareness, access to capital, shareholder liquidity, autonomy and transparency,” he said.

He added, “That said, we want to maximize the outcome by selecting an opportune time. Our most recent capital raise was in 2019 and contributed to an already healthy balance sheet. A strong balance sheet and business model enables us to select a period that works best for realizing our long-term goals.”

GitLab has published not only its IPO goals on its wiki, but its entire company philosophy, goals and OKRs for everyone to see. Sijbrandij told TechCrunch’s Alex Wilhelm at a TechCrunch Disrupt panel in September that he believes that transparency helps attract and keep employees. It doesn’t hurt that the company was and remains a fully remote organization, even pre-COVID.

“We started [this level of] transparency to connect with the wider community around GitLab, but it turned out to be super beneficial for attracting great talent as well,” Sijbrandij told Wilhelm in September.

The company, which launched in 2014, offers a DevOps platform to help move applications through the programming lifecycle.

Update: The original headline of this story has been changed from ‘GitLab raises $195M in secondary funding on $6 billion valuation.’



Twilio CEO Jeff Lawson says wisdom lies with your developers

Twilio CEO Jeff Lawson knows a thing or two about unleashing developers. His company has garnered a market cap of almost $60 billion by creating a set of tools to make it easy for programmers to insert a whole host of communications functionality into an application with a couple of lines of code. Given that background, perhaps it shouldn’t come as a surprise that Lawson has written a book called “Ask Your Developer,” which hit the stores this week.

Lawson’s basic philosophy in the book is that if you can build it, you should. In every company, a build-versus-buy calculus goes into every software decision. Lawson believes deeply that there is incredible power in building yourself instead of purchasing something off the shelf. By using components like the ones from his company, and many others delivering specialized types of functionality via API, you can build what your customers need instead of just buying what the vendors are giving you.

While Lawson recognizes this isn’t always possible, he says that by asking your developers, you can begin to learn when it makes sense to build and when it doesn’t. These discussions should stem from customer problems and companies should seek digital solutions with the input of the developer group.

Building great customer experiences

Lawson posits that you can build a better customer experience because you understand your customers so much more acutely than a generic vendor ever could. “Basically, what you see happening across nearly every industry is that the companies that are able to listen to their customers and hear what the customers need and then build really great digital products and experiences — well, they tend to win the hearts, minds and wallets of their customers,” Lawson told me in an interview about the book this week.


He says that this has caused a shift in how companies perceive IT departments. They have gone from cost centers that provision laptops and buy HR software to something more valuable, helping produce digital products that have a direct impact on the business’s bottom line.

He uses banking as an example in the book. It used to be you judged a bank by a set of criteria like how nice the lobby was, if the tellers were friendly and if they gave your kid a free lollipop. Today, that’s all changed and it’s all about the quality of the mobile app.

“Nowadays your bank is a mobile app and you like your bank if the software is fast, if it is bug free and if they regularly update it with new features and functionality that makes your life better [ … ]. And that same transformation has been happening in nearly every industry and so when you think about it, you can’t buy differentiation if every bank just bought the same mobile app from some vendor and just off the shelf deployed it,” he said.


MySQL 8.0.22: SHOW PROCESSLIST Version 2 – Now Available From PERFORMANCE_SCHEMA


The “SHOW PROCESSLIST” command is well known and very useful for MySQL DBAs; it helps you understand ongoing thread activities and their current states. By default, the “show processlist” output is collected from the thread manager, which requires holding a global mutex. From MySQL 8.0.22, we have an alternative way to get the process details from the PERFORMANCE_SCHEMA, which doesn’t need the global mutex.

Note: We also have the non-blocking SYS schema views “processlist” and “x$processlist”, which provide more complete information than the SHOW PROCESSLIST statement and the INFORMATION_SCHEMA.PROCESSLIST and PERFORMANCE_SCHEMA.PROCESSLIST tables. However, these views cannot be used as the source for the “SHOW PROCESSLIST” command.
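For comparison, here is a sketch of querying the non-blocking sys view on a running server (column names come from the sys schema; only a few of its many columns are selected):

```sql
-- sys.processlist adds columns that SHOW PROCESSLIST lacks, such as the
-- current statement, without blocking on the thread-manager mutex.
SELECT conn_id, user, command, state, current_statement
FROM sys.processlist
WHERE command IS NOT NULL;
```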

In this blog, I am going to explain the complete details about the new processlist implementation using PERFORMANCE_SCHEMA.

“SHOW PROCESSLIST” Using Thread Manager (default)

  • This is the default method.
  • The default “show processlist” implementation iterates across active threads from within the thread manager while holding a global mutex.
  • Holding the mutex negatively impacts performance, particularly on busy systems.
  • The INFORMATION_SCHEMA.PROCESSLIST is one of the sources of process information. This will also use the thread manager to collect the metrics.
  • By default, “mysqladmin processlist” also uses the thread manager to get the details.

The following statements are equivalent:

SHOW FULL PROCESSLIST;
mysqladmin processlist --verbose

“SHOW PROCESSLIST” Using Performance Schema

  • Available from MySQL 8.0.22.
  • It collects the thread details from the PERFORMANCE_SCHEMA.PROCESSLIST table.
  • The global mutex is not needed.
  • Helps to avoid the performance impact when querying “show processlist”, particularly on busy systems.
  • The implementation also applies to “mysqladmin processlist”.

The following statements are equivalent:

SHOW FULL PROCESSLIST;
mysqladmin processlist --verbose


mysql> desc performance_schema.processlist;
+---------+-----------------+------+-----+---------+-------+
| Field   | Type            | Null | Key | Default | Extra |
+---------+-----------------+------+-----+---------+-------+
| ID      | bigint unsigned | NO   | PRI | NULL    |       |
| USER    | varchar(32)     | YES  |     | NULL    |       |
| HOST    | varchar(255)    | YES  |     | NULL    |       |
| DB      | varchar(64)     | YES  |     | NULL    |       |
| COMMAND | varchar(16)     | YES  |     | NULL    |       |
| TIME    | bigint          | YES  |     | NULL    |       |
| STATE   | varchar(64)     | YES  |     | NULL    |       |
| INFO    | longtext        | YES  |     | NULL    |       |
+---------+-----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)

mysql> desc information_schema.processlist;
+---------+-----------------+------+-----+---------+-------+
| Field   | Type            | Null | Key | Default | Extra |
+---------+-----------------+------+-----+---------+-------+
| ID      | bigint unsigned | NO   |     |         |       |
| USER    | varchar(32)     | NO   |     |         |       |
| HOST    | varchar(261)    | NO   |     |         |       |
| DB      | varchar(64)     | YES  |     |         |       |
| COMMAND | varchar(16)     | NO   |     |         |       |
| TIME    | int             | NO   |     |         |       |
| STATE   | varchar(64)     | YES  |     |         |       |
| INFO    | varchar(65535)  | YES  |     |         |       |
+---------+-----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)


  • Make sure the PERFORMANCE_SCHEMA is enabled at the server startup.
  • Make sure MySQL was configured and built with the thread instrumentations enabled.

MySQL provides a variable “performance_schema_show_processlist” to enable this feature. Once we enable the variable, the “SHOW PROCESSLIST” command will start to show the details from the “PERFORMANCE_SCHEMA.PROCESSLIST” table instead of the thread manager.

The variable has global scope and is dynamic, so no MySQL server restart is needed.

mysql> show global variables like 'performance_schema_show_processlist';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| performance_schema_show_processlist | OFF   |
+-------------------------------------+-------+
1 row in set (0.08 sec)

mysql> set global performance_schema_show_processlist='ON';
Query OK, 0 rows affected (0.00 sec)

mysql> \r
Connection id:    23
Current database: *** NONE ***

mysql> show global variables like 'performance_schema_show_processlist';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| performance_schema_show_processlist | ON    |
+-------------------------------------+-------+
1 row in set (0.00 sec)



mysql> show processlist\G
*************************** 1. row ***************************
     Id: 5
   User: event_scheduler
   Host: localhost
     db: NULL
Command: Daemon
   Time: 2461
  State: Waiting on empty queue
   Info: NULL
*************************** 2. row ***************************
     Id: 23
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: executing
   Info: show processlist
2 rows in set (0.00 sec)

You can also query the “performance_schema.processlist” table to get the thread information.

mysql> select * from performance_schema.processlist\G
*************************** 1. row ***************************
     ID: 5
   USER: event_scheduler
   HOST: localhost
     DB: NULL
COMMAND: Daemon
   TIME: 2448
  STATE: Waiting on empty queue
   INFO: NULL
*************************** 2. row ***************************
     ID: 23
   USER: root
   HOST: localhost
     DB: NULL
COMMAND: Query
   TIME: 0
  STATE: executing
   INFO: select * from performance_schema.processlist
2 rows in set (0.00 sec)

“mysqladmin processlist” output from “performance_schema”:

[root@mysql8 vagrant]# mysqladmin processlist
+----+-----------------+-----------+----+---------+------+------------------------+------------------+
| Id | User            | Host      | db | Command | Time | State                  | Info             |
+----+-----------------+-----------+----+---------+------+------------------------+------------------+
| 5  | event_scheduler | localhost |    | Daemon  | 2631 | Waiting on empty queue |                  |
| 24 | root            | localhost |    | Query   | 0    | executing              | show processlist |
+----+-----------------+-----------+----+---------+------+------------------------+------------------+


  • To avoid having some threads ignored, leave the “performance_schema_max_thread_instances” and “performance_schema_max_thread_classes” system variables at their default value of -1, which autosizes them at server startup.
  • To avoid having some STATE column values be empty, leave the “performance_schema_max_stage_classes” system variable at its default of -1 as well.
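Putting the pieces together, a minimal my.cnf sketch (variable names taken from this post; the values shown are the defaults discussed above, except for the feature toggle itself):

```
[mysqld]
performance_schema = ON
performance_schema_show_processlist = ON
# Leave these at -1 (the default) so they are autosized at startup
# and no threads or stages are missed:
performance_schema_max_thread_instances = -1
performance_schema_max_thread_classes = -1
performance_schema_max_stage_classes = -1
```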

Rapid growth in 2020 reveals OKR software market’s untapped potential

Last year, a number of startups building OKR-focused software raised lots of venture capital, drawing TechCrunch’s attention.

Why is everyone making software that measures objectives and key results? we wondered with tongue in cheek. After all, how big could the OKR software market really be?

It’s a subniche of corporate planning tools! In a world where every company already pays for Google or Microsoft’s productivity suite, and some big software companies offer similar planning support, how substantial could demand prove for pure-play OKR startups?

The Exchange explores startups, markets and money. Read it every morning on Extra Crunch, or get The Exchange newsletter every Saturday.

Pretty substantial, we’re finding out. After OKR-focused Gtmhub announced its $30 million Series B the other day, The Exchange reached out to a number of OKR-focused startups we’ve previously covered and asked about their 2020 growth.

Gtmhub had released new growth metrics along with its funding news, plus we had historical growth data from some other players in the space. So let’s peek at new and historical numbers from Gtmhub, Perdoo, WorkBoard, Ally.io, Koan and WeekDone.

Growth (and some caveats)

A startup growing 400% in a year from a $50,000 ARR base is not impressive. It would be much more impressive to grow 200% from $1 million ARR, or 150% from $5 million.

So, percentage growth is only so good, as metrics go. But it’s also one that private companies are more likely to share than hard numbers, as the market has taught startups that sharing real data is akin to drowning themselves. Alas.

As we view the following, bear in mind that a higher percentage growth number alone does not indicate that a company added more net ARR than another; it could be growing faster from a smaller base. And some companies in the mix did not share ARR growth, but instead disclosed other bits of data. We got what we could.


Gtmhub:

  • 400% ARR growth, 2019.
  • 300% ARR growth, 2020.
  • More: The company has seen strong ACV growth, and the strong gross margins it reported in 2019 held up in 2020, it said.
  • TechCrunch coverage


Perdoo:

  • 240% paid customer growth, 2020.
  • 340% user base growth, 2020.
  • Given strong market demand, a company representative told The Exchange that Perdoo had to restrict its free tier to 10 users.
  • TechCrunch coverage



How to Store MySQL Audit Logs in MongoDB in a Maintenance-Free Setup


I was once helping one of our customers load MySQL audit logs into a MySQL database and analyze them. But I immediately thought: “Hey, this is not the most efficient solution! MySQL, or a typical RDBMS in general, was not really meant to store logs after all.”

So, I decided to explore an alternative – which seemed more sensible to me – and use MongoDB as the storage for logs, for three main reasons:

  • schema-less nature fits well to the audit log nature, where different types of events may use different fields
  • speaks JSON natively and the audit plugin can use JSON format
  • has capped collections feature, which allows avoiding additional maintenance overhead

Just to mention, audit logging is available in MySQL Enterprise Edition but a similar, yet free, solution, is available in Percona Server for MySQL. In both cases, it works by installing the audit log plugin.
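To illustrate why the JSON format maps so cleanly onto documents, here is a small sketch: one audit-plugin log line parsed into an object ready for insertion (the record shape mirrors the audit_record documents shown later in this post; the values are illustrative):

```javascript
// One audit-plugin log line in JSON format (illustrative values).
const raw = '{"audit_record":{"name":"Connect","connection_id":"9",' +
            '"status":0,"user":"root","host":"localhost"}}';

// JSON.parse() yields an object that can be inserted into MongoDB as-is;
// records of different event types can simply carry different fields.
const doc = JSON.parse(raw);
console.log(doc.audit_record.name);   // Connect
console.log(doc.audit_record.status); // 0
```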

Ad Hoc Import

The simplest scenario is to just set the audit log format to JSON:

audit_log_format = JSON

And as soon as it collects some data, import the log file into MongoDB collection via the mongoimport command, like this:

# mongoimport --username percona --password P3rc0n4 --host --port 27017 --db auditlogs --collection audit1 --file /var/lib/mysql/audit.log
2020-12-31T16:24:43.782+0000 connected to:
2020-12-31T16:24:44.316+0000 imported 25462 documents

mongo > db.audit1.countDocuments({})

Of course, this works, but I prefer an automated solution, so I looked at available options for live-streaming the logs.


The first thing that looked useful was the ability to send the audit log directly to syslog instead of a file. Knowing that both rsyslog and syslog-ng have MongoDB output modules, it felt like a very easy approach. So I installed the rsyslog-mongodb module package on my test Ubuntu VM running Percona Server for MySQL, and configured the audit log with:

audit_log_handler = syslog
audit_log_format = JSON

Rsyslog (version 8.2) example configuration:

# cat /etc/rsyslog.d/49-ship-syslog.conf
module(load="ommongodb")
action(type="ommongodb" server="localhost"
       db="auditlogs" collection="mysql_node1_log")

This worked, however, inserted documents looked like this:

mongo > db.mysql_node1_log.findOne()
{
  "_id" : ObjectId("5fece941f17f487c7d1d158b"),
  "msg" : " {\"audit_record\":{\"name\":\"Connect\",\"record\":\"7_1970-01-01T00:00:00\",\"timestamp\":\"2020-12-30T20:55:29Z\",\"connection_id\":\"9\",\"status\":0,\"user\":\"root\",\"priv_user\":\"root\",\"os_login\":\"root\",\"proxy_user\":\"\",\"host\":\"localhost\",\"ip\":\"\",\"db\":\"\"}}"
}

Basically, because syslog escapes the double quote symbols, the whole audit record appears as a single string inside the MongoDB collection, instead of as a JSON object. No matter what I tried, such as custom templates and property values in rsyslog, I could not disable the escaping. Therefore, although feeding MongoDB with audit logs works this way, it becomes pretty useless when it comes to analyzing the logs later. The same issue applies to syslog-ng and the syslog-ng-mod-mongodb module. And since MongoDB does not offer before-insert triggers, I could not easily “fix” the inserted data on the fly.
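If you are already stuck with such escaped records, they are at least recoverable after the fact. As a hypothetical one-off repair sketch (the "msg" field name matches the document shown above; everything else is illustrative):

```javascript
// A document as rsyslog stored it: the audit record is one escaped string
// under "msg" (illustrative copy of the real shape shown above).
const stored = { msg: ' {"audit_record":{"name":"Connect","status":0}}' };

// The inner string is itself valid JSON, so a single JSON.parse() on the
// trimmed "msg" field recovers the structured record.
const fixed = JSON.parse(stored.msg.trim());
console.log(fixed.audit_record.name); // Connect
```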

Fluentd For The Rescue!

This forced me to look for alternative solutions. One of them would be to use a FIFO file: tail the audit log continuously to feed the FIFO, then read from it to insert the logs into MongoDB. I wanted a more robust way, though, and decided to try Fluentd instead. It was created as a versatile log collection machine, highly flexible, prepared to work with many different applications out of the box, but most importantly, it is an open source project and speaks JSON natively. Making it do the job I wanted turned out to be easier than I expected.

Here is what I did:

  • Installed the Fluentd package (I chose td-agent variant here for an even easier user experience)
  • Installed MongoDB plugin for Fluentd with (don’t use the usual ‘gem install’ here):
td-agent-gem install fluent-plugin-mongo

  • Configured audit log as a source and output directive for MongoDB:
# cat /etc/td-agent/td-agent.conf
<source>
  @type tail
  path /var/lib/mysql/audit.log
  pos_file /var/log/td-agent/audit.access_log.pos
  <parse>
    @type json
  </parse>
  tag mongo.audit.log
</source>

<match mongo.audit.log>
  @type mongo
  database auditlogs #(required)
  collection audit_log #(optional; default="untagged")
  capped
  capped_size 100m
  host localhost #(optional; default="localhost")
  port 27017 #(optional; default=27017)
  user percona
  password P3rc0n4
  <buffer>
    flush_interval 1s
  </buffer>
</match>

  • Added the user used by Fluentd to mysql group to allow it to read from the audit log:
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent)
# usermod -a -G mysql td-agent
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent),120(mysql)

  • Reconfigured the audit log plugin to write to a file again, so Fluentd can tail it:
audit_log_handler = file
audit_log_format = JSON
audit_log_file = audit.log
audit_log_rotate_on_size = 10M
audit_log_rotations = 3

  • Restarted both services to apply changes:
# systemctl restart mysql
# systemctl restart td-agent

  • Checked the Fluentd log to see if it reads the audit log as expected, also for when Percona Server for MySQL rotates it:
# tail -f /var/log/td-agent/td-agent.log
2020-12-31 02:41:39 +0000 [info]: adding match pattern="mongo.audit.log" type="mongo"
2020-12-31 02:41:40 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
2020-12-31 02:52:14 +0000 [info]: #0 detected rotation of /var/lib/mysql/audit.log; waiting 5 seconds
2020-12-31 02:52:14 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log

  • Ran sysbench against MySQL instance and verified the new collection in MongoDB gets updated:
mongo > db.audit_log.countDocuments({})

mongo > db.audit_log.stats()
{
  "ns" : "auditlogs.audit_log",
  "size" : 104857293,
  "count" : 281245,
  "avgObjSize" : 372,
  "storageSize" : 26357760,
  "capped" : true,
  "max" : -1,
  "maxSize" : 104857600,

Yay, it works like a charm! Not only are the audit logs rotated automatically on the Percona Server for MySQL side, but the capped collection on the MongoDB side enforces its size limit as well, so I am safe when it comes to disk space on both hosts!

Here, there is a little caveat: if for some reason you drop the destination collection manually on MongoDB, incoming inserts will re-create it without the capped setting! Therefore, either let the collection be created by Fluentd on its service startup or create it manually with the capped setting, and don’t drop it later.
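For the manual route, the capped collection can be created up front in the mongo shell; as a sketch, with the size matching the 100m cap from the Fluentd config above:

```
mongo > use auditlogs
mongo > db.createCollection("audit_log", { capped: true, size: 104857600 })
```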

Now, we can try some example aggregations to get some useful audit stats:

mongo > db.audit_log.aggregate([ { $group: { _id: {name: "$audit_record.name", command: "$audit_record.command_class"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "name" : "Execute", "command" : "error" }, "count" : 267086 }
{ "_id" : { "name" : "Query", "command" : "begin" }, "count" : 14054 }
{ "_id" : { "name" : "Close stmt", "command" : "error" }, "count" : 76 }
{ "_id" : { "name" : "Query", "command" : "show_variables" }, "count" : 7 }
{ "_id" : { "name" : "Query", "command" : "select" }, "count" : 6 }
{ "_id" : { "name" : "Quit" }, "count" : 5 }
{ "_id" : { "name" : "Query", "command" : "show_tables" }, "count" : 4 }
{ "_id" : { "name" : "Init DB", "command" : "error" }, "count" : 2 }
{ "_id" : { "name" : "Field List", "command" : "show_fields" }, "count" : 2 }
{ "_id" : { "name" : "Query", "command" : "show_databases" }, "count" : 2 }
{ "_id" : { "name" : "Connect" }, "count" : 1 }

mongo > db.audit_log.aggregate([ { $match: { "audit_record.status": {$gt: 0} } }, { $group: { _id: {command_class: "$audit_record.command_class", status: "$audit_record.status"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "command_class" : "error", "status" : 1049 }, "count" : 2 }
{ "_id" : { "command_class" : "show_tables", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1050 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1051 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1046 }, "count" : 1 }
{ "_id" : { "command_class" : "create_table", "status" : 1113 }, "count" : 1 }




Thinking About Deploying MongoDB? Read This First.

Are you thinking about deploying MongoDB? Is it the right choice for you?

Choosing a database is an important step when designing an application. A wrong choice can have a negative impact on your organization in terms of development and maintenance, and it can also lead to poor performance.

Generally speaking, any kind of database can manage any kind of workload, but every database has specific workloads that fit it better than others.

You shouldn't consider MongoDB just because it's cool and a lot of companies are already using it. You need to understand whether it fits your workload and your expectations. So, choose the right tool for the job.

In this article, we are going to discuss a few things you need to know before choosing and deploying MongoDB.

MongoDB Manages JSON-style Documents and Developers Appreciate That

The basic component of a MongoDB database is a JSON-style document. Technically it is BSON, which adds some extra datatypes (e.g., datetime) that aren't valid JSON.

We can consider a document the equivalent of a record in a relational database. Documents are grouped into a collection, the same concept as a relational table.

JSON-style documents are widely used by programmers worldwide to implement web services and applications and to exchange data. Having a database that can manage that data natively is really effective.

MongoDB is often appreciated by developers because they can start using it without specific knowledge of database administration and design, and without studying a complex query language. Indeed, the MongoDB query language itself is represented by JSON documents.

Developers can create, save, retrieve, and update their JSON-style documents with ease. Great! This usually leads to a significant reduction in development time.
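A minimal sketch of these operations in the mongo shell (the `users` collection and its fields are invented for illustration):

```javascript
// Create: insert a JSON-style document
db.users.insertOne({ name: "Alice", email: "alice@example.com", age: 30 })

// Read: the query itself is a JSON document
db.users.find({ age: { $gte: 18 } })

// Update: modify fields in place
db.users.updateOne({ name: "Alice" }, { $set: { age: 31 } })

// Delete
db.users.deleteOne({ name: "Alice" })
```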

MongoDB is Schemaless

Are you familiar with relational databases? Surely you are, since relational databases have been used and taught at school and university for a long time. They are still the most widely used databases in the market today.

You know that a relational schema needs a predefined, fixed structure for its tables. Any time you add or change a column, you need to run a DDL query, and additional time is necessary to change your application code to handle the new structure. In the case of a massive change that requires multiple column changes and/or the creation of new tables, the application changes could be extensive. MongoDB's lack of schema enforcement means none of that is required. You just insert a document into a collection, and that's all. Let's suppose you have a collection with user data. If at some point you need to add, for example, a new "date_of_birth" field, you simply start inserting the new JSON documents with the additional field. That's all. No need to change anything in the schema.
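Sticking with that user-data example, the whole "migration" is just inserting documents that carry the new field (names are illustrative):

```javascript
// Documents inserted before the change
db.users.insertOne({ name: "Bob", email: "bob@example.com" })

// New documents simply carry the extra field; no DDL, no migration
db.users.insertOne({
  name: "Carol",
  email: "carol@example.com",
  date_of_birth: ISODate("1990-05-17")
})

// Old and new documents coexist in the same collection
db.users.find({ date_of_birth: { $exists: true } })
```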

You can even insert completely different JSON documents, representing different entities, into the same collection. This is technically feasible, but not recommended.

MongoDB greatly shortens the application development cycle for a non-technical reason as well: it removes the need to coordinate a schema change migration project with the DBA team. There is no need to wait for the DBA team to run a QA dress rehearsal and then the production release (with rollback plans) that, as often as not, requires some production downtime.

MongoDB Has No Foreign Keys, Stored Procedures, or Triggers. Joins Are Supported, but Untypical.

A relational database design relies on SQL queries that join multiple tables on specific fields. It may also require foreign keys to assure the consistency of the data and to run automatic changes on semantically connected fields.

What about stored procedures? They can be useful for embedding some application logic into the database, to simplify certain tasks or to improve security.

And what about triggers? They are useful to automatically “trigger” changes on the data based on specific events, like adding/changing/deleting a row. They help to manage the consistency of the data and, in some cases, to simplify the application code.

Well, none of them is available in MongoDB. So, be aware of that.

Note: to be honest, there is an aggregation pipeline stage, $lookup, that can implement the equivalent of a LEFT OUTER JOIN, but this is the only exception.

How to Survive Without JOINs?

JOINs must be managed in your application code. If you need to join two collections, you read the first one, select the join field, use it to query the second collection, and so on. This seems expensive in terms of application development, and it can also lead to more queries being executed. Indeed it is, but the good news is that in many cases you don't have to manage joins at all.
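A sketch of such an application-side join in the mongo shell (the `orders` and `customers` collections and their fields are hypothetical):

```javascript
// "Join" orders to their customers by hand: read the first collection,
// collect the join keys, then query the second collection with them.
var customerIds = db.orders.find({ status: "open" })
                           .map(function (order) { return order.customer_id; });

var customers = db.customers.find({ _id: { $in: customerIds } }).toArray();
```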

Remember that MongoDB is a schemaless database; it doesn't require normalization. If you design your collections properly, you can embed and duplicate data in a single collection without the need to create an additional one. This way, you won't need to run any join because all the data you need is already in a single collection.

Foreign keys are not available, but as long as you can embed related documents in the same collection, you don't really need them.

Stored procedures can easily be implemented as external scripts written in your preferred language. Triggers can be implemented externally the same way, with the help of the Change Stream API connected to a collection.
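For example, trigger-like behavior can be sketched with a change stream in the mongo shell (the collection name and the reaction are invented; note that change streams require a replica set or sharded cluster, which ties in with the next section):

```javascript
// React to every insert on a collection, roughly what an
// AFTER INSERT trigger would do in a relational database.
var stream = db.users.watch([ { $match: { operationType: "insert" } } ]);

while (stream.hasNext()) {
  var event = stream.next();
  // the application-defined reaction goes here
  print("new document: " + JSON.stringify(event.fullDocument));
}
```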

If you have a lot of collections with referenced fields, you have to implement a lot of joins in your code, or you have to do a lot of checks to assure consistency. This is possible, but at a higher development cost. MongoDB could be the wrong choice in such a case.

MongoDB Replication and Sharding Are Easy to Deploy

MongoDB was not designed as a standalone application; it was designed to be a piece of a larger puzzle. A mongod server can work together with other mongod instances to implement replication and sharding efficiently, without the need for any additional third-party tool.

A Replica Set is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability by design. With caveats regarding potentially stale data, you also get read scalability for free. It should be the basis for all production deployments.
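As a sketch, initiating a three-member Replica Set takes a single command, assuming three mongod instances are already running with `--replSet rs0` (hostnames are placeholders):

```javascript
// Run once, connected to one of the members
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1.example.com:27017" },
    { _id: 1, host: "mongo2.example.com:27017" },
    { _id: 2, host: "mongo3.example.com:27017" }
  ]
})

// Check that a primary has been elected
rs.status()
```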

A Sharded Cluster is deployed as a group of several Replica Sets with the capability to split and distribute the data evenly across them. A Sharded Cluster provides write scalability in addition to redundancy, high availability, and read scalability. The sharding topology is suitable for deploying very large data sets, and the number of shards you can add is, in theory, unlimited.

Both topologies can be upgraded at any time by adding more servers and shards. More importantly, no application changes are required, since each topology is completely transparent from the application's perspective.

Finally, deploying these topologies is straightforward. You need to spend some time in the beginning to understand a few basic concepts, but then, in a matter of hours, you can deploy even a very large sharded cluster. With several servers, instead of doing everything manually, you can automate a lot of the work using Ansible playbooks or other similar tools.

Further readings:

Deploy a MongoDB Replica Set with Transport Encryption (Part 1)

MongoDB Sharding 101 Webinar

MongoDB Has Indexes and They Are Really Important

MongoDB allows you to create indexes on the JSON documents' fields. Indexes are used the same way as in a relational database: they help solve queries faster and decrease the usage of machine resources such as memory, CPU time, and disk IOPS.

You should create all the indexes that will help any of the regularly executed queries, updates, or deletes from your application.

MongoDB has really advanced indexing capabilities. It provides TTL indexes, geospatial indexes, indexes on array elements, and partial and sparse indexes. If you need more details about the available index types, you can take a look at the following articles:

MongoDB Index Types and MongoDB explain() (part 1)

Using Partial and Sparse Indexes in MongoDB

Create all the indexes you need for your collections. They will help you a lot to improve the overall performance of your database.
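A few of the index types mentioned above, sketched in the mongo shell (collection and field names are illustrative):

```javascript
// Plain index on a document field
db.users.createIndex({ email: 1 })

// Compound index
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// TTL index: documents expire automatically after 24 hours
db.sessions.createIndex({ created_at: 1 }, { expireAfterSeconds: 86400 })

// Partial index: only index the documents you actually query
db.orders.createIndex({ customer_id: 1 },
                      { partialFilterExpression: { status: "open" } })
```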

MongoDB is Memory Intensive

MongoDB is memory intensive; it needs a lot of it. This is true of many other databases as well. Memory is, most of the time, the most important resource.

MongoDB uses RAM for caching the most frequently and recently accessed data and indexes. The larger this cache, the better the overall performance, because MongoDB will be able to retrieve a lot of data faster. Also, by default, writes are only committed to memory before client confirmation is returned. Writes to disk are done asynchronously: first to the journal file (typically within 50ms), and later into the normal data files (once per minute).

The default storage engine used by MongoDB is WiredTiger. In the past there was MMAPv1, but it is no longer available in recent versions. The WiredTiger storage engine uses a large memory cache (the WiredTiger Cache) for caching data and indexes.

Besides the WiredTiger Cache, MongoDB relies on the OS filesystem cache for accessing disk pages. This is another important optimization, and significant memory may be required for it as well.

In addition, MongoDB needs memory for other things, like client connections, in-memory sorts, temporary data for aggregation pipelines, and other minor items.

In the end, be prepared to provide enough memory to MongoDB.

But how much memory do you need? The rule of thumb is to evaluate the size of the "working set".

The "working set" is the amount of data most frequently requested by your application. Usually an application needs a limited amount of data; it doesn't read the entire data set during normal operations. For example, in the case of time-series data, you most probably read only the last few hours' or days' entries, and only on a few occasions do you need to read older data. In such a case, your working set is the amount of memory that can hold just those few days of data.

Let's suppose your data set is 100GB and you evaluate your working set to be around 20% of it; then you need to provide at least 20GB for the WiredTiger Cache.

Since MongoDB by default uses 50% of the RAM for the WiredTiger Cache (we usually suggest not increasing it significantly), you should provide around 40GB of memory in total for your server.
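The same arithmetic, written out as a tiny helper (plain JavaScript; the 50% figure is the default cache ratio mentioned above):

```javascript
// Rough total-RAM estimate from an estimated working set, as described above.
function suggestedRamGB(dataSetGB, workingSetRatio, cacheRatio) {
  // The WiredTiger Cache should be able to hold the working set...
  const cacheGB = dataSetGB * workingSetRatio;
  // ...and the cache is only a fraction of the server's total RAM.
  return cacheGB / cacheRatio;
}

// 100GB data set, ~20% working set, default 50% of RAM for the cache
console.log(suggestedRamGB(100, 0.20, 0.50)); // prints 40
```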

Every case is different, and sometimes it can be difficult to evaluate the working set size correctly. Anyway, the main recommendation is to spend a significant part of your budget on providing as much memory as you can. It will certainly be beneficial for MongoDB.

What Are the Suitable Use Cases for MongoDB?

Actually, a lot. I have seen MongoDB deployed in a wide variety of environments.

For example, MongoDB is suitable for:

  • events logging
  • content management
  • gaming applications
  • payment applications
  • real-time analytics
  • Internet of Things applications
  • content caching
  • time-series data applications

And many others.

We can say that you can use MongoDB for basically everything; it is a general-purpose database. The key point is how you use it.

For example, if you plan to use MongoDB the same way as a relational database, with normalized data, a lot of collections, and a myriad of joins managed by the application, then MongoDB is certainly not the right choice. Use a relational database.

The best way to use MongoDB is to adhere to a few best practices and model your collections keeping in mind some basic rules, like embedding documents instead of creating multiple referenced collections.

Percona Server for MongoDB: The Enterprise-Class Open Source Alternative

Percona develops and maintains its own open source version of MongoDB: Percona Server for MongoDB (PSMDB).

PSMDB is a drop-in replacement for MongoDB Community and it is 100% compatible. The great advantage provided by PSMDB is that you can get enterprise-class features for free, like:

  • Encryption at rest
  • Audit logging
  • LDAP authentication
  • LDAP authorization
  • Log redaction
  • Kerberos authentication
  • Hot backup
  • In-memory storage engine

Without PSMDB, all these advanced features are available only with a MongoDB Enterprise subscription.

Please take a look at the following links for more details about PSMDB:

Percona Server for MongoDB Feature Comparison

Percona Server for MongoDB

Remember you can get in touch with Percona at any time for any details or for getting help.


Let's have a look at the following checklist of the most important things you need to evaluate before choosing MongoDB as the backend database for your application. A colored flag indicates how good a fit MongoDB is: red means it's not a good choice, orange means it could be a good choice but with some limitations or potential bottlenecks, and green means it's a very good fit.

Your applications primarily deal with JSON documents (green)
Your data has unpredictable and frequent schema changes over time (green)
You have several collections with a lot of external references for assuring consistency, and the majority of the queries need joins (red)
You need to replicate the stored procedures and triggers you have in your relational database (orange)
You need HA and read scalability (green)
You need to scale your data to a very large size (green)
You need to scale because of a huge amount of writes (green)


And finally, remember the following:

Take a look at Percona Server for MongoDB 


Harness snags $85M Series C on $1.7B valuation as revenue grows 3x

Harness, the startup that wants to create a suite of engineering tools to give every company the kind of technological reach that the biggest companies have, announced an $85 million Series C today on a $1.7 billion valuation.

Today’s round comes after 2019’s $60 million Series B, which had a $500 million valuation, showing a company rapidly increasing in value. For a company that launched just three years ago, this is a fairly remarkable trajectory.

Alkeon Capital led the round with help from new investors Battery Ventures, Citi Ventures, Norwest Venture Partners, Sorenson Capital and Thomvest Ventures. The startup also revealed a previously unannounced $30 million B-1 round raised after the $60 million round, bringing the total raised to date to $195 million.

Company founder and CEO Jyoti Bansal previously founded AppDynamics, which he sold to Cisco in 2017 for $3.7 billion. With his track record, investors came looking for him this round. It didn’t hurt that revenue grew almost 3x last year.

“The business is doing very well, so the investor community has been proactively reaching out and trying to invest in us. We were not actually planning to raise a round until later this year. We had enough capital to get through that, but there were a lot of people wanting to invest,” Bansal told me.

In fact, he said there is so much investor interest that he could have raised twice as much, but didn’t feel a need to take on that much capital at this time. “Overall, the investor community sees the value in developer tools and the DevOps market. There are so many big public companies now in that space that have gone out in the last three to five years and that has definitely created even more validation of this space,” he said.

Bansal says that he started the company with the goal of making every company as good as Google or Facebook when it comes to engineering efficiency. Since most companies lack the engineering resources of these large companies, that’s a tall task, but one he thinks he can solve through software.

The company started by building a continuous delivery module; a cloud cost-efficiency module followed. Last year the company bought open-source continuous integration company Drone.io and is now building it into the platform, with the integration currently in beta. Additional modules are on the product roadmap for this year, according to Bansal.

As the company continued to grow revenue and build out the platform in 2020, it also added a slew of new employees, growing from 200 to 300 during the pandemic. Bansal says that he has plans to add another 200 by the end of this year. Harness has a reputation of being a good place to work, recently landing on Glassdoor’s best companies list.

As an experienced entrepreneur, Bansal takes building a diverse company with a welcoming culture very seriously. “Yes, you have to provide equal opportunity and make sure that you are open to hiring people from diverse backgrounds, but you have to be more proactive about it in the sense that you have to make sure that your company environment and company culture feels very welcoming to everyone,” he said.

It’s been a difficult time building a company during the pandemic, adding so many new employees, and finding a way to make everyone feel welcome and included. Bansal says he has actually seen productivity increase during the pandemic, but now has to guard against employee burnout.

He says that people didn’t know how to draw boundaries when working at home. One thing he did was introduce a program to give everyone one Friday a month off to recharge. The company also recently announced it would be a “work from anywhere” company post-COVID, but Bansal still plans on having regional offices where people can meet when needed.


Webinar January 28: Tuning PostgreSQL for High Performance and Optimization

PostgreSQL is one of the leading open-source databases but, out of the box, the default PostgreSQL configuration is not tuned for any particular workload; it is set so that even a system with minimal resources can run it. As a result, PostgreSQL does not deliver optimum performance on high-performance machines, because it does not use all the available resources. PostgreSQL provides a system where you can tune your database according to your workload and machine specifications. In addition to PostgreSQL itself, we can also tune our Linux box so that the database workload runs optimally.

In this webinar on Tuning PostgreSQL for High Performance and Optimization, we will learn how to tune PostgreSQL and we’ll see the results of that tuning. We will also touch on tuning some Linux kernel parameters.

Please join Ibrar Ahmed, Percona Software Engineer, on January 28, 2021, at 1 pm EST as he presents his webinar “Tuning PostgreSQL for High Performance and Optimization.”

Register for Webinar

If you can’t attend, sign up anyway and we’ll send you the slides and recording afterward.


Germany’s Xentral nabs $20M led by Sequoia to help online-facing SMBs run back offices better

Small enterprises remain one of the most underserved segments of the business market, but the growth of cloud-based services (easier to buy, easier to provision) has helped change that in recent years. Today, one of the more promising startups out of Europe building software to help SMEs run online businesses is announcing funding to better tap into the opportunity to build these services, and to meet a growing demand from the SME segment.

Xentral, a German startup that develops enterprise resource planning software covering a variety of back-office functions for the average online small business, has picked up a Series A of $20 million.

The company’s platform today covers services like order and warehouse management, packaging, fulfillment, accounting and sales management, and the majority of its 1,000 customers are in Germany — they include the likes of direct-to-consumer brands like YFood, KoRo, the Nu Company and Flyeralarm.

But Benedikt Sauter, the co-founder and CEO of Xentral, said the ambition is to expand into the rest of Europe, and eventually other geographies, and to fold in more services to its ERP platform, such as a more powerful API to allow customers to integrate more services — for example in cases where a business might be selling on their own site, but also Amazon, eBay, social platforms and more — to bring their businesses to a wider market.

Mainly, he said, the startup wants “to build a better ecosystem to help our customers run their own businesses better.”

The funding is being led by Sequoia Capital, with Visionaires Club (a B2B-focused VC out of Berlin) also participating.

The deal is notable for being the prolific, high-profile VC’s first investment in Europe since officially opening for business in the region. (Sequoia has backed a number of startups in Europe before this, including Graphcore, Klarna, Tessian, Unity, UiPath, n8n and Evervault — but all of those deals were done from afar.)

Augsburg-based Xentral has been around as a startup since 2018, and “as a startup” is the operative phrase here.

Sauter and his co-founder Claudia Sauter (who is also his co-founder in life: she is his wife) built the early prototype for the service originally for themselves.

The pair were running a business of their own — a hardware company they founded in 2008, selling not nails, hammers and wood, but circuit boards they designed, along with other hardware to build computers and other connected objects. Around 2013, as the business was starting to pick up steam, they decided that they really needed better tools to manage everything at the backend so that they would have more time to build their actual products.

But Bene Sauter quickly discovered a problem in the process: smaller businesses may have Shopify and its various competitors to help manage e-commerce at the front end, but when it came to the many parts of the process at the back end, there really wasn't a single, easy solution (remember, this was eight years ago, when the Shopifys of the world had yet to expand into these kinds of tools). Being of a DIY and technical persuasion (Sauter had studied hardware engineering at university), he decided he'd try to build the tools that he wanted to use.

The Sauters used those tools for years, until without much outbound effort, they started to get some inbound interest from other online businesses to use the software, too. That led to the Sauters balancing both their own hardware business and selling the software on the side, until around 2017/2018 when they decided to wind down the hardware operation and focus on the software full time. And from then, Xentral was born. It now has, in addition to 1,000 customers, some 65 employees working on developing the platform.

The focus with Xentral is to have a platform that is easy to implement and use, regardless of what kind of SME you might be as long as you are selling online. But even so, Sauter pointed out that the other common thread is that you need at least one person at the business who champions and understands the value of ERP. “It’s really a mindset,” he said.

The challenge for Xentral in that regard will be to see how and if it can bring more businesses to the table and tap into the kinds of tools that it provides, at the same time that a number of other players also eye up the same market. (Others in the same general category of building ERP for small businesses include Sage, NetSuite and Acumatica.) ERP overall is forecast to become a $49.5 billion market by 2025.

Sequoia and its new partner in Europe, Luciana Lixandru — who is joining Xentral’s board along with Visionaries’ Robert Lacher — believe however that there remains a golden opportunity to build a new kind of provider from the ground up and out of Europe specifically to target the opportunity in that region.

“I see Xentral becoming the de facto platform for any SMEs to run their businesses online,” she said in an interview. “ERP sounds a bit scary especially because it makes one think of companies like SAP, long implementation cycles, and so on. But here it’s the opposite.” She describes Xentral as “very lean and easy to use, because you can start with one module and then add more. For SMEs it has to be super simple. I see this becoming like the Shopify for ERP.”
