May
05
2021
--

Cymulate nabs $45M to test and improve cybersecurity defenses via attack simulations

With cybercrime on course to be a $6 trillion problem this year, organizations are throwing ever more resources at the issue to avoid being a target. Now, a startup that’s built a platform to help them stress-test the investments that they have made into their security IT is announcing some funding on the back of strong demand from the market for its tools.

Cymulate, which lets organizations and their partners run machine-based attack simulations on their networks to find vulnerabilities and then automatically receive guidance on how to fix what is not working well enough, has picked up $45 million. The startup, co-headquartered in Israel and New York, will use the funding to continue investing in its platform and to ramp up its operations after doubling its revenues last year, on the back of a customer list that now numbers 300 large enterprises and mid-market companies, including the Euronext stock exchange network as well as service providers such as NTT and Telit.

London-based One Peak Partners is leading this Series C, with previous investors Susquehanna Growth Equity (SGE), Vertex Ventures Israel, Vertex Growth and Dell Technologies Capital also participating.

According to Eyal Wachsman, the CEO and co-founder, Cymulate’s technology has been built not just to improve an organization’s security, but to provide an automated, machine learning-based system for better understanding how to get the most out of the security investments that have already been made.

“Our vision is to be the largest cybersecurity ‘consulting firm’ without consultants,” he joked.

The valuation is not being disclosed, but as some measure of what is going on, David Klein, managing partner at One Peak, said in an interview that he expects Cymulate to hit a $1 billion valuation within two years at the rate it’s growing and bringing in revenue right now. The startup has now raised $71 million, so it’s likely the valuation is in the mid-hundreds of millions. (We’ll continue trying to get a better number to have a more specific data point here.)

Cymulate (pronounced “sigh-mulate,” like the “cy” in “cyber,” and a pun on “simulate”) is cloud-based but works across both cloud and on-premises environments. The idea is that it complements the work done by (human) security teams both inside and outside of an organization, as well as the security IT investments (in software or hardware) that they have already made.

“We do not replace — we bring back the power of the expert by validating security controls and checking whether everything is working correctly to optimize a company’s security posture,” Wachsman said. “Most of the time, we find our customers are using only 20% of the capabilities that they have. The main idea is that we have become a standard.”

The company’s tools are based in part on the MITRE ATT&CK framework, a knowledge base of threats, tactics and techniques used by many other cybersecurity services, including several that build continuous validation services competing with Cymulate. These include the likes of FireEye, Palo Alto Networks, Randori, Khosla-backed AttackIQ and many more.

Although Cymulate is optimized to help customers better use the security tools they already have, it is not meant to replace other security apps, Wachsman noted, even if a by-product might be buying fewer of those apps in the future.

“I believe my message every day when talking with security experts is to stop buying more security products,” he said in an interview. “They won’t help defend you from the next attack. You can use what you’ve already purchased as long as you configure it well.”

In his words, Cymulate acts as a “black box” on the network, where it integrates with security and other software (it can also work without integrating, but integrations allow for a deeper analysis). After running its simulations, it produces a map of the network and its threat profile, an executive summary of the situation that can be presented to management and a more technical rundown, which includes recommendations for mitigations and remediations.

Alongside validating and optimizing existing security apps and identifying vulnerabilities in the network, Cymulate has also built special tools for use cases that are particularly relevant to how businesses operate today. These include evaluating remote-working deployments, the state of a network following an M&A process, the security landscape of an organization that links up with third parties in supply chain arrangements, and how well an organization’s security architecture meets (or potentially conflicts with) privacy and other regulatory compliance requirements. It has also built a “purple team” deployment: where security teams do not have the resources to run separate “red teams” to stress-test something, blue teams at the organization can use Cymulate to build a machine learning-based “team” to do this.

The fact that Cymulate has built the infrastructure to run all of these processes speaks to the potential of what more it could build, especially as the threat landscape and how we do business both continue to evolve. Even as it is, though, the opportunity today is a massive one, with Gartner estimating that some $170 billion will be spent on information security by enterprises in 2022. That’s one reason why investors are here, too.

“The increasing pace of global cyber security attacks has resulted in a crisis of trust in the security posture of enterprises and a realization that security testing needs to be continuous as opposed to periodic, particularly in the context of an ever-changing IT infrastructure and rapidly evolving threats. Companies understand that implementing security solutions is not enough to guarantee protection against cyber threats and need to regain control,” said Klein, in a statement. “We expect Cymulate to grow very fast,” he told me more directly.

May
04
2021
--

Talking Drupal #293 – Automatic Updates

Today we talk with Drupal Association CTO Tim Lehnen and Acquia Principal Software Engineer Ted Bowman about the Automatic Updates Initiative.

www.talkingdrupal.com/293

Topics

  • Nic – new mic
  • John – Drupal Event Organizers Group
  • Stephen – Drutiny
  • Tim – Track days
  • Ted – Old sheet music cabinet into stereo/turntable cabinet
  • Mike – Developer Programs Engineer at Pantheon
  • What is the initiative about
  • The Update Framework (TUF)
  • Roles and key contributors
  • Who are automatic updates for
  • What gets updated
  • Key Challenges
  • Project roadmap

Resources

Strategic Initiatives: Automated Updates

Drutiny

Tim’s Youtube Page

The Update Framework

Project messaging in core initiative

Project Announce

Project messaging channel in core

Guests

Tim Lehnen  @TimLehnen  www.drupal.org/u/hestenet

Ted Bowman  @tedbow  tedbow.com

Hosts

Stephen Cross – www.stephencross.com @stephencross

John Picozzi – www.oomphinc.com @johnpicozzi

Nic Laflin – www.nLighteneddevelopment.com @nicxvan

Mike Miles – www.mike-miles.com @mikemiles86

May
04
2021
--

Inconsistent Voting in Percona XtraDB Cluster

AKA Cluster Error Voting…

What is Cluster Error Voting (CEV)?

“Cluster Error Voting is a new feature implemented by Alexey Yurchenko, and it is a protocol for nodes to decide how the cluster will react to problems in replication. When one or several nodes have an issue applying an incoming transaction(s) (e.g., suspected inconsistency), this new feature helps. In a 5-node cluster, if 2-nodes fail to apply the transaction, they get removed, and a DBA can go in to fix what went wrong so that the nodes can rejoin the cluster. (Seppo Jaakola)”

This feature was ported to Percona XtraDB Cluster (PXC) in version 8.0.21. As indicated above, it is about increasing the resilience of the cluster, especially when TWO nodes fail to operate and may drop from the cluster abruptly. The protocol is activated in a cluster with any number of nodes.

Before CEV, if a node had a problem/error during a transaction, the node having the issue would report the error in its own log and exit the cluster:

2021-04-23T15:18:38.568903Z 11 [ERROR] [MY-010584] [Repl] Slave SQL: Could not execute Write_rows event on table test.test_voting; Duplicate entry '21' for key 'test_voting.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 0, Error_code: MY-001062
2021-04-23T15:18:38.568976Z 11 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 121, seqno 16
2021-04-23T15:18:38.569717Z 11 [Note] [MY-000000] [Galera] Failed to apply write set: gtid: 224fddf7-a43b-11eb-84d5-2ebf2df70610:16 server_id: d7ae67e4-a43c-11eb-861f-8fbcf4f1cbb8 client_id: 40 trx_id: 115 flags: 3
2021-04-23T15:18:38.575439Z 11 [Note] [MY-000000] [Galera] Closing send monitor...
2021-04-23T15:18:38.575578Z 11 [Note] [MY-000000] [Galera] Closed send monitor.
2021-04-23T15:18:38.575647Z 11 [Note] [MY-000000] [Galera] gcomm: terminating thread
2021-04-23T15:18:38.575737Z 11 [Note] [MY-000000] [Galera] gcomm: joining thread
2021-04-23T15:18:38.576132Z 11 [Note] [MY-000000] [Galera] gcomm: closing backend
2021-04-23T15:18:38.577954Z 11 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,3206d174,5)
memb {
	727c277a,1
	}
joined {
	}
left {
	}
partitioned {
	3206d174,1
	d7ae67e4,1
	}
)
2021-04-23T15:18:38.578109Z 11 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2021-04-23T15:18:38.578158Z 11 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2021-04-23T15:18:38.578640Z 11 [Note] [MY-000000] [Galera] gcomm: closed
2021-04-23T15:18:38.578747Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

While the other nodes will “just” report the node as out of the view:

2021-04-23T15:18:38.561402Z 0 [Note] [MY-000000] [Galera] forgetting 727c277a (tcp://10.0.0.23:4567)
2021-04-23T15:18:38.562751Z 0 [Note] [MY-000000] [Galera] Node 3206d174 state primary
2021-04-23T15:18:38.570411Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,3206d174,6)
memb {
	3206d174,1
	d7ae67e4,1
	}
joined {
	}
left {
	}
partitioned {
	727c277a,1
	}
)
2021-04-23T15:18:38.570679Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-04-23T15:18:38.574592Z 0 [Note] [MY-000000] [Galera] forgetting 727c277a (tcp://10.0.0.23:4567)
2021-04-23T15:18:38.574716Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2021-04-23

With CEV, we have a different process. Let us review it with images first.

Let us start with a cluster…

3 Nodes, where only one works as Primary.

Primary writes and, as expected, writesets are distributed on all nodes.
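
For reference, a table definition consistent with the output below could look like the following; it is a sketch only, since the post does not show the DDL, and the exact column types and the AUTO_INCREMENT are assumptions:

-- Hypothetical DDL for the demo table (not shown in the original post)
CREATE TABLE test_voting (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  what   VARCHAR(64)  NOT NULL,   -- stores the REVERSE(UUID()) string
  `when` DATETIME     NOT NULL    -- WHEN is a reserved word, hence the backticks
) ENGINE=InnoDB;

Note how the id values below advance by 3: that is what Galera’s automatic auto_increment_increment handling produces on a three-node cluster.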

insert into test_voting values(null,REVERSE(UUID()), NOW()); <-- Few times

DC1-1(root@localhost) [test]>select * from test_voting;
+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
+----+--------------------------------------+---------------------+
5 rows in set (0.00 sec)

Some inexperienced DBA does a manual operation on a secondary using the very unsafe feature wsrep_on…

And then, by mistake or because he did not understand what he was doing…

insert into test_voting values(17,REVERSE(UUID()), NOW()); <-- with few different ids
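
Putting these steps together, the manual operation on the secondary was presumably something along these lines; only the INSERT appears in the post, so the wsrep_on handling is an assumption about how the "very unsafe feature" mentioned above was used:

-- Writes made while wsrep_on is OFF are applied locally only and are not replicated
SET SESSION wsrep_on = OFF;
INSERT INTO test_voting VALUES (16, REVERSE(UUID()), NOW());
INSERT INTO test_voting VALUES (17, REVERSE(UUID()), NOW());
-- ... a few more inserts with different ids ...
SET SESSION wsrep_on = ON;   -- "putting the node back" into replication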

At the end of the operation, the secondary node will have:

DC1-2(root@localhost) [test]>select * from test_voting;
+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
| 16 | 05de43720080-a39a-be11-405a-82715600 | 2021-04-24 14:50:17 |
| 17 | 05de43720080-a39a-be11-405a-f9d62e22 | 2021-04-24 14:51:14 |
| 18 | 05de43720080-a39a-be11-405a-f5624662 | 2021-04-24 14:51:20 |
| 19 | 05de43720080-a39a-be11-405a-cd8cd640 | 2021-04-24 14:50:23 |
+----+--------------------------------------+---------------------+

This is not in line with the rest of the cluster, which still has the previous data. Then our DBA puts the node back:

At this point, the Primary does another insert in that table and:

Houston, we have a problem! 

The secondary node already has the entry with that ID and cannot perform the insert:

2021-04-24T13:52:51.930184Z 12 [ERROR] [MY-010584] [Repl] Slave SQL: Could not execute Write_rows event on table test.test_voting; Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 0, Error_code: MY-001062
2021-04-24T13:52:51.930295Z 12 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 121, seqno 4928120

But instead of exiting the cluster, it raises a verification through voting:

2021-04-24T13:52:51.932774Z 0 [Note] [MY-000000] [Galera] Member 0(node2) initiates vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,878ded7898c83a72:  Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062;
2021-04-24T13:52:51.932888Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   878ded7898c83a72:   1/3
Waiting for more votes.
2021-04-24T13:52:51.936525Z 0 [Note] [MY-000000] [Galera] Member 1(node3) responds to vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,0000000000000000: Success
2021-04-24T13:52:51.936626Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   1/3
   878ded7898c83a72:   1/3
Waiting for more votes.
2021-04-24T13:52:52.003615Z 0 [Note] [MY-000000] [Galera] Member 2(node1) responds to vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,0000000000000000: Success
2021-04-24T13:52:52.003722Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   2/3
   878ded7898c83a72:   1/3
Winner: 0000000000000000

As you can see, each node informs the cluster about the success or failure of the operation, and the majority wins.

Once the majority has identified the operation as legitimate, the node that initiated the vote must leave the cluster:

2021-04-24T13:52:52.038510Z 12 [ERROR] [MY-000000] [Galera] Inconsistency detected: Inconsistent by consensus on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120
	 at galera/src/replicator_smm.cpp:process_apply_error():1433
2021-04-24T13:52:52.062666Z 12 [Note] [MY-000000] [Galera] Closing send monitor...
2021-04-24T13:52:52.062750Z 12 [Note] [MY-000000] [Galera] Closed send monitor.
2021-04-24T13:52:52.062796Z 12 [Note] [MY-000000] [Galera] gcomm: terminating thread
2021-04-24T13:52:52.062880Z 12 [Note] [MY-000000] [Galera] gcomm: joining thread
2021-04-24T13:52:52.063372Z 12 [Note] [MY-000000] [Galera] gcomm: closing backend
2021-04-24T13:52:52.085853Z 12 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,65a111c6-bb0f,23)
memb {
	65a111c6-bb0f,2
	}
joined {
	}
left {
	}
partitioned {
	aae38617-8dd5,2
	dc4eaa39-b39a,2
	}
)
2021-04-24T13:52:52.086241Z 12 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2021-04-24T13:52:52.086391Z 12 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2021-04-24T13:52:52.150106Z 12 [Note] [MY-000000] [Galera] gcomm: closed
2021-04-24T13:52:52.150340Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

It is also nice to notice that we now have a decent level of information about what happened on the other nodes; the log below is from the Primary:

2021-04-24T13:52:51.932829Z 0 [Note] [MY-000000] [Galera] Member 0(node2) initiates vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,878ded7898c83a72:  Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062;
2021-04-24T13:52:51.978123Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
…<snip>
2021-04-24T13:52:51.981647Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   2/3
   878ded7898c83a72:   1/3
Winner: 0000000000000000
2021-04-24T13:52:51.981887Z 11 [Note] [MY-000000] [Galera] Vote 0 (success) on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120 is consistent with group. Continue.
2021-04-24T13:52:52.064685Z 0 [Note] [MY-000000] [Galera] declaring aae38617-8dd5 at tcp://10.0.0.31:4567 stable
2021-04-24T13:52:52.064885Z 0 [Note] [MY-000000] [Galera] forgetting 65a111c6-bb0f (tcp://10.0.0.21:4567)
2021-04-24T13:52:52.066916Z 0 [Note] [MY-000000] [Galera] Node aae38617-8dd5 state primary
2021-04-24T13:52:52.071577Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,aae38617-8dd5,24)
memb {
	aae38617-8dd5,2
	dc4eaa39-b39a,2
	}
joined {
	}
left {
	}
partitioned {
	65a111c6-bb0f,2
	}
)
2021-04-24T13:52:52.071683Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-04-24T13:52:52.075293Z 0 [Note] [MY-000000] [Galera] forgetting 65a111c6-bb0f (tcp://10.0.0.21:4567)
2021-04-24T13:52:52.075419Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2

At this point, a DBA can start to investigate and manually fix the inconsistency and have the node rejoin the cluster. In the meantime, the rest of the cluster continues to operate:

+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
| 18 | 05de43720080-938a-be11-405a-d02c7bc5 | 2021-04-24 14:52:51 |
+----+--------------------------------------+---------------------+

Conclusion

Cluster Error Voting (CEV) is a nice feature to have. It helps us better understand what goes wrong, increases the stability of the cluster, and, with voting, provides a better way to manage node expulsion.

Another aspect is visibility: never underestimate the value of having the information available on other nodes as well. Having it on multiple nodes may help investigations if the log on the failing node is lost (for any reason).

We still do not have active tuple certification, but this is a good step, especially given the history of data drift we have seen in PXC/Galera over these 12 years of use.

My LAST comment: while I agree WSREP_ON can be a very powerful tool in the hands of experts, as indicated in my colleague’s blog How to Perform Compatible Schema Changes in Percona XtraDB Cluster (Advanced Alternative), that option remains DANGEROUS, and you should never use it UNLESS your name is Przemysław Malkowski and you really know what you are doing.

Great MySQL to everybody!

References

https://www.percona.com/doc/percona-xtradb-cluster/8.0/release-notes/Percona-XtraDB-Cluster-8.0.21-12.1.html

Galera Clustering in MariaDB 10.5 and beyond – Seppo Jaakola – MariaDB Server Fest 2020

 


May
04
2021
--

Evening Fund debuts with $2M micro fund focused on investments between $50K and $100K

We tend to think of venture capital in tens or hundreds of millions, even billions of dollars, so it’s refreshing to find Evening Fund, a new $2 million micro fund that focuses on small investments between $50,000 and $100,000 as it seeks to help young startups with early funding.

The new fund was launched by Kat Orekhova and Rapha Danilo. Orekhova, who started her career as a math professor, is a former Facebook data scientist who has been dabbling in angel investing and working with young startups for a while now. They call it Evening Fund because they work as founders by day and investors by night.

She says that she wanted to create something more formal to help early-stage startups get off the ground and has help from limited partners that include Sarah Smith at Bain Capital, Lee Linden, general partner at Quiet Capital and a long list of tech industry luminaries.

Orekhova says she and her partner invest small sums of money in B2B SaaS companies at the pre-seed, seed and occasionally Series A stages. They will invest in consumer here and there as well. She says one of their key value propositions is that they can help with more than just the money. “One way in which I think Rapha and I can really help our founders is that we give very specific, practical advice, not just kind of super high level,” she told me.

That could be something like how to hire your first designer where the founders may not even know what a designer does. “You’re figuring out ‘how do I hire my first designer?’ and ‘what does the designer even do?’ because most founders have never hired a designer before. So we give them extremely practical hands-on stuff like ‘here are the competencies’ or ‘what’s the difference between a graphic designer, a visual designer, a UX designer and a researcher,’ ” she said. They go so far as to give them a list of candidates to help them get going.

She says that she realized while she was at Facebook that she wanted to eventually start a company, so she began volunteering her time to work with companies going through Y Combinator. “I think a lot of people don’t know where to start, but in my case I looked at the YC list, found a company that I thought I could be helpful to. I reached out cold and said ‘Hey, I don’t want money. I don’t want equity. I just want to try to be helpful to you and see where that goes,’ ” she said.

That led to scouting for startups for some larger venture capital firms and eventually dabbling in financing some of these startups that she was helping. Today’s announcement is the culmination of these years of work and the groundwork she laid to make herself familiar with how the startup ecosystem works.

The new firm already has its first investment under its belt, Dala, an AI-powered internal search tool that helps connect users to workplace knowledge that’s often locked in applications like Google Suite, Slack and Notion.

As though Evening isn’t enough to keep her and Danilo busy, they are also each working on their own startups. Orekhova wasn’t ready to share much on that just yet as her company remains in stealth.

May
04
2021
--

InnoDB File Growth Weirdness

There is a common pattern in life: you often discover or understand things by accident. Many scientific discoveries fit such a description. In our database world, I was looking to see how BLOB/TEXT columns are allocated using overlay pages and I stumbled upon something interesting and unexpected. Let me present to you my findings, along with my attempt at explaining what is happening.

InnoDB Tablespaces

The first oddity I found is a bunch of free pages that InnoDB skips over in each tablespace. Here’s an example from a simple table with only an integer primary key and a char(32) column:

root@LabPS8_1:~/1btr# innodb_space -f /var/lib/mysql/test/t1.ibd space-extents
start_page  page_used_bitmap
0       	#####################################........................... <--- free pages
64      	################################################################
128     	################################################################
...

The innodb_space tool comes from the InnoDB ruby project of Jeremy Cole. If you want to explore InnoDB file formats, you need these tools. As you can see, the first extent has 27 free pages. These free pages are reserved for node (non-leaf) pages and will eventually all be used. At this point, I thought a table with 34 index pages, just starting to use the second extent for the leaf pages, would have 90 free pages (27 + 63) and use 2MB of disk space. While the previous statement proved to be true, I was in for quite a surprise.
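
For reference, a minimal schema and fill pattern consistent with that description could be the following; this is a sketch under assumptions (the post does not include the DDL, and the AUTO_INCREMENT and the doubling INSERT are illustrative choices):

-- Hypothetical table with "an integer primary key and a char(32) column"
CREATE TABLE test.t1 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  c  CHAR(32) NOT NULL
) ENGINE=InnoDB;

-- Grow the table step by step to watch pages and extents being allocated
INSERT INTO test.t1 (c) VALUES (MD5(RAND()));
INSERT INTO test.t1 (c) SELECT MD5(RAND()) FROM test.t1;   -- repeat to double the row count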

InnoDB File Growth

To better illustrate the amount of free space available in an InnoDB data file, I decided to follow the evolution of the tablespace file size as I added index pages. The following figure shows my result.

Figure: the evolution of the size of an InnoDB tablespace as data is added.

As I added rows, more leaf pages were allocated until the file segment for the leaf pages reached 32 pages. At that point, the table had 33 index pages: one root and 32 leaves. The allocation of another page forces InnoDB to fully allocate the first extent and add a second one for the leaves, bringing the size on disk to 2MB. If we keep inserting rows, the next page allocation triggers InnoDB to allocate seven reserved extents of 1MB each, and the tablespace size on disk jumps to 9MB.
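
If you want to follow this growth from inside MySQL rather than from the filesystem, one option is to query information_schema.INNODB_TABLESPACES in MySQL 8.0; the sketch below assumes the table is test.t1, as in the shell output shown earlier:

-- FILE_SIZE is the apparent size of the .ibd file; ALLOCATED_SIZE is the space
-- actually allocated on disk, so sparse or compressed filesystems report less.
SELECT NAME,
       FILE_SIZE      / 1024 / 1024 AS file_size_mb,
       ALLOCATED_SIZE / 1024 / 1024 AS allocated_mb
FROM   information_schema.INNODB_TABLESPACES
WHERE  NAME = 'test/t1';

This is the same apparent-size versus allocated-size comparison the du commands make at the end of the post.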

 


 

InnoDB uses the reserved extents for btree maintenance operations. They are not accounted for in the free space of the tablespace. Now, reserving 7 extents of 1MB each for a table containing only 560KB of data is pretty insane. At this point, the InnoDB tablespace has a size of 9MB on disk, which is extremely inefficient: about 8.4MB of it is just free space filled with “0”. Of course, as the table grows, the size impact of these reserved extents is diluted. The amount of reserved space will grow by about 1MB (one extent) for every 100MB allocated.

This allocation of reserved extents is far from optimal, especially in a multi-tenant era where it is fairly common to see MySQL servers handling more than 100k tables. 100k tables, each with only 1MB of data in them, will use 900GB of disk space (9MB per tablespace). This phenomenon is not new; a bug report was created back in 2013 and is still open. The bug is considered low priority and non-critical.

A lot of effort has been devoted to improving the capacity of MySQL 8.0 to handle a large number of tables. Until the allocation of reserved extents is fixed, be aware of this issue when planning your storage allocation. Of course, if you are using ZFS, the impact is more limited…

root@LabPS8_1:~# du -hs /var/lib/mysql/test/t1.ibd
225K /var/lib/mysql/test/t1.ibd
root@LabPS8_1:~# du -hs --apparent-size /var/lib/mysql/test/t1.ibd
9,0M /var/lib/mysql/test/t1.ibd

My lab setup uses ZFS for LXC and KVM instances. LZ4 compression does magic on extents full of “0”; the actual consumed space is reduced to 225KB. If you want to explore the use of ZFS with MySQL, I invite you to read a ZFS post I wrote a few years ago.

OK, time now to close this digression and go back to the storage of BLOB/TEXT columns. That will be for a future post, though!

May
04
2021
--

SAP CEO Christian Klein looks back on his first year

SAP CEO Christian Klein was appointed co-CEO with Jennifer Morgan last April just as the pandemic was hitting full force across the world. Within six months, Morgan was gone and he was sole CEO, put in charge of a storied company at 38 years old. By October, its stock price was down and revenue projections for the coming years were flat.

That is definitely not the way any CEO wants to start their tenure, but the pandemic forced Klein to make some decisions to move his customers to the cloud faster. That, in turn, had an impact on revenue until the transition was completed. While it makes sense to make this move now, investors weren’t happy with the news.

There was also the decision to spin out Qualtrics, the company his predecessor acquired for $8 billion in 2018. As he looked back on the one-year mark, Klein sat down with me to discuss all that has happened and the unique set of challenges he faced.

Just a pandemic, no biggie

Starting in the same month that a worldwide pandemic blows up presents unique challenges for a new leader. For starters, Klein couldn’t visit anyone in person and get to know the team. Instead, he went straight to Zoom and needed to make sure everything was still running.

The CEO says that the company kept chugging along in spite of the disruption. “When I took over this new role, I of course had some concerns about how to support 400,000 customers. After one year, I’ve been astonished. Our support centers are running without disruption and we are proud of that and continue to deliver value,” he said.

Taking over when he couldn’t meet in person with employees or customers has worked out better than he thought. “It was much better than I expected, and of course personally for me, it’s different. I’m the CEO, but I wasn’t able to travel and so I didn’t have the opportunity to go to the U.S., and this is something that I’m looking forward to now, meeting people and talking to them live,” he said.

That’s something he simply wasn’t able to do for his first year because of travel restrictions, so he says communication has been key, something a lot of executives have discussed during COVID. “I’m in regular contact with the employees, and we do it virtually. Still, it’s not the same as when you do it live, but it helps a lot these days. I would say you cannot over-communicate in such times,” he said.

May
04
2021
--

Starboard Value puts Box on notice that it’s looking to take over board

Activist investor Starboard Value is clearly fed up with Box, and it let the cloud content management company know it in no uncertain terms in a letter published yesterday. The firm, which bought a 7.7% stake in Box two years ago, claims the company is underperforming, executing poorly and making bad business decisions, and it wants to inject the board of directors with new blood.

While they couched the letter in mostly polite language, it’s quite clear Starboard is exasperated with Box. “While we appreciate the dialogue we have had with Box’s management team and Board of Directors (the “Board”) over the past two years, we have grown increasingly frustrated with continued poor results, questionable capital allocation decisions, and subpar shareholder returns,” Starboard wrote in its letter.

Box, as you can imagine, did not take kindly to the shot across its bow and responded in a press release that it has bent over backwards to accommodate Starboard, including refreshing the board last year when it added several members who, it points out, were approved by Starboard.

“Box has a diverse and independent Board with directors who bring extensive technology experience across enterprise and consumer markets, enterprise IT, and global go-to-market strategy, as well as deep financial acumen and proven track records of helping public companies drive disciplined growth, profitability, and stockholder value. Furthermore, seven of the ten directors on the Box Board will have joined the Board within the last three years,” the company wrote in a statement. In other words, Box is saying it already has injected the new blood that Starboard claims it wants.

Box recently got a $500 million cash injection from KKR, widely believed to be an attempt to bulk up cash reserves with the goal of generating growth via acquisition. Starboard was particularly taken aback by this move, however. “The only viable explanation for this financing is a shameless and utterly transparent attempt to ‘buy the vote’ and shows complete disregard for proper corporate governance and fiscal discipline,” Starboard wrote.

Alan Pelz-Sharpe, founder and principal analyst at Deep Analysis, a firm that closely tracks the content management market, says the two sides clearly aren’t aligned, and that’s not likely to change. “Starboard targeted and gained a seat on the board at Box at a difficult time for the firm, that’s the modus operandi for activist investors. Since that time there has clearly been a lot of improvements in terms of Box’s financial goals. However, there is and will remain a misalignment between Starboard’s goals, and Box led by Levie as a whole. Though both would like to see the share price rise, Starboard’s end goal is most likely to see Box acquired, sooner rather than later, and that is not Box’s goal,” he said.

Starboard believes the only way to resolve this situation is to inject the board with still more new blood, taking a swipe at the Box leadership team while it was at it. “There is no good reason that Box should be unable to deliver improved growth and profitability, at least in-line with better performing software companies, which, in turn, would create significant shareholder value,” Starboard wrote.

As such, the firm indicated it would be putting up its own slate of board candidates at the company’s next shareholder meeting. In the tit-for-tat this exchange has become, Box indicated it would be doing the same.

Meanwhile Box vigorously defended its results. “In the past year, under the oversight of the Operating Committee, the company has made substantial progress across all facets of the business — strategic, operational and financial — as demonstrated by the strong results reported for the full year of fiscal 2021,” the company wrote, pointing to its revenue growth last fiscal year as proof of the progress, with revenue of $771 million up 11% year over year.

It’s unclear how this standoff will play out, but clearly Starboard wants to take over the Board and have its way with Box, believing that it can perform better if it were in charge. That could result ultimately, as Pelz-Sharpe suggested, in Box being acquired.

We would appear to be heading for a showdown, and when it’s over, Box could be a very different company, or the current leadership could assert control once and for all and proceed with Box’s current growth strategy still in place. Time will tell which is the case.

May
03
2021
--

Changes to Percona Monitoring and Management on AWS Marketplace

Percona Monitoring and Management has been available for single-click deployment from AWS Marketplace for several years now, and we have hundreds of instances concurrently active and growing rapidly due to unparalleled ease of deployment.

Today we’re announcing a change to the pricing of Percona Monitoring and Management on AWS Marketplace. Currently, Percona Monitoring and Management (PMM) is available on AWS Marketplace at no added cost; effective June 1, 2021, we will add a surcharge equal to 10% of the PMM AWS EC2 costs.

Why are we making this change?

Making Percona Monitoring and Management available as a one-click deployment on AWS Marketplace is a considerable resource investment, yet, with the current model, only AWS directly benefits from the value which we jointly provide to the users choosing to run PMM on AWS. With the addition of this surcharge, both companies will benefit.

How does this reflect on Percona’s Open Source Commitment?

Percona Monitoring and Management remains a fully Open Source Project.  We’re changing how commercial offerings jointly provided by AWS and Percona will operate.  

I do not want to pay this surcharge, are there free options?

Using Amazon Marketplace is not the only way to deploy PMM on AWS. Many deploy PMM on Amazon EC2 using Docker, and this option continues to require no additional spend other than your infrastructure costs.

What are the benefits of running Percona Monitoring and Management through AWS Marketplace compared to alternative deployment methods?

The main benefit of running Percona Monitoring and Management through the AWS Marketplace is convenience; you can easily change the instance type or add more storage as your PMM load grows. You also have an easy path to high availability with CloudWatch Alarm Actions.

 


 

How does the 10% surcharge compare?

We believe 10% extra for software on top of the infrastructure costs is a very modest charge. Amazon RDS, for example, carries a surcharge ranging from 30% to more than 70%, depending on the instance type.

How will I know the exact amount of such a surcharge?

Your bill from AWS will include a separate line item for this charge, in addition to the infrastructure costs consumed by PMM.

What does it mean for Percona Monitoring and Management on AWS Marketplace?

Having a revenue stream that is directly tied to AWS Marketplace deployment will increase the amount of resources we can spend on making Percona Monitoring and Management work with AWS even better. If you’re using PMM with AWS, deploying it through Amazon Marketplace will be a great way to support PMM Development.

Will Percona Monitoring and Management started through AWS Marketplace be entitled to any additional Support options?

No, Percona Monitoring and Management commercial support is available with Percona Support for Open Source Databases.  If you do not have a commercial support subscription, you can get help from the community at the Percona Forums.

What will happen to Percona Monitoring Instances started from AWS Marketplace which are already up and running?

The new pricing goes into effect on June 1st, and AWS will give you 90 days’ notice before applying the new prices to instances that are already running. If you want to avoid the surcharge, you can move your installation to a Docker-based EC2 install.

What Could AWS Do Better?

It would be great if AWS would develop some sort of affiliate program for Open Source projects, which would allow them to get a share from the value they create for AWS by driving additional infrastructure spend without having to resort to added costs. I believe this would be a win-win for Open Source projects, especially smaller ones, and AWS.

May
03
2021
--

Dell dumps another big asset, moving Boomi to Francisco Partners and TPG for $4B

It’s widely known that Dell has a debt problem left over from its massive acquisition of EMC in 2016, and it seems to be moving this year to eliminate part of it in multi-billion-dollar chunks. The first step was spinning out VMware as a separate company last month, a move expected to net close to $10 billion.

The second step, long expected, finally dropped last night when the company announced it was selling Boomi to a couple of private equity firms for $4 billion. Francisco Partners is joining forces with TPG to make the deal to buy the integration platform.

Boomi is not unlike MuleSoft, a company that Salesforce purchased in 2018 for $6.5 billion, although a bit longer in the tooth. They both help companies with integration problems by creating connections between disparate systems. With so many pieces in place from various acquisitions over the years, Boomi seems like a highly useful asset to help Dell pull those pieces together and make them work, but the need for cash is trumping that.

Providing integration services is a growing requirement as companies look for ways to make better use of data locked in siloed systems. Boomi could help, and that’s one of the primary reasons for the acquisition, according to Francisco executives.

“The ability to integrate and connect data and workflows across any combination of applications or domains is a critical business capability, and we strongly believe that Boomi is well positioned to help companies of all sizes turn data into their most valuable asset,” Francisco CEO Dipanjan Deb and partner Brian Decker said in a statement.

As you would expect, Boomi’s CEO Chris McNabb put a positive spin on the deal about how his new bosses were going to fuel growth for his company. “By partnering with two tier-one investment firms like Francisco Partners and TPG, we can accelerate our ability for our customers to use data to drive competitive advantage. In this next phase of growth, Boomi will be in a position of strength to further advance our innovation and market trajectory while delivering even more value to our customers,” McNabb said in a statement.

All of this may have some truth to it, but the company goes from being part of a large amorphous corporation to getting absorbed in the machinery of two private equity firms. What happens next is hard to say.

The company was founded in 2000 and sold to Dell in 2010. Today, it has 15,000 customers, but Dell’s debt has been well documented, and when you string together a couple of multi-billion-dollar deals as Dell has recently, pretty soon you’re talking real money. While the company has not stated it will explicitly use the proceeds of this deal to pay off debt, as it did with the VMware announcement, it stands to reason that this will be the case.

The deal is expected to close later this year, although it will have to pass the typical regulatory scrutiny prior to that.

Apr
30
2021
--

Analytics as a service: Why more enterprises should consider outsourcing

With an increasing number of enterprise systems, growing teams, a rising proliferation of the web and multiple digital initiatives, companies of all sizes are creating loads of data every day. This data contains excellent business insights and immense opportunities, but it has become impossible for companies to derive actionable insights from this data consistently due to its sheer volume.

According to Verified Market Research, the analytics-as-a-service (AaaS) market is expected to grow to $101.29 billion by 2026. Organizations that have not started on their analytics journey or are spending scarce data engineer resources to resolve issues with analytics implementations are not identifying actionable data insights. Through AaaS, managed services providers (MSPs) can help organizations get started on their analytics journey immediately without extravagant capital investment.

MSPs can take ownership of the company’s immediate data analytics needs, resolve ongoing challenges and integrate new data sources to manage dashboard visualizations, reporting and predictive modeling — enabling companies to make data-driven decisions every day.

AaaS could come bundled with multiple business-intelligence-related services. Primarily, the service includes (1) services for data warehouses; (2) services for visualizations and reports; and (3) services for predictive analytics, artificial intelligence (AI) and machine learning (ML). When a company partners with an MSP for analytics as a service, organizations are able to tap into business intelligence easily, instantly and at a lower cost of ownership than doing it in-house. This empowers the enterprise to focus on delivering better customer experiences, be unencumbered in decision-making and build data-driven strategies.


In today’s world, where customers value experiences over transactions, AaaS helps businesses dig deeper into their psyche and tap insights to build long-term winning strategies. It also enables enterprises to forecast and predict business trends by looking at their data and allows employees at every level to make informed decisions.
