Mar 31, 2021

Celonis announces significant partnership with IBM to sell its process mining software

Before you can improve a workflow, you have to understand how work advances through a business, which is more complex than you might imagine inside a large enterprise. That’s where Celonis comes in. It uses software to identify how work moves through an organization and to suggest more efficient ways of getting the same work done, an approach known as process mining.

Today, the company announced a significant partnership with IBM where IBM Global Services will train 10,000 consultants worldwide on Celonis. The deal gives Celonis, a company with around 1,200 employees, access to the massive selling and consulting unit, while IBM gets a deep understanding of a piece of technology that is at the front end of the workflow automation trend.

Miguel Milano, chief revenue officer at Celonis, says that digitizing processes has been a trend for several years. It has sped up due to COVID, and it’s partly why the two companies have decided to work together. “Intelligent workflows, or more broadly spoken workflows built to help companies execute better, are at the heart of this partnership and it’s at the heart of this trend now in the market,” Milano said.

The other part of this is that IBM now owns Red Hat, which it acquired for $34 billion in a deal announced in 2018. The two companies believe that combining the Celonis technology, which is cloud based, with Red Hat, which can span the hybrid world of on-premises and cloud, will provide a much more powerful solution to follow work wherever it happens.

“I do think that moving the [Celonis] software into the Red Hat OpenShift environment is hugely powerful because it does allow in what’s already a very powerful open solution to now operate across this hybrid cloud world, leveraging the power of OpenShift which can straddle the worlds of mainframe, private cloud and public cloud. And data straddle those worlds, and will continue to straddle those worlds,” Mark Foster, senior vice president at IBM Services explained.

You might think that IBM, which acquired robotic process automation vendor WDG Automation last summer, would simply attempt to buy Celonis, but Foster says the partnership is consistent with the company’s attempt to partner with a broader ecosystem.

“I think that this is very much part of an overarching focus of IBM with key ecosystem partners. Some of them are going to be bigger, some of them are going to be smaller, and […] I think this is one where we see the opportunity to connect with an organization that’s taking a leading position in its category, and the opportunity for that to take advantage of the IBM Red Hat technologies…” he said.

The companies had already been working together for some time prior to this formal announcement, and this partnership is the culmination of that. As this firmer commitment to one another goes into effect, the two companies will be working more closely to train thousands of IBM consultants on the technology, while moving the Celonis solution into Red Hat OpenShift in the coming months.

It’s clearly a big deal with the feel of an acquisition, but Milano says that this is about executing his company’s strategy to work with more systems integrators (SIs), and while IBM is a significant partner, it’s not the only one.

“We are becoming an SI consulting-driven organization. So we put consulting companies like IBM at the forefront of our strategy, and this [deal] is a big cornerstone of our strategy,” he said.

Mar 31, 2021

PingPong is a video chat app for product teams working across multiple time zones

From the earliest days of the pandemic, it was no secret that video chat was about to become a very hot space.

Over the past several months investors have bankrolled a handful of video startups with specific niches, ranging from always-on office surveillance to platforms that encouraged plenty of mini calls to avoid the need for more lengthy team-wide meetings. As the pandemic wanes and plenty of startups begin to look toward hybrid office models, there are others who have decided to lean into embracing a fully remote workforce, a strategy that may require new tools.

PingPong, a recent launch from Y Combinator’s latest batch, is building an asynchronous video chat app for the workplace. We selected PingPong as one of our favorite startups that debuted last week.

The company’s central sell is that for remote teams, there needs to be a better alternative to Slack or email for catching up with co-workers across time zones. While Zoom calls might be able to convey a company’s culture better than a post in a company-wide Slack channel, for fully remote teams operating on different continents, scheduling a company-wide meeting is often a nonstarter.

PingPong is selling its service as an addendum to Slack that helps remote product teams collaborate and convey what they’re working on. Users can capture a short video of themselves and share their screen in lieu of a standup presentation, and then they can get caught up on each other’s progress on their own time. PingPong’s hope is that users find more value in brainstorming, conducting design reviews, reporting bugs and more using asynchronous video than they would with text.

“We have a lot to do before we can replace Slack, so right now we kind of emphasize playing nice with Slack,” PingPong CEO Jeff Whitlock tells TechCrunch. “Our longer-term vision is that what young people are doing in their consumer lives, they bring into the enterprise when they graduate into the workforce. You and I were using Instant Messenger all the time in the early 2000s and then we got to the workplace, that was the opportunity for Slack… We believe in the next five or so years, something that’s a richer, more asynchronous video-based Slack alternative will have a lot more interest.”

Building a chat app specifically designed for remote product teams operating in multiple time zones is a tight niche for now, but Whitlock believes that this will become a more common problem as companies embrace the benefits of remote teams post-pandemic. PingPong costs $100 per user per year.

Mar 31, 2021

Webinar April 14: Optimize and Troubleshoot MySQL Using Percona Monitoring and Management


Optimizing MySQL performance and troubleshooting MySQL problems are two of the most critical and challenging tasks for MySQL DBAs. The databases powering applications need to be able to handle changing traffic workloads while remaining responsive and stable in order to deliver an excellent user experience. Further, DBAs are also expected to find cost-efficient means of solving these issues.

In this webinar, we will demonstrate the advanced options of Percona Monitoring and Management v2, built on free and open-source software, that enable you to solve these challenges. We will look at specific, common MySQL problems and review how to address them.

Please join Peter Zaitsev on Wednesday, April 14th, 2021, at 11 am EDT for his webinar Optimize and Troubleshoot MySQL using Percona Monitoring and Management (PMM).

Register for Webinar

If you can’t attend, sign up anyway, and we’ll send you the slides and recording afterward.

Mar 31, 2021

MySQL 101: Using super_read_only


As many of you may remember, Percona added the super_read_only feature way back in Percona Server for MySQL 5.6.21, based on work done by WebScaleSQL. This feature eventually found its way into the Community branch of MySQL starting with 5.7.8, and it works the same in both cases. While this is now old news, over the last year I’ve had a couple of inquiries from clients around super_read_only usage in MySQL, and how it works in practice. While the usage of super_read_only is not complex, there is a small caveat that occasionally leads to some confusion around its use. As such, I thought it may be a good idea to write a quick blog post explaining this feature a bit more, and expanding on how it interacts with read_only.

What is super_read_only?

For those unfamiliar, what is super_read_only? Prior to its introduction, MySQL had the option to set a node to read_only, preventing everyone except those with the SUPER privilege from writing to the database. Most often used for replica nodes, it was a good step in preventing someone from inadvertently updating a replica manually without going through the primary node and letting replication threads handle the distribution. This, of course, could break replication due to duplicate keys, missing rows, or other issues as a result of the inconsistency between the datasets.

Using super_read_only takes this one step further, behaving identically to read_only while also blocking those with SUPER privileges from writing to the database. While at first glance this may seem like a stop-gap measure in lieu of better and more restrictive user permissions, it has proven very handy in production environments, adding a further layer of protection to replica nodes and helping to prevent human error from causing unexpected downtime.

One Hand Washes The Other

The inquiries I referenced earlier stem from its use, and from not realizing (or forgetting) that read_only and super_read_only are linked. The key thing to keep in mind is that enabling super_read_only implies regular read_only as well.

  • Setting read_only = OFF also sets super_read_only = OFF.
  • Setting super_read_only = ON also sets read_only = ON.
  • All other changes to either of these variables have no effect on the other.

This behavior does seem logical, as when disabling read_only you probably also want to disable super_read_only, and vice-versa, when enabling super_read_only you probably also want to enable read_only. While this linked behavior is documented, there are no warnings or notes in MySQL itself alerting you to this when you make a change to one or the other. This can lead to some head-scratching, as a change to one variable changes the other in lockstep.
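A minimal sketch of this linkage, which you can try on a test instance with a suitably privileged account (statements and comments are illustrative):

-- enabling super_read_only also turns on read_only
SET GLOBAL super_read_only = ON;
SELECT @@global.read_only, @@global.super_read_only;  -- both now report 1

-- disabling read_only also turns off super_read_only
SET GLOBAL read_only = OFF;
SELECT @@global.read_only, @@global.super_read_only;  -- both now report 0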

Final Considerations

There are a few other implications to keep in mind for read_only that apply to super_read_only as well – for instance, operations on temporary tables are allowed no matter how these variables are set. Any OPTIMIZE TABLE or ANALYZE TABLE operations are also allowed, since the purpose of read_only / super_read_only is to prevent changes to table data and structure, not to table metadata such as index statistics. Finally, these settings will need to be manually disabled (or scripted via automation) if there is ever an instance where the replica needs to be promoted to primary status.
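To make that concrete, here is a small illustrative sequence (the table name is made up, and the exact error text may vary by version) showing what is and is not allowed on a replica with super_read_only enabled:

-- allowed: only index statistics (metadata) are updated
ANALYZE TABLE test.t1;

-- rejected with ERROR 1290 (HY000): the server is running with --super-read-only
INSERT INTO test.t1 VALUES (1);

-- required before promoting the replica to primary
SET GLOBAL super_read_only = OFF;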


Percona Distribution for MySQL: An Enterprise-Grade MySQL Solution, for Free

Mar 31, 2021

Hex lands $5.5M seed to help data scientists share data across the company

As companies embrace the use of data and hire more data scientists, a roadblock persists around sharing that data. It requires too much copying and pasting and manual work. Hex, a new startup, wants to change that by providing a way to share data across the company in a streamlined and elegant way.

Today, the company announced a $5.5 million seed investment, and also announced that it’s opening up the product from a limited beta to be more widely available. The round was led by Amplify Partners, with help from Box Group, XYZ, Data Community Fund, Operator Collective and a variety of individual investors. The company closed the round last July, but is announcing it for the first time today.

Co-founder and CEO Barry McCardel says that it’s clear that companies are becoming more data-driven and hiring data scientists and analysts at a rapid pace, but there is an issue around data sharing, one that he and his co-founders experienced firsthand when they were working at Palantir.

They decided to develop a purpose-built tool for sharing data with other parts of the organization that are less analytically technical than the data science team working with these data sets. “What we do is we make it very easy for data scientists to connect to their data, analyze and explore it in notebooks. […] And then they can share their work as interactive data apps that anyone else can use,” McCardel explained.

Most data scientists work with their data in online notebooks like Jupyter, where they can build SQL queries and enter Python code to organize it, chart it and so forth. What Hex is doing is creating this super-charged notebook that lets you pull a data set from Snowflake or Amazon Redshift, work with and format the data in an easy way, then drag and drop components from the notebook page — maybe a chart or a data set — and very quickly build a kind of app that you can share with others.

Hex app example with data elements at the top and live graph below it.

Image Credits: Hex

The startup has nine employees, including co-founders McCardel, CTO Caitlin Colgrove and VP of architecture Glen Takahashi. “We’ve really focused on the team front from an early stage, making sure that we’re building a diverse team. And actually today our engineering team is majority female, which is definitely the first time that that’s ever happened to me,” Colgrove said.

She is also part of a small percentage of female founders. A report last year from Silicon Valley Bank found that while the number was heading in the right direction, only 28% of U.S. startups have at least one female founder. That was up from 22% in 2017.

The company was founded in late 2019 and the founders spent a good part of last year building the product and working with design partners. They have a small set of paying customers, and are looking to expand that starting today. While customers still need to work with the Hex team for now to get going, the plan is to make the product self-serve some time later this year.

Hex’s early customers include Glossier, imgur and Pave.

Mar 31, 2021

Moveworks expands IT chatbot platform to encompass entire organization

When investors gave Moveworks a hefty $75 million Series B at the end of 2019, they were investing in a chatbot startup that to that point had been tuned to answer IT help questions in an automated way. Today, the company announced it had used that money to expand the platform to encompass employee questions across all lines of business.

At the time of that funding, nobody could have anticipated a pandemic either, but throughout last year as companies moved to work from home, having an automated system in place like Moveworks became even more crucial, says CEO and company co-founder Bhavin Shah.

“It was a tragic year on a variety of fronts, but what it did was it coalesced a lot of energy around people’s need for support, people’s need for speed and help,” Shah said. It helps that employees typically access the Moveworks chatbot inside collaboration tools like Slack or Microsoft Teams, and people have been spending more time in these tools while working at home.

“We definitely saw a lot more interest in the market, and part of that was fueled by the large-scale adoption of collaboration tools like Slack and Microsoft Teams by enterprises around the world,” he said.

The company is working with 100 large enterprise customers today, and those customers were looking for a more automated way for employees to ask questions about a variety of tooling, from HR to finance and facilities management. While Shah says expanding the platform to move beyond IT into other parts of an organization had been on the roadmap, the pandemic definitely underscored the need to expand even more.

While the company spent its first several years tuning the underlying artificial intelligence technology for IT language, they had built it with expansion in mind. “We learned how to build a conversational system so that it can be dynamic and not be predicated on some person’s forethought around [what the question and answer will be] — that approach doesn’t scale. So there were a lot of things around dealing with all these enterprise resources and so forth that really prepared us to be an enterprise-wide partner,” Shah said.

The company also announced a new communications tool that enables companies to use the Moveworks bot to communicate directly with employees to get them to take some action. Shah says companies usually send out an email telling employees, for example, that they have to update their password. The bot tells you it’s time to do that and provides a link to walk you through the process. He says that beta testers have seen a 70% increase in responses using the bot to communicate about an action instead of email.

Shah recognizes that a technology that understands language is going to have a lot of cultural variances and nuances and that requires a diverse team to build a tool like this. He says that his HR team has a set of mandates to make sure they are interviewing people in under-represented roles to build a team that reflects the needs of the customer base and the world at large.

The company has been working with about a dozen customers over the last nine months on the platform expansion, iterating with these customers to improve the quality of the responses, regardless of the type of question or which department it involves. Today, these tools are generally available.

Mar 30, 2021

How To Automate Dashboard Importing in Percona Monitoring and Management


In this blog post, I’ll look at how to import custom dashboards into Percona Monitoring and Management (PMM) 2.x, and give some tips on how to automate it.

The number of dashboards in PMM2 is constantly growing. For example, we recently added a new HAProxy dashboard to the latest 2.15.0 release. Even though the PMM server has more than fifty dashboards, it’s not possible to cover all common server applications.

The greatest source of dashboards is the official Grafana site. Here, anyone can share their own dashboards with the community or find already uploaded ones. Percona has its own account and publishes as-yet-unreleased or unique (non-PMM) dashboards.

Each dashboard has its own number which can be used to refer to it. For example, 12630 is assigned to the dashboard “MySQL Query Performance Troubleshooting”.

You can download dashboards as JSON files and import them into your PMM2 installation using the UI.

This is easy, but it ignores the fact that publishers can update dashboards with new revisions. So it’s possible that a bunch of useful changes were published after you downloaded the dashboard, while you keep using an old version of it.

So the only way to use the latest dashboard version is to check the site from time to time. It can really be a pain in the neck, especially if you have to track more than one dashboard.

This is why it’s time to take a look at automation. Grafana has a very powerful API that I used to create this shell script. Let’s take a peek at it. It’s based on the api/dashboards/import API function. The function requires a POST request with a dashboard body.

The first step is to download a dashboard.

curl -s https://grafana.com/api/dashboards/12630/revisions/1/download --output 12630_rev1.json

Note how I used dashboard number 12630 and revision 1 in the command. By increasing the revision number I can find out the latest available dashboard version. This is exactly the approach used in the script.
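As a rough sketch of that idea (the published script’s error handling may differ), you can keep requesting the next revision until the download fails, and treat the last successful one as the latest:

rev=1
# -f makes curl return a non-zero exit code on HTTP errors, so the loop stops at the last existing revision
while curl -sf "https://grafana.com/api/dashboards/12630/revisions/$((rev+1))/download" --output /dev/null; do
  rev=$((rev+1))
done
echo "Latest available revision of dashboard 12630: ${rev}"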

In the next example, I’ll use a dashboard from our dashboard repository. (I will explain why later.)

curl -L -k https://github.com/percona/grafana-dashboards/raw/PMM-2.0/dashboards/Disk_Details.json --output Disk_Details.json

Now I have a file and can form a POST request to import the dashboard into a PMM installation.

$ curl -s -k -X POST -H "Content-Type: application/json" -d "{\"dashboard\":$(cat Disk_Details.json),\"overwrite\":true}" -u admin:admin https://18.218.63.13/graph/api/dashboards/import


The dashboard has been uploaded. If you take a look at the output you may notice the parameter folderId. With this, it’s possible to specify a Grafana folder for my dashboards.

Here is the command for fetching a list of existing folders.

curl -s -k -u admin:admin https://172.20.0.1/graph/api/folders

I now have folder IDs and can use them in the importing command. The Folder ID should be specified in a POST request as shown in the next example.
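For illustration, such a request could look like the following (the folder ID 5 here is hypothetical; substitute an ID returned by the folders call above):

curl -s -k -X POST -H "Content-Type: application/json" -d "{\"dashboard\":$(cat Disk_Details.json),\"folderId\":5,\"overwrite\":true}" -u admin:admin https://172.20.0.1/graph/api/dashboards/import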


Now that you are familiar with the API import commands, let’s take a closer look at community dashboards.

Most of them have the parameter “Data Sources”. This means that to import a dashboard, you have to specify the data source names assigned in your installation.
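For context, dashboards published on grafana.com typically declare a templated data source placeholder along these lines (heavily trimmed; the exact fields vary by dashboard):

{
  "__inputs": [
    { "name": "DS_PROMETHEUS", "label": "Prometheus", "type": "datasource", "pluginId": "prometheus" }
  ],
  "panels": [
    { "datasource": "${DS_PROMETHEUS}" }
  ]
}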


This makes it impossible to import any downloaded dashboard with the API without modifying it. If I execute the import command used earlier on the 12630_rev1.json file downloaded from Grafana.com, I get an error.


So, here’s another script (cleanup-dash.py) that replaces the datasource fields in a dashboard so that the import command can succeed. The script takes a dashboard file name as a parameter.


The importing script calls cleanup-dash.py automatically if an initial importing attempt was unsuccessful.
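Conceptually, the retry looks something like the following (a simplified sketch of the idea, not the actual script):

dash_file="12630_rev1.json"
import_url="https://172.20.0.1/graph/api/dashboards/import"
payload() { echo "{\"dashboard\":$(cat "$dash_file"),\"overwrite\":true}"; }
# -f makes curl fail on HTTP errors, e.g. when the datasource references cannot be resolved
if ! curl -sf -k -X POST -H "Content-Type: application/json" -d "$(payload)" -u admin:admin "$import_url" > /dev/null; then
  ./cleanup-dash.py "$dash_file"
  curl -s -k -X POST -H "Content-Type: application/json" -d "$(payload)" -u admin:admin "$import_url"
fi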

Note the parameters of the importing script. Here you should set the details of your PMM installation; dashboards is an array of dashboard IDs that you want to import into PMM2.

#!/bin/bash
dashboards=(13266 12630 12470)
pmm_server="172.20.0.1"
user_pass="admin:admin"
folderName="General"

Now, you should download both scripts and try to import dashboards. Make sure that both scripts are executable and in the same folder. Here are the commands to do it.

curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/master/misc/import-dashboard-grafana-cloud.sh --output import-dashboard-grafana-cloud.sh
curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/master/misc/cleanup-dash.py --output cleanup-dash.py

chmod a+x import-dashboard-grafana-cloud.sh
chmod a+x cleanup-dash.py

./import-dashboard-grafana-cloud.sh

You can next find the imported dashboards in your PMM installation. They were put into the ‘Insight’ folder and can be found by the keyword ‘PMM2’.


By default, the script imports all of the dashboards designed for PMM2 from the Percona account. Folder names and dashboard IDs can also be specified as parameters for the script.

Here are some usage examples:

  • import-dashboard-grafana-cloud.sh – the default list of dashboards is uploaded into the General folder
  • import-dashboard-grafana-cloud.sh Insight – the default list of dashboards is uploaded into the Insight folder
  • import-dashboard-grafana-cloud.sh 13266 12630 12470 – dashboards 13266, 12630, and 12470 are uploaded into the General folder
  • import-dashboard-grafana-cloud.sh Insight 13266 12630 12470 – dashboards 13266, 12630, and 12470 are uploaded into the Insight folder

You can define any number of dashboards in the script parameters and run the script periodically to always have the most recent dashboard versions.


Percona Monitoring and Management is free to download and use. Try it today!

Mar 30, 2021

6sense raises $125M at a $2.1B valuation for its ‘ID graph’, an AI-based predictive sales and marketing platform

AI has become a fundamental cornerstone of how tech companies are building tools for salespeople: they are useful for supercharging (and complementing) the abilities of talented humans, or helping them keep themselves significantly more organised; even if in some cases — as with chatbots — they are replacing them altogether. In the latest development, 6sense, one of the pioneers in using AI to boost the sales and marketing experience, is announcing a major round of funding that underscores the traction AI tools are seeing in the sales realm.

The startup has raised $125 million at a valuation of $2.1 billion, a Series D being led by D1 Capital Partners, with Sapphire Ventures, Tiger Global and previous backer Insight Partners also participating.

The company plans to use the funding to expand its platform and its predictive capabilities across a wider range of sources.

For some context, this is a huge jump for the company compared to its last fundraise: at the end of 2019, when it raised $40 million, it was valued at a mere $300 million, according to data from PitchBook.

But it’s not a big surprise: at a time when a lot of companies are going through “digital transformation” and investing in better tools for their employees to work more efficiently remotely (especially important for sales people who might have previously worked together in physical teams), 6sense is on track for its fourth year of more than 100% growth, adding 100 new customers in the fourth quarter alone. It caters to small, medium, and large businesses, and some of its customers include Dell, Mediafly, Sage and SocialChorus.

The company’s approach speaks to a classic problem that AI tools are often tasked with solving: the data that sales people need to use and keep up to date on customer accounts, and critically targets, lives in a number of different silos — they can include CRM systems, or large databases outside of the company, or signals on social media.

While some tools are being built to handle all of that from the ground up, 6sense takes a different approach, providing a way of ingesting and utilizing all of it to get a complete picture of a company and the individuals a salesperson might want to target within it. It takes into account some of the harder nuts to crack in the market, such as how to tie “anonymous buying behavior” to a concrete customer name; how to prioritize accounts according to which are most likely to buy; and how to plan for multi-channel campaigns.

6sense has patented the technology it uses to achieve this and calls its approach building an “ID graph.” (Which you can think of as the sales equivalent of the social graph of Facebook, or the knowledge graph that LinkedIn has aimed to build mapping skills and jobs globally.) The key with 6sense is that it is building a set of tools that not just sales people can use, but marketers too — useful since the two sit much closer together at companies these days.

Jason Zintak, the company’s CEO (who worked for many years as a salesperson himself, so gets the pain points very well), referred to the approach and concept behind 6sense as “revtech”: aimed at organizations in the business whose work generates revenue for the company.

“Our AI is focused on signal, identifying companies that are in the market to buy something,” said Zintak in an interview. “Once you have that you can sell to them.”

That focus and traction with customers is one reason investors are interested.

“Customer conversations are a critical part of our due diligence process, and the feedback from 6sense customers is among the best we’ve heard,” said Dan Sundheim, founder and chief investment officer at D1 Capital Partners, in a statement. “Improving revenue results is a goal for every business, but it’s easier said than done. The way 6sense consistently creates value for customers made it clear that they deliver a unique, must-have solution for B2B revenue teams.”

Teddie Wardi at Insight highlights that AI and the predictive elements of 6sense’s technology — which have been a consistent part of the product since it was founded — are what help it stand out.

“AI generally is a buzzword, but here it is a key part of the solution, the brand behind the platform,” he said in an interview. “Instead of having massive funnels, 6sense switches the whole thing around. Catching the right person at the right time and in the right context make sales and marketing more effective. And the AI piece is what really powers it. It uses signals to construct the buyer journey and tell the sales person when it is the right time to engage.”

Mar 30, 2021

What’s Running in My DB? A Journey with currentOp() in MongoDB


I have been working for a while with customers, supporting both MongoDB and MySQL technologies. Most of the time when an issue arises, customers working with MySQL collect most of the information happening in the DB server, including all the queries running at that particular time, using “show full processlist;”. This information helps us look at the problem, like which queries are taking the time and where the time is being spent.

But for MongoDB, most of the time we don’t receive this (in-progress operations) information. Instead, we have to check the long queries logged in the MongoDB log file which, of course, records most of the details, like planSummary (whether it used the index or not), documents/index entries scanned, time to complete, etc. It’s like doing a postmortem rather than examining the issue as it happens in real time. Collecting information about the operations taking time, or finding a problematic query while the issue is happening, can help you find the right one to kill (to release the pressure) or check the state of the database.

The in-progress operations in MongoDB can be checked via the database command currentOp(). The level of information can be controlled via the options passed to it. Most of the time, the raw output is not that pleasant to check because it contains a lot of information, making it difficult to spot the operations we need. However, MongoDB knows this and, over multiple versions, has added many options to filter the operations returned by currentOp easily. Some of the information regarding this is mentioned in the release notes below:

https://docs.mongodb.com/manual/release-notes/3.6/#new-aggregation-stages 

https://docs.mongodb.com/manual/release-notes/4.0/#id26

https://docs.mongodb.com/manual/release-notes/4.2/#currentop 

In this blog, I will share some tricks to work with this command and fetch the operations that we need to check. This would help a person check the ongoing operations and if necessary, kill the problematic command – if they wish.

Introduction

The database command currentOp() provides information about the ongoing/currently running operations in the database. It must be run against the admin database. On servers that run with authorization, you need the inprog privilege action to view operations for all users; this is included in the built-in clusterMonitor role.
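For example, a user could be granted that role from the admin database like this (the user name here is just an illustration):

use admin
db.grantRolesToUser("monitorUser", [ { role: "clusterMonitor", db: "admin" } ])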

Use Cases

The command to see all the active connections:

db.currentOp()

A user without the inprog privilege can view their own operations with the below command:

db.currentOp( { "$ownOps": true } )

To see the connections in the background, and idle connections, you can use either one of the below commands:

db.currentOp(true)
db.currentOp( { "$all": true } )

As I said before, you can use filters here to check the operations you need, like a command running for more than a few seconds, waiting for a lock, active/inactive connections, running on a particular namespace, etc. Let’s see some examples from my test environment.

The below command provides information about all active connections. 

mongos> db.currentOp()
{
	"inprog" : [
		{
			"shard" : "shard01",
			"host" : "bm-support01.bm.int.percona.com:54012",
			"desc" : "conn52",
			"connectionId" : 52,
			"client_s" : "127.0.0.1:53338",
			"appName" : "MongoDB Shell",
			"clientMetadata" : {
				"application" : {
					"name" : "MongoDB Shell"
				},
				"driver" : {
					"name" : "MongoDB Internal Client",
					"version" : "4.0.19-12"
				},
				"os" : {
					"type" : "Linux",
					"name" : "CentOS Linux release 7.9.2009 (Core)",
					"architecture" : "x86_64",
					"version" : "Kernel 5.10.13-1.el7.elrepo.x86_64"
				},
				"mongos" : {
					"host" : "bm-support01.bm.int.percona.com:54010",
					"client" : "127.0.0.1:36018",
					"version" : "4.0.19-12"
				}
			},
			"active" : true,
			"currentOpTime" : "2021-03-21T23:41:48.206-0400",
			"opid" : "shard01:1404",
			"lsid" : {
				"id" : UUID("6bd7549b-0c89-40b5-b59f-af765199bbcf"),
				"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
			},
			"secs_running" : NumberLong(0),
			"microsecs_running" : NumberLong(180),
			"op" : "getmore",
			"ns" : "admin.$cmd",
			"command" : {
				"getMore" : NumberLong("8620961729688473960"),
				"collection" : "$cmd.aggregate",
				"batchSize" : NumberLong(101),
				"lsid" : {
					"id" : UUID("6bd7549b-0c89-40b5-b59f-af765199bbcf"),
					"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
				},
				"$clusterTime" : {
					"clusterTime" : Timestamp(1616384506, 2),
					"signature" : {
						"hash" : BinData(0,"z/r5Z/DxrxaeH1VIKOzeok06YxY="),
						"keyId" : NumberLong("6942317981145759774")
					}
				},
				"$client" : {
					"application" : {
						"name" : "MongoDB Shell"
					},
					"driver" : {
						"name" : "MongoDB Internal Client",
						"version" : "4.0.19-12"
					},
					"os" : {
						"type" : "Linux",
						"name" : "CentOS Linux release 7.9.2009 (Core)",
						"architecture" : "x86_64",
						"version" : "Kernel 5.10.13-1.el7.elrepo.x86_64"
					},
					"mongos" : {
						"host" : "bm-support01.bm.int.percona.com:54010",
						"client" : "127.0.0.1:36018",
						"version" : "4.0.19-12"
					}
				},
				"$configServerState" : {
					"opTime" : {
						"ts" : Timestamp(1616384506, 2),
						"t" : NumberLong(1)
					}
				},
				"$db" : "admin"
			},
			"originatingCommand" : {
				"aggregate" : 1,
				"pipeline" : [
					{
						"$currentOp" : {
							"allUsers" : true,
							"truncateOps" : true
						}
					},
					{
						"$sort" : {
							"shard" : 1
						}
					}
				],
				"fromMongos" : true,
				"needsMerge" : true,
				"mergeByPBRT" : false,
				"cursor" : {
					"batchSize" : 0
				},
				"allowImplicitCollectionCreation" : true,
				"lsid" : {
					"id" : UUID("6bd7549b-0c89-40b5-b59f-af765199bbcf"),
					"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
				},
				"$clusterTime" : {
					"clusterTime" : Timestamp(1616384506, 2),
					"signature" : {
						"hash" : BinData(0,"z/r5Z/DxrxaeH1VIKOzeok06YxY="),
						"keyId" : NumberLong("6942317981145759774")
					}
				},
				"$client" : {
					"application" : {
						"name" : "MongoDB Shell"
					},
					"driver" : {
						"name" : "MongoDB Internal Client",
						"version" : "4.0.19-12"
					},
					"os" : {
						"type" : "Linux",
						"name" : "CentOS Linux release 7.9.2009 (Core)",
						"architecture" : "x86_64",
						"version" : "Kernel 5.10.13-1.el7.elrepo.x86_64"
					},
					"mongos" : {
						"host" : "bm-support01.bm.int.percona.com:54010",
						"client" : "127.0.0.1:36018",
						"version" : "4.0.19-12"
					}
				},
				"$configServerState" : {
					"opTime" : {
						"ts" : Timestamp(1616384506, 2),
						"t" : NumberLong(1)
					}
				},
				"$db" : "admin"
			},
			"numYields" : 0,
			"locks" : {
				
			},
			"waitingForLock" : false,
			"lockStats" : {
				
			}
		},
		{
			"shard" : "shard01",
			"host" : "bm-support01.bm.int.percona.com:54012",
			"desc" : "monitoring keys for HMAC",
…
...

Some of the important parameters that we may need to focus on from the output are as follows. I provide this information here as we will use these parameters to filter for the operations that we need.

  • host – the host the operation is running on
  • opid – the operation ID (used to kill that operation)
  • active – the connection’s status: true if it is running, false if it is idle
  • client – host/IP information about where the operation originated
  • clientMetadata – more information about the client connection
  • shard – which shard is connected, in a sharded cluster environment
  • appName – information about the type of client
  • currentOpTime – the start time of the operation
  • ns – the namespace (details about the DB and collection)
  • command – a document with the full command object associated with the operation
  • secs_running / microsecs_running – how many seconds/microseconds the particular operation has been running
  • op – the operation type, like insert, update, find, delete, etc.
  • planSummary – whether the command uses an index (IXSCAN) or a collection scan (COLLSCAN, i.e. disk reads)
  • cursor – cursor information for getmore operations
  • locks – the type and mode of the lock (see here for more details)
  • waitingForLock – true if the operation is waiting for a lock, false if it has the required lock
  • msg – a message that describes the status and progress of the operation
  • killPending – whether the operation is currently flagged for termination
  • numYields – a counter reporting the number of times the operation has yielded to allow other operations to proceed

The raw currentOp output can be processed with the JavaScript forEach method in the mongo shell, so we can use it for many operations. For example, to count the documents in the output, i.e. the number of active connections, I can use the following:

mongos> var c=1;
mongos> db.currentOp().inprog.forEach(
... function(doc){
...   c=c+1
... }
... )
mongos> print("The total number of active connections are: "+c)
The total number of active connections are: 16

To find the number of active and inactive connections:

mongos> var active=1; var inactive=1;
mongos> db.currentOp(true).inprog.forEach( function(doc){  if(doc.active){    active=active+1 }  else if(!doc.active){    inactive=inactive+1 }  } )
mongos> print("The number of active connections are: "+active+"\nThe number of inactive connections are: "+inactive)
The number of active connections are: 16
The number of inactive connections are: 118

To find operations (here, an import job) running for more than 1000 microseconds (for seconds, use secs_running) on the specific namespace vinodh.testColl:

mongos> db.currentOp(true).inprog.forEach( function(doc){ if(doc.microsecs_running>1000 && doc.ns == "vinodh.testColl")  {print("\nop: "+doc.op+", namespace: "+doc.ns+", \ncommand: ");printjson(doc.command)} } )

op: insert, namespace: vinodh.testColl, 
command: 
{
  "$truncated" : "{ insert: \"testColl\", bypassDocumentValidation: false, ordered: false, documents: [ { _id: ObjectId('605a1ab05c15f7d2046d5d26'), id: 49004, name: \"Vernon Drake\", age: 19, emails: [ \"fetome@liek.gh\", \"noddo@ve.kh\", \"wunin@cu.ci\" ], born_in: \"1973\", ip_addresses: [ \"212.199.110.72\" ], blob: BinData(0, 4736735553674F6E6825) }, { _id: ObjectId('605a1ab05c15f7d2046d5d27'), id: 49003, name: \"Rhoda Burke\", age: 64, emails: [ \"zog@borvelaj.pa\", \"hoz@ni.do\", \"abfad@borup.cl\" ], born_in: \"1976\", ip_addresses: [ \"12.190.161.2\", \"16.63.87.211\" ], blob: BinData(0, 244C586A683244744F54) }, { _id: ObjectId('605a1ab05c15f7d2046d5d28'), id: 49002, name: \"Alberta Mack\", age: 25, emails: [ \"sibef@nuvaki.sn\", \"erusu@dimpu.ag\", \"miumurup@se.ir\" ], born_in: \"1971\", ip_addresses: [ \"250.239.181.203\", \"192.240.119.122\", \"196.13.33.240\" ], blob: BinData(0, 7A63566B42732659236D) }, { _id: ObjectId('605a1ab05c15f7d2046d5d29'), id: 49005, name: \"Minnie Chapman\", age: 33, emails: [ \"jirgenor@esevepu.edu\", \"jo@m..."
}

This command can also be written directly, without forEach, as follows:

mongos> db.currentOp({ "active": true, "microsecs_running": {$gt: 1000}, "ns": /^vinodh.testColl/ })
{
  "inprog" : [
    {
      "shard" : "shard01",
      "host" : "bm-support01.bm.int.percona.com:54012",
      "desc" : "conn268",
      "connectionId" : 268,
      "client_s" : "127.0.0.1:55480",
      "active" : true,
      "currentOpTime" : "2021-03-23T13:05:32.550-0400",
      "opid" : "shard01:689582",
      "secs_running" : NumberLong(0),
      "microsecs_running" : NumberLong(44996),
      "op" : "insert",
      "ns" : "vinodh.testColl",
      "command" : {
        "$truncated" : "{ insert: \"testColl\", bypassDocumentValidation: false, ordered: false, documents: [ { _id: ObjectId('605a1fdc5c15f7d2047ee04e'), id: 16002, name: \"Linnie Walsh\", age: 25, emails: [ \"evoludecu@logejvi.ai\", \"ilahubfep@ud.mc\", \"siujo@pipazvo.ht\" ], born_in: \"1982\", ip_addresses: [ \"198.117.218.117\" ], blob: BinData(0, 244A6E702A5047405149) }, { _id: ObjectId('605a1fdc5c15f7d2047ee04f'), id: 16004, name: \"Larry Watts\", age: 47, emails: [ \"sa@hulub.gy\", \"wepo@ruvnuhej.om\", \"jorvohki@nobajmo.hr\" ], born_in: \"1989\", ip_addresses: [], blob: BinData(0, 50507461366B6F766C40) }, { _id: ObjectId('605a1fdc5c15f7d2047ee050'), id: 16003, name: \"Alejandro Jacobs\", age: 61, emails: [ \"enijaze@hihen.et\", \"gekesaco@kockod.fk\", \"rohovus@il.az\" ], born_in: \"1988\", ip_addresses: [ \"239.139.123.44\", \"168.34.26.236\", \"123.230.33.251\", \"132.222.43.251\" ], blob: BinData(0, 32213574705938385077) }, { _id: ObjectId('605a1fdc5c15f7d2047ee051'), id: 16005, name: \"Mildred French\", age: 20, emails: [ \"totfi@su.mn\"..."
      },
      "numYields" : 0,
      "locks" : {
        
      },
      "waitingForLock" : false,
      "lockStats" : {
        "Global" : {
          "acquireCount" : {
            "r" : NumberLong(16),
            "w" : NumberLong(16)
          }
        },
        "Database" : {
          "acquireCount" : {
            "w" : NumberLong(16)
          }
…

The operations waiting for the lock on a specific namespace (ns) / operation (op) can be filtered as follows, and you can alter the parameters to filter as you wish:

db.currentOp(
   {
     "waitingForLock" : true,
    "ns": /^vinodh.testColl/,
     $or: [
        { "op" : { "$in" : [ "insert", "update", "remove" ] } },
        { "command.findandmodify": { $exists: true } }
    ]
   }
)

Aggregate – currentOp():

Starting with MongoDB 3.6, currentOp is supported as an aggregation stage ($currentOp), so checking current operations is even easier. Also, the aggregation pipeline is not subject to the 16MB result size limit. The usage is:

{ $currentOp: { allUsers: <boolean>, idleConnections: <boolean>, idleCursors: <boolean>, idleSessions: <boolean>, localOps: <boolean> } }

Note:

Options/Features added, version-wise, to currentOp()

  • allUsers, idleConnections – available from 3.6
  • idleSessions, localOps – available from 4.0
  • idleCursors – available from 4.2

Let’s see an example. Count all connections, including idle connections, on shard02:

mongos> db.aggregate( [ { $currentOp : { allUsers: true, idleConnections: true } },    
... { $match : { shard: "shard02" }}, {$group: {_id:"shard02", count: {$sum: 1}} } ] )
{ "_id" : "shard02", "count" : 65 }

Now, using the same import job, find the operation as follows:

mongos> db.aggregate( [    { $currentOp : { allUsers: true, idleConnections: false } },    
... { $match : { "ns": "vinodh.testColl" }} ] )
{ "shard" : "shard01", "host" : "bm-support01.bm.int.percona.com:54012", "desc" : "conn279", "connectionId" : 279, "client_s" : "127.0.0.1:38564", "active" : true, "currentOpTime" : "2021-03-23T13:33:57.225-0400", "opid" : "shard01:722388", "secs_running" : NumberLong(0), "microsecs_running" : NumberLong(24668), "op" : "insert", "ns" : "vinodh.testColl", "command" : { "insert" : "testColl", "bypassDocumentValidation" : false, "ordered" : false, "documents" : [ { "_id" : ObjectId("605a26855c15f7d20484d217"), "id" : 12020, "name" : "Dora Watson",....tId("000000000000000000000000") ], "writeConcern" : { "getLastError" : 1, "w" : "majority" }, "allowImplicitCollectionCreation" : false, "$clusterTime" : { "clusterTime" : Timestamp(1616520837, 1000), "signature" : { "hash" : BinData(0,"yze8dSs12MUKlnb7rpw5h2YblFI="), "keyId" : NumberLong("6942317981145759774") } }, "$configServerState" : { "opTime" : { "ts" : Timestamp(1616520835, 10), "t" : NumberLong(2) } }, "$db" : "vinodh" }, "numYields" : 0, "locks" : { "Global" : "w", "Database" : "w", "Collection" : "w" }, "waitingForLock" : false, "lockStats" : { "Global" : { "acquireCount" : { "r" : NumberLong(8), "w" : NumberLong(8) } }, "Database" : { "acquireCount" : { "w" : NumberLong(8) } }, "Collection" : { "acquireCount" : { "w" : NumberLong(8) } } } }

To reduce the output and project only some fields in the output:

mongos> db.aggregate( [    
... { $currentOp : { allUsers: true, idleConnections: false } },    
... { $match : { ns: "vinodh.testColl", microsecs_running: {$gt: 10000} }}, 
... {$project: { _id:0, host:1, opid:1, secs_running: 1, op:1, ns:1, waitingForLock: 1, numYields: 1  } } ] )
{ "host" : "bm-support01.bm.int.percona.com:54012", "opid" : "shard01:777387", "secs_running" : NumberLong(0), "op" : "insert", "ns" : "vinodh.testColl", "numYields" : 0, "waitingForLock" : false }

To see the output in a nicer format, use .pretty():

mongos> db.aggregate( [    { $currentOp : { allUsers: true, idleConnections: false } },    { $match : { ns: "vinodh.testColl", microsecs_running: {$gt: 10000} }}, {$project: { _id:0, host:1, opid:1, secs_running: 1, op:1, ns:1, waitingForLock: 1, numYields: 1  } } ] ).pretty()
{
	"host" : "bm-support01.bm.int.percona.com:54012",
	"opid" : "shard01:801285",
	"secs_running" : NumberLong(0),
	"op" : "insert",
	"ns" : "vinodh.testColl",
	"numYields" : 0,
	"waitingForLock" : false
}

I hope you now have some idea of how to use currentOp() to check ongoing operations.

Let’s imagine you want to kill an operation that has been running for a long time. From the same currentOp document you used to identify it, you can take the opid and kill the operation using the killOp() method. In the example below, I used a sharded environment, so the opid is in a “shard_no:opid” format. See here for more details.

mongos> db.aggregate( [    { $currentOp : { allUsers: true, idleConnections: false } },    { $match : { ns: "vinodh.testColl" }}, {$project: { _id:0, host:1, opid:1, microsecs_running: 1, op:1, ns:1, waitingForLock: 1, numYields: 1  } } ] ).pretty()
{
	"host" : "bm-support01.bm.int.percona.com:54012",
	"opid" : "shard01:1355440",
	"microsecs_running" : NumberLong(39200),
	"op" : "insert",
	"ns" : "vinodh.testColl",
	"numYields" : 0,
	"waitingForLock" : false
}


mongos> db.killOp("shard01:1355440")
{
	"shard" : "shard01",
	"shardid" : 1355440,
	"ok" : 1,
	"operationTime" : Timestamp(1616525284, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1616525284, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

Conclusion

So the next time you want to check ongoing operations, you can use these techniques to filter for operations waiting for a lock, running on a particular namespace, running longer than a specified time, of a specific operation type, or on a specific shard. Also, comment here if you have any other ideas on this topic. I am happy to learn about those as well.


Percona Distribution for MongoDB is the only truly open-source solution powerful enough for enterprise applications.

It’s free to use, so try it today!

Mar 30, 2021

HYCU raises $87.5M to take on Rubrik and the rest in multi-cloud data backup and recovery

As more companies become ever more reliant on digital infrastructure for everyday work, the more they become major targets for malicious hackers — both trends accelerated by the pandemic — and that is leading to an ever-greater need for IT and security departments to find ways of protecting data should it become compromised. Today, one of the companies that has emerged as a strong player in data backup and recovery is announcing its first major round of funding.

HYCU, which provides multi-cloud backup and recovery services for mid-market and enterprise customers, has raised $87.5 million, a Series A that the Boston-based startup will be using to invest in building out its platform further, to bring its services into more markets, and to hire 100 more people.

HYCU’s premise and ambition, CEO and founder Simon Taylor said in an interview, is to provide backup and storage services that are as simple to use “as backing up in iCloud for consumers.”

“If you look at primary storage, it’s become very SaaS-ifed, with no professional services required,” he continued. “But backup has stayed very legacy. It’s still mostly focused on one specific environment and can’t perform well when multi-cloud is being used.”

And HYCU’s name fits with that ethos. It is pronounced “haiku”, which Taylor told me refers not just to that Japanese poetic form that looks simple but hides a lot of meaning, but also “hybrid cloud uptime.”

The company is probably known best for its integration with Nutanix, but has over time expanded to serve enterprises building and operating IT and apps over VMware, Google Cloud, Azure and AWS. The company also has built a tool to help migrate data for enterprises, HYCU Protégé, which will also be expanded.

The funding is being led by Bain Capital Ventures, with participation also from Acrew Capital (which was also in the news last week as an investor in the $118 million round for Pie Insurance). The valuation is not being disclosed.

This is the first major outside funding that the company has announced since being founded in 2018, but in that time it has grown into a sizeable competitor against others like Rubrik, Veeam, Veritas and CommVault. The Rubrik comparison is interesting, given that it is also backed by Bain (which led a $261 million round in Rubrik in 2019). HYCU now has more than 2,000 customers in 75 countries. Taylor says that not taking funding while growing into what it has become meant that it was “listening and closer to the needs of our customers,” rather than spending more time paying attention to what investors say.

Now that it’s reached a certain scale, though, things appear to be shifting and there will probably be more money down the line. “This is just round one for us,” Taylor said.

He added that this funding came in the wake of a lot of inbound interest that included not just the usual range of VCs and private equity firms that are getting more involved in VC, but also, it turns out, SPACs, which as they grow in number, seem to be exploring what kinds and stages of companies they tap with their quick finance-and-go-public model.

And although HYCU hadn’t been proactively pitching investors for funding, it would have been on their radars. In fact, Bain is a major backer of Nutanix, putting some $750 million into the company last August. There is some strategic sense in supporting businesses that figure strongly in the infrastructure of your other portfolio companies.

There is another important reason for HYCU raising capital to expand beyond what its balance sheet could provide to fuel growth: HYCU’s would-be competition is itself going through a moment of investment and expansion. For example, Veeam, which was acquired by Insight last January for $5 billion, then proceeded to acquire Kasten to move into serving enterprises that used Kubernetes-native workloads across on-premises and cloud environments. And Rubrik last year acquired Igneous to bring management of unstructured data into its purview. And it’s not a given that, just because this is a sector seeing a lot of demand, it’s all smooth sailing. Igneous was on the rocks at the time of its deal, and Rubrik itself had a data leak in 2019, highlighting that even those who are expert in protecting data can run up against problems.

Taylor notes that ransomware indeed remains a very persistent problem for its customers — reflecting what others in the security world have observed — and its approach for now is to remain focused on how it delivers services in an agent-less environment. “We integrate into the platform,” he said. “That is incredibly important. It means that you can be up and running immediately, with no need for professional services to do the integrating, and we also make it a lot harder for criminals because of this.”

Longer term, it will keep its focus on backup and recovery, with no immediate plans to move into adjacent areas such as more security services or other tools. “We’re not trying to be a Veritas and own the entire business end-to-end,” Taylor said. “The goal is to make sure the IT department has visibility and the cloud journey is protected.”

Enrique Salem, a partner at Bain Capital Ventures and the former CEO of Symantec, is joining HYCU’s board with this round and sees the opportunity in the market for a product like HYCU’s.

“We are in the early days of a multi-decade shift to the public cloud, but existing on-premises backup vendors are poorly equipped to enable this transition, creating tremendous opportunity for a new category of cloud-native backup providers,” he said in a statement. “As one of the early players in multi-cloud backup as a service bringing true SaaS to both on-premises and cloud-native environments, HYCU is a clear leader in a space that will continue to create large multi-billion dollar companies.”

Stefan Cohen, a principal at Bain Capital Ventures, will also be joining the board.
