Jun
24
2020
--

Lightrun raises $4M for its continuous debugging and observability platform

Lightrun, a Tel Aviv-based startup that makes it easier for developers to debug their production code, today announced that it has raised a $4 million seed round led by Glilot Capital Partners, with participation from a number of engineering executives from several Fortune 500 firms.

The company was co-founded by Ilan Peleg (who, in a previous life, was a competitive 800m runner) and Leonid Blouvshtein, with Peleg taking the CEO role and Blouvshtein the CTO position.

The overall idea behind Lightrun is that it’s too hard for developers to debug their production code. “In today’s world, whenever a developer issues a new software version and deploys it into production, the only way to understand the application’s behavior is based on log lines or metrics which were defined during the development stage,” Peleg explained. “The thing is, that is simply not enough. We’ve all encountered cases of missing a very specific log line when trying to troubleshoot production issues, then having to release a new hotfix version in order to add this specific logline, or — alternatively — reproduce the bug locally to better understand the application’s behavior.”

Image Credits: Lightrun

With Lightrun, as the co-founders showed me in a demo, developers can easily add new logs and metrics to their code from their IDE and then receive real-time data from their production or development environments. For that to work, they need to have the Lightrun agent installed, but the overhead here is generally low because the agent sits idle until it is needed. In the IDE, the experience isn’t all that different from setting a traditional breakpoint in a debugger — only that there is no break. Lightrun can also pipe its logging data to existing logging tools like Datadog.

While the service’s agent is agnostic about the environment it runs in, the company currently only supports JVM languages. Blouvshtein noted that building JVM language support was likely harder than building support for other languages and the company plans to launch support for more languages in the future.

“We make a point of investing in technologies that transform big industries,” said Kobi Samboursky, founder and managing partner at Glilot Capital Partners. “Lightrun is spearheading Continuous Debugging and Continuous Observability, picking up where CI/CD ends, turning observability into a real-time process instead of the iterative process it is today. We’re confident that this will become DevOps and development best practices, enabling I&O leaders to react faster to production issues.”

For now, there is still a bit of an onboarding process to get started with Lightrun, though the team tells me it’s generally very short. Over time, the company plans to make this a self-service process, at which point Lightrun will likely also become more interesting to smaller teams and individual developers. At the moment, though, the company is mostly focused on enterprise users, and despite only really launching out of stealth today and offering limited language support, it already has a number of paying customers, including major enterprises.

“Our strategy is based on two approaches: bottom-up and top-down. Bottom-up, we’re targeting developers, they are the end-users and we want to ensure they get a quality product they can trust to help them. We put a lot of effort into reaching out through the developer channels and communities, as well as enabling usage and getting feedback. […] Top-down approach, we are approaching R&D management like VP of R&D, R&D directors in bigger companies and then we show them how Lightrun saves company development resources and improves customer satisfaction.”

Unsurprisingly, the company, which currently has about a dozen employees, plans to use the new funding to add support for more languages and improve its service with new features, including support for tracing.

Mar
31
2020
--

Amid shift to remote work, application performance monitoring is IT’s big moment

In recent weeks, millions have started working from home, putting unheard-of pressure on services like video conferencing, online learning, food delivery and e-commerce platforms. While some verticals have seen a marked reduction in traffic, others are being asked to scale to new heights.

Services that were previously nice to have are now necessities, but how do organizations track pressure points that can add up to a critical failure? There is actually a whole class of software to help in this regard.

Monitoring tools like Datadog, New Relic and Elastic are designed to help companies understand what’s happening inside their key systems and warn them when things may be going sideways. That’s absolutely essential as these services are being asked to handle unprecedented levels of activity.

At a time when performance is critical, application performance monitoring (APM) tools are helping companies stay up and running. They also help track down root causes should the worst happen and a service go down, with the goal of getting it going again as quickly as possible.

We spoke to a few monitoring vendor CEOs to understand better how they are helping customers navigate this demand and keep systems up and running when we need them most.


Jan
16
2020
--

Epsagon scores $16M Series A to monitor modern development environments

Epsagon, an Israeli startup that wants to help monitor modern development environments like serverless and containers, announced a $16 million Series A today.

U.S. Venture Partners (USVP), a new investor, led the round. Previous investors Lightspeed Venture Partners and StageOne Ventures also participated. Today’s investment brings the total raised to $20 million, according to the company.

CEO and co-founder Nitzan Shapira says that the company has been expanding its product offerings in the last year to cover not just its serverless roots, but also provide deeper insights into a number of forms of modern development.

“So we spoke around May when we launched our platform for microservices in the cloud products, and that includes containers, serverless and really any kind of workload to build microservices apps. Since then we have had a few significant announcements,” Shapira told TechCrunch.

For starters, the company announced support for tracing and metrics for Kubernetes workloads, including native Kubernetes, along with managed Kubernetes services like AWS EKS and Google GKE. “A few months ago, we announced our Kubernetes integration. So, if you’re running any Kubernetes workload, you can integrate with Epsagon in one click, and from there you get all the metrics out of the box, then you can set up a tracing in a matter of minutes. So that opens up a very big number of use cases for us,” he said.

The company also announced support for AWS AppSync, a no-code programming tool on the Amazon cloud platform. “We are the only provider today to introduce tracing for AppSync and that’s [an area] where people really struggle with the monitoring and troubleshooting of it,” he said.

The company hopes to use the money from today’s investment to expand the product offering further, with support for Microsoft Azure and Google Cloud Platform in the coming year. Shapira also wants to expand the automation of some tasks that have to be manually configured today.

“Our intention is to make the product as automated as possible, so the user will get an amazing experience in a matter of minutes, including advanced monitoring, identifying different problems and troubleshooting,” he said.

Shapira says the company has around 25 employees today, and plans to double headcount in the next year.

Dec
31
2019
--

InsightFinder gets a $2M seed to automate outage prevention

InsightFinder, a startup from North Carolina based on 15 years of academic research, wants to bring machine learning to system monitoring to automatically identify and fix common issues. Today, the company announced a $2 million seed round.

IDEA Fund Partners, a VC out of Durham, N.C., led the round, with participation from Eight Roads Ventures and Acadia Woods Partners. The company was founded by North Carolina State University professor Helen Gu, who spent 15 years researching this problem before launching the startup in 2015.

Gu also announced that she had brought on former Distil Networks co-founder and CEO Rami Essaid to be chief operating officer. Essaid, who sold his company earlier this year, says his new company focuses on taking a proactive approach to application and infrastructure monitoring.

“We found that these problems happen to be repeatable, and the signals are there. We use artificial intelligence to predict and get out ahead of these issues,” he said. He adds that it’s about using technology to be proactive, and he says that today the software can prevent about half of the issues before they even become problems.

If you’re thinking that this sounds a lot like what Splunk, New Relic and Datadog are doing, you wouldn’t be wrong, but Essaid says that these products take a siloed look at one part of the company technology stack, whereas InsightFinder can act as a layer on top of these solutions to help companies reduce alert noise, track a problem when there are multiple alerts flashing and completely automate issue resolution when possible.

“It’s the only company that can actually take a lot of signals and use them to predict when something’s going to go bad. It doesn’t just help you reduce the alerts and help you find the problem faster, it actually takes all of that data and can crunch it using artificial intelligence to predict and prevent [problems], which nobody else right now is able to do,” Essaid said.

For now, the software is installed on-prem at its current set of customers, but the startup plans to create a SaaS version of the product in 2020 to make it accessible to more customers.

The company launched in 2015, and has been building out the product using a couple of National Science Foundation grants before this investment. Essaid says the product is in use today in 10 large companies (which he can’t name yet), but it doesn’t have any true go-to-market motion. The startup intends to use this investment to begin to develop that in 2020.

Nov
27
2019
--

Running PMM1 and PMM2 Clients on the Same Host

Want to try out Percona Monitoring and Management 2 (PMM 2) but you’re not ready to turn off your PMM 1 environment? This blog is for you! Keep in mind that the methods described here are not intended to be a long-term migration strategy, but rather a way to deploy a few clients in order to sample PMM 2 before you commit to the upgrade.

Here are step-by-step instructions for deploying PMM 1 & 2 client functionality (i.e. pmm-client and pmm2-client) on the same host.

  1. Deploy PMM 1 on Server1 (you’ve probably already done this)
  2. Install and setup pmm-client for connectivity to Server1
  3. Deploy PMM 2 on Server2
  4. Install and setup pmm2-client for connectivity to Server2
  5. Remove pmm-client and switch completely to pmm2-client

The first few steps are already described in our PMM1 documentation so we are simply providing links to those documents.  Here we’ll focus on steps 4 and 5.

Install and Setup pmm2-client Connectivity to Server2

It’s not possible to install both clients from a repository at the same time. So you’ll need to download a tarball of pmm2-client. Here’s a link to the latest version directly from our site.

Download pmm2-client Tarball

* Note that depending on when you’re reading this, the commands below may not be for the latest version, so they may need to be updated for the version you downloaded.

$ wget https://www.percona.com/downloads/pmm2/2.1.0/binary/tarball/pmm2-client-2.1.0.tar.gz

Extract Files From pmm2-client Tarball

$ tar -zxvf pmm2-client-2.1.0.tar.gz 
$ cd pmm2-client-2.1.0

Register and Generate Configuration File

Now it’s time to set up a PMM 2 client. In our example, the PMM2 server IP is 172.17.0.2 and the monitored host IP is 172.17.0.1.

$ ./bin/pmm-agent setup --config-file=config/pmm-agent.yaml \
--paths-node_exporter="$PWD/pmm2-client-2.1.0/bin/node_exporter" \
--paths-mysqld_exporter="$PWD/pmm2-client-2.1.0/bin/mysqld_exporter" \
--paths-mongodb_exporter="$PWD/pmm2-client-2.1.0/bin/mongodb_exporter" \
--paths-postgres_exporter="$PWD/pmm2-client-2.1.0/bin/postgres_exporter" \
--paths-proxysql_exporter="$PWD/pmm2-client-2.1.0/bin/proxysql_exporter" \
--server-insecure-tls --server-address=172.17.0.2:443 \
--server-username=admin  --server-password="admin" 172.17.0.1 generic node8.ca

Start pmm-agent

Let’s run pmm-agent inside a screen session. There’s no service manager integration when deploying alongside pmm-client, so if your server restarts, pmm-agent won’t automatically resume.

# screen -S pmm-agent

$ ./bin/pmm-agent --config-file="$PWD/config/pmm-agent.yaml"
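
The agent now occupies the screen session’s terminal. Detach with CTRL-A then D; you can reattach at any time to check on it:

$ screen -r pmm-agent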

Check the Current State of the Agent

$ ./bin/pmm-admin list
Service type  Service name         Address and port  Service ID

Agent type                  Status     Agent ID                                        Service ID
pmm-agent                   connected  /agent_id/805db700-3607-40a9-a1fa-be61c76fe755  
node_exporter               running    /agent_id/805eb8f6-3514-4c9b-a05e-c5705755a4be

Add MySQL Service

Back outside the screen session, add the MySQL service:

$ ./bin/pmm-admin add mysql --use-perfschema --username=root mysqltest
MySQL Service added.
Service ID  : /service_id/28c4a4cd-7f4a-4abd-a999-86528e38992b
Service name: mysqltest

Here is the state of pmm-agent:

$ ./bin/pmm-admin list
Service type  Service name         Address and port  Service ID
MySQL         mysqltest            127.0.0.1:3306    /service_id/28c4a4cd-7f4a-4abd-a999-86528e38992b

Agent type                  Status     Agent ID                                        Service ID
pmm-agent                   connected  /agent_id/805db700-3607-40a9-a1fa-be61c76fe755   
node_exporter               running    /agent_id/805eb8f6-3514-4c9b-a05e-c5705755a4be   
mysqld_exporter             running    /agent_id/efb01d86-58a3-401e-ae65-fa8417f9feb2  /service_id/28c4a4cd-7f4a-4abd-a999-86528e38992b
qan-mysql-perfschema-agent  running    /agent_id/26836ca9-0fc7-4991-af23-730e6d282d8d  /service_id/28c4a4cd-7f4a-4abd-a999-86528e38992b

Confirm you can see activity in each of the two PMM Servers:

[Screenshots: activity visible in both the PMM 1 and PMM 2 servers]

Remove pmm-client and Switch Completely to pmm2-client

Once you’ve decided to move over completely to PMM 2, it’s better to switch from the tarball version to an installation from the repository. That makes client updates much easier and registers the new agent as a service that starts automatically with the server. We will also show you how to make the switch without re-adding monitored instances.

Configure Percona Repositories

$ sudo yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm 
$ sudo percona-release disable all 
$ sudo percona-release enable original release 
$ yum list | grep pmm 
pmm-client.x86_64                    1.17.2-1.el6                  percona-release-x86_64
pmm2-client.x86_64                   2.1.0-1.el6                   percona-release-x86_64

For apt-based systems, the steps are analogous.
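
A rough sketch for Debian/Ubuntu follows (check Percona’s apt documentation for the current package URL and repository names):

$ wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
$ sudo dpkg -i percona-release_latest.generic_all.deb
$ sudo percona-release disable all
$ sudo percona-release enable original release
$ sudo apt-get update
$ apt-cache search pmm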

Remove pmm-client

$ yum remove pmm-client

Install pmm2-client

$ yum install pmm2-client
Loaded plugins: priorities, update-motd, upgrade-helper
4 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package pmm2-client.x86_64 0:2.1.0-5.el6 will be installed
...
Installed:
  pmm2-client.x86_64 0:2.1.0-5.el6                                                                                                                                                           

Complete!

Configure pmm2-client

Let’s copy the pmm2-client configuration file currently in use, so that we can avoid re-adding the monitored instances.

$ cp pmm2-client-2.1.0/config/pmm-agent.yaml /tmp

You need to set the new location of the exporters (/usr/local/percona/pmm2/exporters/) in the file:

$ sed -i 's|node_exporter:.*|node_exporter: /usr/local/percona/pmm2/exporters/node_exporter|g' /tmp/pmm-agent.yaml
$ sed -i 's|mysqld_exporter:.*|mysqld_exporter: /usr/local/percona/pmm2/exporters/mysqld_exporter|g' /tmp/pmm-agent.yaml
$ sed -i 's|mongodb_exporter:.*|mongodb_exporter: /usr/local/percona/pmm2/exporters/mongodb_exporter|g' /tmp/pmm-agent.yaml 
$ sed -i 's|postgres_exporter:.*|postgres_exporter: /usr/local/percona/pmm2/exporters/postgres_exporter|g' /tmp/pmm-agent.yaml
$ sed -i 's|proxysql_exporter:.*|proxysql_exporter: /usr/local/percona/pmm2/exporters/proxysql_exporter|g' /tmp/pmm-agent.yaml

Replace the default configuration file with our modified one, and restart the pmm-agent service:

$ cp /tmp/pmm-agent.yaml /usr/local/percona/pmm2/config/
$ systemctl restart pmm-agent

Check Monitored Services

Now we can verify the current state of the monitored instances:

$ pmm-admin list

It can also be checked on the PMM server side.

Nov
22
2019
--

Tips for Designing Grafana Dashboards


As Grafana powers our star product – Percona Monitoring and Management (PMM) – we have developed a lot of experience creating Grafana dashboards over the last few years. In this article, I will share some of the considerations for designing Grafana dashboards. As usual with questions of design, they are quite subjective; I do not expect you to choose to apply all of them to your dashboards, but I hope they will help you think through your dashboard design better.

Design Practical Dashboards

Grafana features many panel types, and even more are available as plugins. It may be very attractive to use many of them in your dashboards, with many different visualization options. Do not! Stick to a few data visualization patterns and only add additional visualizations when they provide additional practical value, not because they are cool. The Graph and Singlestat panel types probably cover 80% of use cases.

Do Not Place Too Many Graphs Side by Side

This will probably depend a lot on how your dashboards are used. If your dashboard is designed for large screens placed on the wall, you may be able to fit more graphs side by side; if it needs to scale down to a low-resolution laptop screen, I would suggest sticking to 2-3 graphs in a row.

Use Proper Units

Grafana allows you to specify a unit for the data type displayed. Use it! Without a unit set, values will not be properly shortened and will be very hard to read:

[Screenshots: the same panel without a unit set, compared with proper units applied]

Mind Decimals

You can specify the number of digits displayed after the decimal point, or leave it at the default. I found the default choice does not always work very well, for example here:

[Screenshot: a panel axis with too many decimal places]

For some reason, the panel axis here shows far too many digits after the decimal point. Grafana also often picks three decimal places, as in the table below, which I find inconvenient: at a glance, it is hard to tell whether we’re dealing with a decimal point or with “,” as a thousands separator, so I may be looking at 2462 GiB there. While a 1000x difference is not feasible in this case, there are cases, such as data rates, where it is quite possible. Instead, I prefer explicitly setting one decimal place (or zero if that is enough), which makes it clear that we’re not looking at thousands.

Label your Axis

You can label your axes, which especially makes sense if the presentation is not as obvious as in this example, where we’re using negative values to plot writes to a swap file.

[Screenshot: a labeled axis, with swap writes plotted as negative values]

Use Shared Crosshair or Tooltip

In Dashboard Settings, you will find the “Graph Tooltip” option, which can be set to “Default”, “Shared Crosshair”, or “Shared Tooltip”. This is how these will look:

[Screenshots: the Default, Shared Crosshair, and Shared Tooltip modes]

Shared Crosshair shows the line matching the same time on all panels, while Shared Tooltip shows the tooltip value on all panels at the same time. You can pick what makes sense for you; my favorite is the tooltip setting, because it allows me to visually compare the same point in time without making the dashboard too slow and busy.

Note there is a handy shortcut, CTRL-O, which allows you to cycle between these settings for any dashboard.

Pick Colors

If you’re displaying truly dynamic information you will likely have to rely on Grafana’s automatic color assignment, but if not, you can pick specific colors for all values being plotted.  This will prevent colors from potentially being re-assigned to different values without you planning to do so.

[Screenshot: assigning fixed colors to series]

When picking colors, you also want to make sure they make logical sense. For example, I think for free memory “green” is a better color than “red”. As you pick the colors, use the same colors for the same type of information when you show it on different panels, if possible, because it makes the dashboards easier to understand.

I would even suggest sticking to the same (or similar) colors for the same kind of data; if you have many panels which show disk input and output, using a consistent pair of colors across them is a good idea.

Fill Stacking Graphs

Grafana does not require it, but I would suggest you use fill when you display stacked data, and no fill when you’re plotting multiple independent values. Take a look at these graphs:

In the first graph, I need to look at the actual value of each plotted line to understand what I’m looking at. In the second graph, that line value is meaningless; what is valuable is the filled amount, and I can see at a glance that the Cache (the blue area) has shrunk.

I prefer using a fill factor of 6+ so it is easier to match the fill colors with the colors in the legend table. For the same reason, I prefer not to use the fill gradient on such graphs, as it makes it much harder to see the color and the filled volume.

Do Not Abuse Double Axis

Graphs that use a double axis are much harder to understand. I used to use them very often, but now I avoid them when possible, resorting to a double axis only when I absolutely want to limit the number of panels.

Note that in this case I think the gradient fits OK, because there is only one value displayed as a line, so you can’t get confused about whether to look at the total value or the filled volume.

Separate Data of Different Scales on Different Graphs

I used to plot InnoDB rows read and rows written on the same graph. It is quite common for reads to be 100x higher in volume than writes, crowding them out and making even significant changes in writes very hard to see. Splitting them into separate graphs solved this issue.

Consider Staircase Graphs

In monitoring applications, we often display average rates computed over a period of time. If this is the case, we do not know how the rate was changing within that period, and it would be misleading to show that. This especially matters if you’re displaying only a few data points.

Let’s look at this graph which is being viewed with one-hour resolution:

This visually suggests that the number of rows read was falling from 16:00 to 18:00. Compare it to the staircase graph:

It simply shows us that the value for 18:00 was higher than for 17:00, but does not make any claim about the change in between.

This display, however, has another issue. Let’s look at the same data set with 5min resolution:

We can see the average value from 16:00 to 17:00 was lower than from 17:00 to 18:00; this is NOT what the lower-resolution staircase graph shows, however, where the value plotted from 17:00 to 18:00 is actually the lower one!

The reason is that when Prometheus computes rate() over one hour at 17:00, it returns a data point at 17:00 whose average really covers 16:00 to 17:00, while the staircase graph plots that value from 17:00 to 18:00, until a new value is available. It is off by one hour.

To fix it, you need to shift the data appropriately. In Prometheus, which we use in PMM, the offset modifier can be used to shift the data so that it is displayed correctly.
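
As a rough sketch of the idea (assuming the metric name, a one-hour interval, and a Prometheus version that supports negative offsets), you can test the shifted expression against the Prometheus HTTP API before putting the rate(...) expression into the panel’s query field:

# Shift each one-hour average back onto the hour it actually covers:
$ curl -sG 'http://localhost:9090/api/v1/query_range' \
    --data-urlencode 'query=rate(mysql_global_status_innodb_rows_read[1h] offset -1h)' \
    --data-urlencode 'start=2019-11-22T12:00:00Z' \
    --data-urlencode 'end=2019-11-22T18:00:00Z' \
    --data-urlencode 'step=1h'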

Provide Multiple Resolutions

I’m a big fan of being able to see the data on the same dashboard with different resolutions, which can be done through a special dashboard variable of type “Interval”.  High-resolution data can provide a great level of detail but can be very volatile.

While lower resolution can hide this level of detail, it does show trends better.

Multiple Aggregates for the Same Metrics

To get even more insights, you can consider plotting the same metrics with different aggregates applied to it:

In this case, we are looking at the same variable – threads_running – but at its average value over a period of time versus max (peak) value. Both of them are meaningful in a different way.

You can also notice here that points are used for the max value instead of a line. This is in general good practice for highly volatile data, as a plotted line for something that changes wildly is messy and does not provide much value.

Use Help and Panel Links

If you fill out a description for the panel, it will be visible when you hover over the tiny “i” sign. This is very helpful for explaining what the panel shows and how to use the data. You can use Markdown for formatting. You can also provide one or more panel links, which you can use for additional help or drill-down.

With newer Grafana versions, you can even define a more advanced drill-down, which can contain different URLs based on the series you are looking at, as well as other templating variables:

Summary

This list of considerations for designing Grafana Dashboards and best practices is by no means complete, but I hope you pick up an idea or two which will allow you to create better dashboards!

Nov
05
2019
--

Chronosphere launches with $11M Series A to build scalable, cloud-native monitoring tool

Chronosphere, a startup from two ex-Uber engineers who helped create the open-source M3 monitoring project to handle Uber-level scale, officially launched today with the goal of building a commercial company on top of the open-source project.

It also announced an $11 million investment led by Greylock, with participation from venture capitalist Lee Fixel.

While the founders, CEO Martin Mao and CTO Rob Skillington, were working at Uber, they recognized a gap in the monitoring industry, particularly around cloud-native technologies like containers and microservices. There weren’t any tools available on the market that could handle Uber’s scaling requirements — so like any good engineers, they went out and built their own.

“We looked around at the market at the time and couldn’t find anything in open source or commercially available that could really scale to our needs. So we ended up building and open sourcing our solution, which is M3. Over the last three to four years we’ve scaled M3 to one of the largest production monitoring systems in the world today,” Mao explained.

The essential difference between M3 and other open-source, cloud-native monitoring solutions like Prometheus is the ability to scale, he says.

One of the main reasons they left to start a company, with the blessing of Uber, was that the community began asking for features that didn’t really make sense for Uber. By launching Chronosphere, Mao and Skillington would be taking on the management of the project moving forward (although sharing governance for the time being with Uber), while building those enterprise features the community has been requesting.

The new company’s first product will be a cloud version of M3 to help reduce some of the complexity associated with managing an M3 project. “M3 itself is a fairly complex piece of technology to run. It is solving a fairly complex problem at large scale, and running it actually requires a decent amount of investment to run at large scale, so the first thing we’re doing is taking care of that management,” Mao said.

Jerry Chen, who led the investment at Greylock, saw a company solving a big problem. “They were providing such a high-resolution view of what’s going on in your cloud infrastructure and doing that at scale at a cost that actually makes sense. They solved that problem at Uber, and I saw them, and I was like wow, the rest of the market needs what guys built and I wrote the Series A check. It was as simple as that,” Chen told TechCrunch.

The cloud product is currently in private beta; they expect to open to public beta early next year.

Nov
01
2019
--

New Relic snags early-stage serverless monitoring startup IOpipe

As we move from a world dominated by virtual machines to one of serverless, it changes the nature of monitoring, and vendors like New Relic certainly recognize that. This morning the company announced it was acquiring IOpipe, a Seattle-based early-stage serverless monitoring startup, to help beef up its serverless monitoring chops. Terms of the deal weren’t disclosed.

New Relic gets what it calls “key members of the team,” which at least includes co-founders Erica Windisch and Adam Johnson, along with the IOpipe technology. The new employees will be moving from Seattle to New Relic’s Portland offices.

“This deal allows us to make immediate investments in onboarding that will make it faster and simpler for customers to integrate their [serverless] functions with New Relic and get the most out of our instrumentation and UIs that allow fast troubleshooting of complex issues across the entire application stack,” the company wrote in a blog post announcing the acquisition.

It adds that initially the IOpipe team will concentrate on bringing AWS Lambda features like Lambda Layers into the New Relic platform. Over time, the team will work on increasing support for serverless function monitoring. By combining the IOpipe team and technology with its own, New Relic hopes to accelerate its serverless monitoring capabilities.

Eliot Durbin, an investor at Bold Start, which led the company’s $2 million seed round in 2018, says both companies win with this deal. “New Relic has a huge commitment to serverless, so the opportunity to bring IOpipe’s product to their market-leading customer base was attractive to everyone involved,” he told TechCrunch.

The startup has been helping monitor serverless operations for companies running AWS Lambda. It’s important to understand that serverless doesn’t mean there are no servers, but the cloud vendor — in this case AWS — provides the exact resources to complete an operation, and nothing more.

IOpipe co-founders Erica Windisch and Adam Johnson

Photo: New Relic

Once the operation ends, the resources can simply get redeployed elsewhere. That makes building monitoring tools for such ephemeral resources a huge challenge. New Relic has also been working on the problem and released New Relic Serverless for AWS Lambda earlier this year.

As TechCrunch’s Frederic Lardinois pointed out in his article about the company’s $2.5 million seed round in 2017, Windisch and Johnson bring impressive credentials:

IOpipe co-founders Adam Johnson (CEO) and Erica Windisch (CTO), too, are highly experienced in this space, having previously worked at companies like Docker and Midokura (Adam was the first hire at Midokura and Erica founded Docker’s security team). They recently graduated from the Techstars NY program.

IOpipe was founded in 2015, which was just around the time that Amazon was announcing Lambda. At the time of the seed round the company had eight employees. According to PitchBook data, it currently has between 1 and 10 employees, and has raised $7.07 million since its inception.

New Relic was founded in 2008 and has raised more than $214 million, according to Crunchbase, before going public in 2014. Its stock price was $65.42 at the time of publication, up $1.40.

Oct
23
2019
--

MySQL Workbench Review

MySQL Workbench is a great multi-purpose GUI tool for MySQL, which I think is not marketed enough by the MySQL team and not appreciated enough by the community for what it can do.

MySQL Workbench Licensing

MySQL Workbench, like MySQL Server, is an open-core product. There is a Community Edition with GPL-licensed source code on GitHub, as well as the proprietary “MySQL Workbench Standard Edition (SE)” and “MySQL Workbench Enterprise Edition (EE)”. The differences between the editions can be found in this document.

In this MySQL Workbench review, I focus on the MySQL Workbench Community Edition, often referred to as MySQL Workbench CE.

Downloading MySQL Workbench

You can download the current version of MySQL Workbench here.

Installing MySQL Workbench

Installation, of course, will be OS-dependent. I installed MySQL Workbench CE on Windows and it was quite uneventful.


Starting MySQL Workbench for the first time

If you go through the default MySQL Workbench install process, it will be started upon install completion. As it starts, it checks for MySQL Servers running locally; because I do not have anything running locally, it does not detect any servers.


You need to click on the little “+” sign near the “MySQL Connection” text to add a connection. I think a clearer “Add Connection” link next to “Rescan Servers” would be more helpful.

Connection options are great. Besides support for TCP/IP and local socket/pipe, MySQL Workbench also supports TCP/IP over SSH, which is fantastic if you want to connect to servers that are reachable via SSH but do not have the MySQL port open.

When you have the connection created, you can open the main screen of MySQL Workbench which looks like this:

[Screenshot: MySQL Workbench main screen]

You can see there is a lot of stuff there!  Let’s look at some specific features.

MySQL Workbench Management Features


Server Status shows information about the running MySQL Server. Of course, being an Oracle product, it is not designed to detect alternative implementations; in this case, Percona Server has a Thread Pool feature, but it shows as N/A.

Server Performance information graphs are updated in real-time and provide some idea about server load.

Client Connections shows the current connections to MySQL Server. This view has some nice features: you can hide sleeping connections to look at running queries only, set the view to refresh automatically, and kill queries and connections. You can also run EXPLAIN for Connection to see a query’s execution plan.


How EXPLAIN for Connection works is a bit complicated. When you click on EXPLAIN for Connection, the notebook containing the query opens up, and I would expect to see the explain output at this point:

[Screenshot: the notebook opened by EXPLAIN for Connection]

You then need to click on the EXPLAIN icon to see the query explain output:

[Screenshot: query explain output]

Note you can get both EXPLAIN for the given query and EXPLAIN FOR CONNECTION, which can differ, especially in the case of this particular query, where the execution plan was abnormal.

Multiple displays are provided for EXPLAIN, including tabular explain and raw JSON explain, if you want to see those, although Visual Explain is a unique MySQL Workbench feature.
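
The plans Workbench visualizes can also be fetched by hand with the stock mysql command-line client; a minimal sketch, where 42 stands in for a connection ID you would read off the process list (EXPLAIN FOR CONNECTION is available in MySQL 5.7 and later):

$ mysql -e "SHOW PROCESSLIST"                        # find the connection ID
$ mysql -e "EXPLAIN FORMAT=JSON FOR CONNECTION 42\G" # the plan that connection is actually using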

I also like that MySQL Workbench provides additional details on a connection, such as held locks and connection attributes, which can often help you find which application instance a query comes from.

Users and Privileges

This MySQL Workbench functionality allows you to view and manage your users:

[Screenshot: Users and Privileges view]

It is not very advanced, but it covers the basic needs of understanding user privileges. It has built-in support for Administrative Roles, but there does not seem to be support for generic roles or for some newer features such as locking accounts or requiring a password change after a certain period of time.

Status and System Variables

The Status and System Variables section in MySQL Workbench shows the formatted output of “SHOW GLOBAL STATUS” and “SHOW VARIABLES”:

[Screenshot: Status and System Variables view]

I like the fact that the massive number of settings and variables are grouped into different categories and that some help is provided. However, all values are shown only as raw numbers, without any formatting and not normalized per second where appropriate, which makes the information hard to work with.

Data Export and Data Import/Restore

As you may expect, these provide the functionality to export and import schema and possibly data.  This basically provides GUI for mysqldump, which is a great help for more advanced use cases.


Instance Management

This is interesting: even though I set up the connection using SSH, MySQL Workbench does not automatically use it for host access. It needs to be configured separately, by clicking the little wrench icon.


If you’re using Linux for Remote Management, you will need to provide quite a lot of details about the Linux version, packaging type, and even init scripts you use, which can easily be overwhelming.

I do wonder why there is no auto-detection of the system type implemented here.

If you configure Remote Management in MySQL Workbench properly, you should, in theory, be able to start and stop the server, look at server logs, and view the options file. It did not work well in my case.


Performance – Dashboard

The MySQL Workbench Performance Dashboard section shows a selection of performance graphs. It is not very deep, and it only shows stats collected while MySQL Workbench is running, but it covers some good basics.


Performance – Reports

The Performance Reports section in MySQL Workbench is pretty cool; it shows a lot of reports based on MySQL’s sys schema. 


This is pretty handy, but I think it would benefit from better formatting (so I do not have to count digits to see how much memory is used); also, numbers accumulated since the instance started often make little sense.

Performance Schema Setup

This is one of the hidden gems in MySQL Workbench. Performance Schema configuration in MySQL can be rather complicated if you’re not familiar with it, and MySQL Workbench makes it a lot easier. The default Performance Schema controls are very basic.


However, if you enable the “Show Advanced” settings, it will give you this fantastic overview of Performance Schema: 

It also allows you to modify the configuration in detail:


Until this point, we have been operating in the Administration view. If you want to work with the database schema, switch MySQL Workbench to the Schema view.

Schema View

This view allows you to work with tables and other database schema objects.  The contextual menu provides different functions for different objects.


MySQL Workbench Query Editor

Finally, let’s take a look at the MySQL Workbench Query Editor. It has a lot of advanced features.   

First, I like that it is a multi-tab editor, so you can have multiple queries open at once and switch between them easily. It also supports helpful snippets, both a large library of built-in ones and ones created by the user, as well as contextual help, which can be quite useful for beginners.

I like the fact MySQL Workbench adds LIMIT 1000 by default to queries it runs, and it also allows you to easily and conveniently edit the stored data.

Examine Field Types:

View Query execution statistics:


Though in this case, it seems to show only information derived from SHOW SESSION STATUS, and not the more advanced details available in Performance Schema.

Visual Explain is quite a gem of MySQL Workbench too, but we covered it already.

Summary

In general, I’m quite impressed with the functionality offered by MySQL Workbench CE (Community Edition). If you’re looking for a simple, free GUI for MySQL to run queries and help with basic administration, you need look no further. If you have more advanced needs, particularly in the monitoring or management space, you should look elsewhere. Oracle has MySQL Enterprise Monitor for this purpose, a fully commercial product that comes with a MySQL Enterprise subscription. If you are looking for an open source database-monitoring-focused product, consider Percona Monitoring and Management.

Oct
22
2019
--

PMM Server + podman: Running a Container Without root Privileges

In this article, we will take a look at how to run Percona Monitoring and Management (PMM) Server in a container without root privileges.

Some of the concerns companies have about using Docker relate to the security risks that come from requiring root privileges to run the service, and therefore the containers. If processes inside the container are running as root, then they are also running as root outside of the container, which means you should take measures to mitigate privilege escalation issues that could occur from container breakout. Recently, Docker added experimental support for running in rootless mode, which requires a number of extra changes.

From a personal perspective, it seems that Docker tries to do it “all in one”, which has its issues. While I was looking to overcome some gripes with Docker, reduce requirements, and also run containers as a chosen user regardless of the process users inside the container, podman and buildah caught my attention, especially as podman also supports Kubernetes. To quote the docs:

Podman is a daemonless container engine for developing, managing, and running OCI Containers on your Linux System. Containers can either be run as root or in rootless mode.

Configuring user namespaces

In order to use containers without root privileges, some initial configuration is likely to be needed to set up the namespace mappings, as well as a suitable limit. Using the shadow-utils package on CentOS, or the uidmap package on Debian, it is possible to assign subordinate user (man subuid) and group (man subgid) IDs that allow IDs inside a container to map to a dedicated set of IDs outside the container. This resolves the issue where a process runs as root both inside and outside of a container.

The following is an Ansible task snippet that, on a dedicated CentOS machine, makes the necessary changes, although you may need/want to adjust this if already using namespaces:

- name: install packages
  yum:
    name:
    - runc
    - podman
    - podman-docker
    - buildah
    - shadow-utils
    - slirp4netns
    update_cache: yes
    state: latest
  become: yes
  tags:
  - packages

- name: set user namespace
  copy:
    content: '{{ ansible_user }}:100000:65536'
    dest: '/etc/{{ item }}'
  become: yes
  loop:
  - subuid
  - subgid
  register: set_user_namespace_map
  tags:
  - namespace

- name: set sysctl user.max_user_namespaces
  sysctl:
    name: user.max_user_namespaces
    value: 100000
    state: present
    sysctl_file: /etc/sysctl.d/01-namespaces.conf
    sysctl_set: yes
    reload: yes
  become: yes
  register: set_user_namespace_sysctl
  tags:
  - namespace

- name: reboot to apply
  reboot:
    post_reboot_delay: 60
  become: yes
  when: set_user_namespace_map.changed or set_user_namespace_sysctl.changed
  tags:
  - namespace

This set of tasks will:

  1. Install some packages that are either required or useful when using podman.
  2. Configure the two files used to map IDs (/etc/subuid and /etc/subgid); we allow the user to have 65536 user IDs mapped, starting the mapping from 100000, and likewise for group IDs (see the quick check after this list).
  3. Set user.max_user_namespaces to ensure that you can allocate sufficient IDs, making the setting persistent across reboots. If you needed to configure the namespaces, then it is likely that you need to reboot to make sure that the changes are active.
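
If all went well, a quick check confirms the mappings are in place (a sketch; percona is the example user in this post, and the expected values follow from the Ansible task above):

$ grep percona /etc/subuid /etc/subgid
/etc/subuid:percona:100000:65536
/etc/subgid:percona:100000:65536

# podman unshare runs a command inside the user namespace; uid_map shows
# how IDs inside the namespace map onto host IDs:
$ podman unshare cat /proc/self/uid_map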

Look Ma, PMM — and no root!

OK, so if all is well, then the system is configured to support user namespaces and we can run a container without needing to be root (or a member of a special group). The commands for podman are identical, or very close, to those that you would use for Docker, and you can even set alias docker=podman if you wish! Here is how you could test PMM Server (v2) out, mapping port 8443 to the NGINX port inside the container:

$ whoami
percona

$ podman run -d --name pmm2-test -p 8443:443 docker.io/percona/pmm-server:2

In the previous command, the path to the registry is explicitly stated as a Docker one, but if you were to simply specify percona/pmm-server:2, then by default a number of registries are checked and the first match wins. The list and order of the registries differ per distro (you can inspect your own configuration, as shown after the list below), e.g.

  • Ubuntu: registries = ['quay.io', 'docker.io']
  • RedHat: registries = ['registry.redhat.io', 'quay.io', 'docker.io']
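
A sketch of how to check the search order on your own system (on these releases the list typically lives in /etc/containers/registries.conf):

$ grep -A 1 '\[registries.search\]' /etc/containers/registries.conf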

Checking the processes

The container for PMM server is now running as the user that executed the podman command, so we can take a look and see what the processes look like. First, let’s take a peek inside the container:

$ podman exec pmm2-test bash -c "ps -e -o pid,ppid,pgrp,user,comm --forest"
  PID  PPID  PGRP USER     COMMAND
  862     0   862 root     ps
    1     0     1 root     supervisord
    9     1     9 postgres postgres
   88     9    88 postgres  \_ postgres
   98     9    98 postgres  \_ postgres
  100     9   100 postgres  \_ postgres
  101     9   101 postgres  \_ postgres
  102     9   102 postgres  \_ postgres
  103     9   103 postgres  \_ postgres
  104     9   104 postgres  \_ postgres
  154     9   154 postgres  \_ postgres
  173     9   173 postgres  \_ postgres
  183     9   183 postgres  \_ postgres
   10     1    10 root     clickhouse-serv
   11     1    11 grafana  grafana-server
   12     1    12 root     nginx
   23    12    12 nginx     \_ nginx
   13     1    13 root     crond
   14     1    14 pmm      prometheus
   17     1    17 root     pmm-managed
   51    17    17 root      \_ supervisorctl
   18     1    18 root     pmm-agent
  172    18    18 root      \_ node_exporter
  174    18    18 root      \_ postgres_export
  153     1   153 pmm      percona-qan-api

We can see all of the processes that run as part of PMM, plus the ps command that was executed to look at the processlist.

Now, let’s take a look outside the container to see what these processes show up as:

$ pgrep -f 'postgres|supervisord|clickhouse-serv|pmm-managed' | xargs ps -o pid,ppid,pgrp,user,comm --forest
  PID  PPID  PGRP USER     COMMAND
32020 32003 32003 percona  node_exporter
23532 23358 23358 percona  postgres_export
23530 23358 23358 percona  node_exporter
23332 23321 23332 percona  supervisord
23349 23332 23349 100025    \_ postgres
23440 23349 23440 100025    |   \_ postgres
23456 23349 23456 100025    |   \_ postgres
23458 23349 23458 100025    |   \_ postgres
23459 23349 23459 100025    |   \_ postgres
23460 23349 23460 100025    |   \_ postgres
23461 23349 23461 100025    |   \_ postgres
23462 23349 23462 100025    |   \_ postgres
23512 23349 23512 100025    |   \_ postgres
23531 23349 23531 100025    |   \_ postgres
23541 23349 23541 100025    |   \_ postgres
23350 23332 23350 percona   \_ clickhouse-serv
23357 23332 23357 percona   \_ pmm-managed

Great! As we can see from the process list, none of the processes we checked is running as root: the ones running as root inside the container are running as percona outside, and the remainder use the high-numbered user IDs generated by the subordinate mapping.
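
podman can also report this mapping per process directly, which saves cross-referencing the two listings above (a sketch; user is the in-container user, huser the host-side user):

$ podman top pmm2-test user huser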

Using persistent volumes for your data

The documented way to use PMM with persistent volumes does not work in quite the same manner with podman. However, it is easily fixed as follows (plus we get to try out buildah):

$ cat <<EOS > PMM2.Dockerfile
FROM docker.io/percona/pmm-server:2

VOLUME ["/srv"]
EOS

$ buildah bud -f PMM2.Dockerfile --tag localhost/pmm-server:custom .
$ podman create --name pmm-data -v /srv localhost/pmm-server:custom 
$ podman run -d --name pmm-server -v pmm-data:/srv -p 8443:443 localhost/pmm-server:custom

Summary

It is easy to get PMM Server running in a container with podman and buildah, while also improving security by restricting the privileges of the user the processes run under.

And while you may not be able to replace Docker usage in certain circumstances, taking a look at podman and buildah is definitely recommended… and you can still use the Docker registry if that is the only location for the image that you want to run.

If you haven’t tried PMM2 yet then you can easily test it out or take a look at the online demo.
