Sep
30
2018
--

Talking Drupal #178 – Oomph Paragraphs

In episode #178 we talk with J Hogue from Oomph about their contributed module, Oomph Paragraphs. www.talkingdrupal.com/178

Topics

  • What is Oomph Paragraphs?
  • What is the origin story for this module?
  • Who would use Oomph Paragraphs?
  • Why contribute?
  • Why name it Oomph Paragraphs?

Resources

https://www.drupal.org/project/oomph_paragraphs

Hosts

Stephen Cross – www.ParallaxInfoTech.com @stephencross

John Picozzi – www.oomphinc.com @johnpicozzi

Nic Laflin – www.nLighteneddevelopment.com @nicxvan

J Hogue – www.oomphinc.com @artinruins

Sep
29
2018
--

What each cloud company could bring to the Pentagon’s $10B JEDI cloud contract

The Pentagon is going to make one cloud vendor exceedingly happy when it chooses the winner of the $10 billion, ten-year enterprise cloud project dubbed the Joint Enterprise Defense Infrastructure (or JEDI for short). The contract is designed to establish the cloud technology strategy for the military over the next 10 years as it begins to take advantage of current trends like the Internet of Things, artificial intelligence and big data.

Ten billion dollars spread out over ten years may not entirely alter a market that’s expected to reach $100 billion a year very soon, but it is substantial enough to give a lesser vendor much greater visibility, and possibly a deeper entree into other government and private sector business. The cloud companies certainly recognize that.

Photo: Glowimages/Getty Images

That could explain why they are tripping over themselves to change the contract dynamics, insisting, maybe rightly, that a multi-vendor approach would make more sense.

One look at the Request for Proposal (RFP) itself, which has dozens of documents outlining various criteria from security to training to the specification of the single award itself, shows the sheer complexity of this proposal. At the heart of it is a package of classified and unclassified infrastructure, platform and support services with other components around portability. Each of the main cloud vendors we’ll explore here offers these services. They are not unusual in themselves, but they do each bring a different set of skills and experiences to bear on a project like this.

It’s worth noting that the DOD isn’t just interested in technical chops; it is also looking closely at pricing and has explicitly asked for specific discounts that would be applied to each component. The RFP process closes on October 12th and the winner is expected to be chosen next April.

Amazon

What can you say about Amazon? They are by far the dominant cloud infrastructure vendor. They have the advantage of having scored a large government contract in the past when they built the CIA’s private cloud in 2013, earning $600 million for their troubles. Amazon also offers GovCloud, the product that came out of that project, designed to host sensitive data.

Jeff Bezos, Chairman and founder of Amazon.com. Photo: Drew Angerer/Getty Images

Many of the other vendors worry that this history gives Amazon a leg up on this deal. While five years is a long time, especially in technology terms, if anything, Amazon has tightened its control of the market. Heck, most of the other players were just beginning to establish their cloud businesses in 2013. Amazon, which launched in 2006, has a maturity the others lack, and they are still innovating, introducing dozens of new features every year. That makes them difficult to compete with, but even the biggest player can be taken down with the right game plan.

Microsoft

If anyone can take Amazon on, it’s Microsoft. While they were somewhat late to the cloud, they have more than made up for it over the last several years. They are growing fast, yet are still far behind Amazon in terms of pure market share. Still, they have a lot to offer the Pentagon, including a combination of Azure, their cloud platform, and Office 365, the popular business suite that includes Word, PowerPoint, Excel and Outlook email. What’s more, they have a fat contract with the DOD for $900 million, signed in 2016 for Windows and related hardware.

Microsoft CEO, Satya Nadella Photo: David Paul Morris/Bloomberg via Getty Images

Azure Stack is particularly well suited to a military scenario. It’s a private cloud offering that lets you stand up a mini version of the Azure public cloud on your own hardware, fully compatible with Azure’s public cloud in terms of APIs and tools. The company also has Azure Government Cloud, which is certified for use by many of the U.S. government’s branches, including DOD Level 5. Microsoft brings years of experience working with large enterprises and government clients, meaning it knows how to manage a large contract like this.

Google

When we talk about the cloud, we tend to think of the Big Three. The third member of that group is Google. They have been working hard to establish their enterprise cloud business since 2015 when they brought in Diane Greene to reorganize the cloud unit and give them some enterprise cred. They still have a relatively small share of the market, but they are taking the long view, knowing that there is plenty of market left to conquer.

Head of Google Cloud, Diane Greene Photo: TechCrunch

They have taken an approach of open sourcing a lot of the tools they used in-house, then offering cloud versions of those same services, arguing that nobody knows better how to manage large-scale operations than they do. They have a point, and that could play well in a bid for this contract, but they also stepped away from an artificial intelligence contract with the DOD called Project Maven when a group of their employees objected. It’s not clear whether that would be held against them in the bidding process here.

IBM

IBM has been using its checkbook to build a broad platform of cloud services since 2013, when it bought SoftLayer to gain infrastructure services, while adding software and development tools over the years and emphasizing AI, big data, security, blockchain and other services. All the while, it has been trying to take full advantage of its artificial intelligence engine, Watson.

IBM Chairman, President and CEO Ginni Rometty. Photo: Ethan Miller/Getty Images

As one of the primary technology brands of the 20th century, the company has vast experience working with contracts of this scope and with large enterprise clients and governments. It’s not clear if this translates to its more recently developed cloud services, or if it has the cloud maturity of the others, especially Microsoft and Amazon. In that light, it would have its work cut out for it to win a contract like this.

Oracle

Oracle has been complaining since last spring to anyone who will listen, including reportedly the president, that the JEDI RFP is unfairly written to favor Amazon, a charge that DOD firmly denies. They have even filed a formal protest against the process itself.

That could be a smoke screen because the company was late to the cloud, took years to take it seriously as a concept, and barely registers today in terms of market share. What it does bring to the table is broad enterprise experience over decades and one of the most popular enterprise databases in the last 40 years.

Larry Ellison, chairman of Oracle. Photo: David Paul Morris/Bloomberg via Getty Images

It recently began offering a self-repairing database in the cloud that could prove attractive to the DOD, but whether its other offerings are enough to help it win this contract remains to be seen.

Sep
28
2018
--

High Availability for Enterprise-Grade PostgreSQL environments

High availability (HA) and database replication is a major topic of discussion for database technologists. There are a number of informed choices to be made to optimize PostgreSQL replication so that you achieve HA. In this post we introduce an overview of the topic, and cover some options available to achieve high availability in PostgreSQL. We’ll then focus on just one way to implement HA for PostgreSQL, using Patroni.

In our previous blog posts, we have discussed the features available to build a secured PostgreSQL environment and the tools available to help you set up a reliable backup strategy. This series of articles is designed to give a flavor of how you might go about building an enterprise-grade PostgreSQL environment using open source tools. If you’d like to see this implemented, then please do check out our webinar presentation on October 10 – we think you might find it both useful and intriguing!

Replication in PostgreSQL

The first step towards achieving high availability is making sure you don’t rely on a single database server: your data should be replicated to at least one standby replica/slave. Database replication can be done using the two options available with PostgreSQL community software:

  1. Streaming replication
  2. Logical replication & logical decoding

When we set up streaming replication, a standby replica connects to the master (primary) and streams WAL records from it. Streaming replication is considered one of the safest and fastest methods of replication in PostgreSQL. A standby server becomes an exact replica of the primary, with potentially minimal lag between primary and standby even on very busy transactional servers. PostgreSQL allows you to configure streaming replication as either synchronous or asynchronous. Synchronous replication ensures that a client is given a success message only when the change has been committed on the master and successfully replicated on the standby server as well. As standby servers can accept read requests from clients, we can make more efficient use of our PostgreSQL setup by sparing the master from serving read requests and redirecting these to the replicas instead. You can read more about streaming replication in this blog post.
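
As a rough illustration of how a streaming replica is typically seeded, here is a hedged sketch using pg_basebackup. The host names, the replication role name and the data directory path are assumptions for illustration only, and the primary is assumed to already allow replication connections in pg_hba.conf:

# On the primary: create a dedicated replication role (role name "replicator" is an assumption).
psql -U postgres -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';"

# On the standby: take a base backup from the primary, streaming WAL while copying.
# -R writes the recovery settings the standby needs so it starts streaming
# from the primary automatically after startup.
pg_basebackup -h primary.example.com -U replicator \
  -D /var/lib/pgsql/data -X stream -P -R

# Start the standby; it stays in recovery and can serve read-only queries (hot standby).
pg_ctl -D /var/lib/pgsql/data start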

Logical replication in PostgreSQL allows users to perform selective replication of a subset of the tables found on the master. While streaming replication is implemented in PostgreSQL at the block level—where every database on the master gets replicated to the replica, which remains read-only—logical replication suits scenarios where you need to replicate only a selection of tables in a database and (optionally) allow direct writes to your slave. A slave configured with logical replication can also be configured to replicate from multiple masters. One situation where this is helpful is when you need to replicate data from several PostgreSQL databases to a single PostgreSQL server for reporting and data warehousing tasks.
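
For comparison, here is a minimal, hedged sketch of selective table replication with the built-in logical replication available from PostgreSQL 10 onwards. The database, table and connection details are made up for illustration, the publisher needs wal_level set to logical, and the target tables must already exist on the subscriber with a matching schema:

# On the publishing (master) server: publish only the tables you want replicated.
psql -U postgres -d salesdb -c "CREATE PUBLICATION sales_pub FOR TABLE orders, customers;"

# On the subscribing server: create a subscription pointing back at the publisher.
psql -U postgres -d reportingdb -c "CREATE SUBSCRIPTION sales_sub
    CONNECTION 'host=master.example.com dbname=salesdb user=replicator password=secret'
    PUBLICATION sales_pub;"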

While it’s technically possible to employ standby servers configured with logical replication in an HA environment, this doesn’t fare well as a best practice. For such usage, a standby server should be able to take the place of another server “transparently” – the more it resembles the master, the better. Logical replication opens the door for different data to be replicated to different servers, and this may break things.

Here is a list of built-in features available in PostgreSQL that are designed to help achieve high availability:

  • Streaming replication
  • Cascaded replication
  • Asynchronous standby
  • Synchronous standby
  • Warm standby
  • Hot standby
  • pg_rewind and pg_basebackup (see the example below)
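
As an example of the last item, here is a hedged sketch of how pg_rewind is commonly used to resynchronize a failed former primary with a newly promoted one. The host name and data directory are assumptions, and pg_rewind requires either wal_log_hints=on or data checksums to have been enabled beforehand:

# Stop the failed former primary first; pg_rewind works on a stopped data directory.
pg_ctl -D /var/lib/pgsql/data stop -m fast

# Rewind the old primary's data directory back to the point where the timelines
# diverged, using the new primary as the source.
pg_rewind --target-pgdata=/var/lib/pgsql/data \
          --source-server="host=new-primary.example.com port=5432 user=postgres dbname=postgres"

# The node can then be configured as a standby of the new primary and started.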

Automatic failover and an always-on strategy

There are many more open source solutions that can help us achieve high availability with PostgreSQL, especially during critical moments, when a master (primary server) becomes unavailable. The following is a list of a few such open source solutions built for PostgreSQL: 

  1. Patroni
  2. Stolon
  3. repmgr
  4. PostgreSQL Automatic Failover (PAF)
  5. pglookout
  6. pgPool-II

However, the HA solutions available for PostgreSQL are not just limited to the list above. We would be interested to hear what you have implemented as an HA solution in the comments section below.

In our upcoming webinar we are going to show you a PostgreSQL replication cluster built using Patroni and how it provides a seamless failover that is transparent to the application. 

Patroni

Patroni is a PostgreSQL cluster management template/framework that stores cluster state in, and talks to, a distributed consensus key-value store in order to decide on the state of the cluster. It started as a fork of the Governor project. A Patroni PostgreSQL cluster is composed of many individual PostgreSQL instances running on bare metal, containers or virtual machines. In our setup we’ve used etcd for consensus management, which handles leader election and decides the leader among a cluster of servers, even when they are partitioned by the network. This distributed consensus management can also be achieved with other technologies, such as ZooKeeper and Consul. In the event of a failover, Patroni promotes the slave that has been assigned as leader by the etcd-like consensus key-value store. Note that when you set up asynchronous replication, you have the option to specify maximum_lag_on_failover to prevent Patroni from promoting a slave that is lagging by more than this value.
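
To make this more concrete, here is a minimal, hedged sketch of a per-node Patroni configuration using etcd as the consensus store. Every host name, address, credential and path below is an assumption for illustration only, and a real deployment needs additional settings (PostgreSQL parameters, pg_hba rules, REST API authentication and so on):

# Hypothetical minimal Patroni configuration for one node of a cluster.
cat > /etc/patroni/patroni.yml <<'EOF'
scope: pg_ha_cluster          # cluster name shared by all nodes
name: node1                   # unique per node

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

etcd:
  host: 10.0.0.10:2379        # etcd endpoint used for leader election

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576   # bytes; a replica lagging more than this is not promoted

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  data_dir: /var/lib/pgsql/data
  authentication:
    replication:
      username: replicator
      password: secret
    superuser:
      username: postgres
      password: secret
EOF

# Run Patroni with this configuration; it will bootstrap or join the cluster.
patroni /etc/patroni/patroni.yml

Once each node runs Patroni with its own copy of such a file, patronictl can be used to inspect the cluster and perform manual switchovers, for example patronictl -c /etc/patroni/patroni.yml list.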

Here’s an architecture diagram of Patroni:

Key Benefits of Patroni:

  1. Continuous monitoring and automatic failover
  2. Manual/scheduled switchover with a single command
  3. Built-in automation for bringing a failed node back into the cluster
  4. REST APIs for entire cluster configuration and further tooling
  5. Provides infrastructure for transparent application failover
  6. Distributed consensus for every action and configuration
  7. Integration with the Linux watchdog for avoiding split-brain syndrome

If you found this post useful…

You are sure to enjoy our webinar of October 10, where we demonstrate live how to build an enterprise-grade PostgreSQL environment with open source tools. If you make it to the live presentation, you will also have the chance to ask questions of the team.

In the next blog post of this series we’ll be covering the scalability of our solution and how to accommodate an increase in traffic while maintaining the quality of the service. We’re moving ever closer to an enterprise-grade environment with open source tools!

The post High Availability for Enterprise-Grade PostgreSQL environments appeared first on Percona Database Performance Blog.

Sep
28
2018
--

This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

I consider this to be the biggest news for the week: Alpha Build of Percona Server for MySQL 8.0. Experiment with it in a Docker container. It is missing column compression with dictionary support, native partitioning for TokuDB and MyRocks (excited to see that this is coming!), and encryption key rotation and scrubbing. All in, this should be a fun release to try, test, and also to file bugs for!

Database paradigms are changing, and it is interesting to see Cloudflare introducing Workers KV, a key-value store that is eventually consistent and highly distributed (across their global network of 152+ data centers). You can have up to 1 billion keys per namespace, keys up to 2kB in size, values up to 64kB, and eventual global consistency within 10 seconds. Read more about the cost and other technical details too.

For some quick glossing, from a MySQL Federal Account Manager, comes Why MySQL is Harder to Sell Than Oracle (from someone who has done both). Valid concerns, and always interesting to hear the barriers MySQL faces even after 23 years in existence! For analytics, maybe this is where the likes of MariaDB ColumnStore or ClickHouse might come into play.

Lastly, for all of you asking me about when Percona Live Europe Frankfurt 2018 speaker acceptances and agendas are to be released, I am told by a good source that it will be announced early next week. So register already!

Releases

Link List

Upcoming Appearances

Feedback

I look forward to feedback/tips via Twitter @bytebot.

The post This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha appeared first on Percona Database Performance Blog.

Sep
28
2018
--

Scaling Percona Monitoring and Management (PMM)

Starting with PMM 1.13, PMM uses Prometheus 2 for metrics storage, which tends to be the heaviest consumer of CPU and RAM. With the Prometheus 2 performance improvements, PMM can scale to more than 1000 monitored nodes per instance in the default configuration. In this blog post we will look into PMM scaling and capacity planning—how to estimate the resources required, and what drives resource consumption.

PMM tested with 1000 nodes

We have now tested PMM with up to 1000 nodes, using a virtualized system with 128GB of memory, 24 virtual cores, and SSD storage. We found PMM scales pretty linearly with the available memory and CPU cores, and we believe that a higher number of nodes could be supported with more powerful hardware.

What drives resource usage in PMM?

Depending on your system configuration and workload, a single node can generate very different loads on the PMM server. The main factors that impact the performance of PMM are:

  1. Number of samples (data points) injected into PMM per second
  2. Number of distinct time series they belong to (cardinality)
  3. Number of distinct query patterns your application uses
  4. Number of queries you run against PMM, through the user interface or the API, and their complexity

These specifically can be impacted by:

  • Software version – modern database software versions expose more metrics
  • Software configuration – some metrics are only exposed in certain configurations
  • Workload – a large number of database objects and high concurrency will increase both the number of samples ingested and their cardinality
  • Exporter configuration – disabling collectors can reduce the amount of data collected
  • Scrape frequency – controlled by METRICS_RESOLUTION

All these factors together may impact resource requirements by a factor of ten or more, so do your own testing to be sure. However, the numbers in this article should serve as good general guidance and a starting point for your research.
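
As an example of the last factor in the list above, the scrape interval on a PMM 1.x server can be set through the METRICS_RESOLUTION environment variable when the PMM Server container is created. The container names, volume and value below are an illustration only; a larger interval reduces the sample ingest rate at the cost of coarser graphs:

# Hypothetical example: run PMM Server with a 5-second metrics resolution
# instead of the 1-second default, roughly cutting the ingest rate.
docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  -e METRICS_RESOLUTION=5s \
  percona/pmm-server:1.13.0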

On the system supporting 1000 instances we observed the following performance:

PMM performance under a 1000-node load:

As you can see, we have more than 2,000 scrapes/sec performed, providing almost two million samples/sec, and more than eight million active time series. These are the main numbers that define the load placed on Prometheus.

Capacity planning to scale PMM

Both CPU and memory are very important resources for PMM capacity planning. Memory is the more important of the two, as Prometheus 2 does not have good options for limiting memory consumption. If you do not have enough memory to handle your workload, then it will run out of memory and crash.

We recommend at least 2GB of memory for a production PMM Installation. A test installation with 1GB of memory is possible. However, it may not be able to monitor more than one or two nodes without running out of memory. With 2GB of memory you should be able to monitor at least five nodes without problem.

With powerful systems (8GB or more) you can have approximately eight systems per 1GB of memory, or about 15,000 samples ingested/sec per 1GB of memory.

To calculate the CPU usage resources required, allow for about 50 monitored systems per core (or 100K metrics/sec per CPU core).
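
Putting those rules of thumb together, a rough, hedged sizing calculation for a hypothetical fleet might look like the sketch below. The ratios come directly from the guidance above, the 400-node fleet size is invented for illustration, and your own testing should be the final word:

#!/bin/bash
# Back-of-the-envelope PMM sizing for a hypothetical fleet of 400 monitored systems,
# using ~8 systems per 1GB of RAM and ~50 systems per CPU core as starting ratios.
NODES=400
RAM_GB=$(( (NODES + 7) / 8 ))      # ceil(400 / 8)  = 50 GB of RAM
CORES=$(( (NODES + 49) / 50 ))     # ceil(400 / 50) = 8 CPU cores
echo "Estimated PMM server size for ${NODES} nodes: ${RAM_GB}GB RAM, ${CORES} CPU cores"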

One problem you’re likely to encounter if you’re running PMM with 100+ instances is the “Home Dashboard”. This becomes way too heavy with such a large number of servers. We plan to fix this issue in future releases of PMM, but for now you can work around it in two simple ways:

You can select a single host, for example “pmm-server”, in your home dashboard and save it, before adding a large number of hosts to the system.

set home dashboard for PMM

Or you can make some other dashboard of your choice and set it as the home dashboard.

Summary

  • More than 1,000 monitored systems is possible per single PMM server
  • Your specific workload and configuration may significantly change the resources required
  • If deploying with 8GB or more, plan 50 systems per core, and eight systems per 1GB of RAM

The post Scaling Percona Monitoring and Management (PMM) appeared first on Percona Database Performance Blog.

Sep
27
2018
--

Announcement: Alpha Build of Percona Server 8.0

Alpha Build of Percona Server 8.0 released

An alpha version of Percona Server 8.0 is now available in the Percona experimental software repositories. This is a 64-bit release only. 

You may experiment with this alpha release by running it in a Docker container:

$ docker run -d -e MYSQL_ROOT_PASSWORD=password -p 3306:3306 perconalab/percona-server:8.0.12.alpha

When the container starts, connect to it as follows:

$ docker exec -ti $(docker ps | grep -F percona-server:8.0.12.alpha | awk '{print $1}') mysql -uroot -ppassword

Note that this release is not ready for use in any production environment.

Percona Server 8.0 alpha is available for the following platforms:

  • RHEL/Centos 6.x
  • RHEL/Centos 7.x
  • Ubuntu 16.04 Xenial
  • Ubuntu 18.04 Bionic
  • Debian 8 Jessie
  • Debian 9 Stretch

Note: The list of supported platforms may be different in the GA release.

Fixed Bugs:

  • PS-4814: TokuDB ‘fast’ replace into is incompatible with 8.0 row replication
  • PS-4834: The encrypted system tablespace has empty uuid

Other fixed bugs: PS-4788, PS-4631, PS-4736, PS-4818, PS-4755

Unfinished Features

The following features are work in progress and are not yet in a working state:

  • Column compression with Data Dictionaries
  • Native Partitioning for TokuDB and for MyRocks
  • Encryption
    • Key Rotation
    • Scrubbing

Known Issues

  • PS-4788: Setting log_slow_verbosity and enabling the slow_query_log could lead to a server crash
  • PS-4803: ALTER TABLE … ADD INDEX … LOCK crash | handle_fatal_signal (sig=11) in dd_table_has_instant_cols
  • PS-4896: handle_fatal_signal (sig=11) in THD::thread_id likely due to enabling innodb_print_lock_wait_timeout_info
  • PS-4820: PS crashes with keyring_vault encryption
  • PS-4796: 8.0 DD and atomic DDL breaks DROP DATABASE for engines that store files in database directory
  • PS-4898: Crash during PAM authentication plugin installation.
  • PS-1782: Optimizer chooses wrong plan when joining 2 tables
  • PS-4850: Toku hot backup plugin dumps tons of info to stdout with no way to disable it
  • PS-4797: rpl.rpl_master_errors failing, likely due to binlog encryption
  • PS-4800: Recovery of prepared XA transactions seems broken in 8.0
  • PS-4853: Installing audit_log plugin causes server to crash
  • PS-4855: Replace http with https in http://bugs.percona.com in server crash messages
  • PS-4857: Improve error message handling for compressed columns
  • PS-4895: Improve error message when encrypted system tablespace was started without keyring plugin
  • PS-3944: Single variable to control logging in QRT
  • PS-4705: crash on snapshot size check in RocksDB
  • PS-4885: Using ALTER … ROW_FORMAT=TOKUDB_QUICKLZ leads to InnoDB: Assertion failure: ha_innodb.cc:12198:m_form->s->row_type == m_create_info->row_type

The post Announcement: Alpha Build of Percona Server 8.0 appeared first on Percona Database Performance Blog.

Sep
27
2018
--

Alphabet’s Chronicle launches an enterprise version of VirusTotal

VirusTotal, the virus and malware scanning service owned by Alphabet’s Chronicle, launched an enterprise-grade version of its service today. VirusTotal Enterprise offers significantly faster and more customizable malware search, as well as a new feature called Private Graph, which allows enterprises to create their own private visualizations of their infrastructure and the malware that affects their machines.

The Private Graph makes it easier for enterprises to create an inventory of their internal infrastructure and users to help security teams investigate incidents (and where they started). In the process of building this graph, VirusTotal also looks at commonalities between different nodes to be able to detect changes that could signal potential issues.

The company stresses that these graphs are obviously kept private. That’s worth noting because VirusTotal already offered a similar tool for its premium users — the VirusTotal Graph. All of the information there, however, was public.

As for the faster and more advanced search tools, VirusTotal notes that its service benefits from Alphabet’s massive infrastructure and search expertise. This allows VirusTotal Enterprise to offer a 100x speed increase, as well as better search accuracy. Using the advanced search, the company notes, a security team could now extract the icon from a fake application, for example, and then return all malware samples that share the same icon.

VirusTotal says that it plans to “continue to leverage the power of Google infrastructure” and expand this enterprise service over time.

Google acquired VirusTotal back in 2012. For the longest time, the service didn’t see too many changes, but earlier this year, Google’s parent company Alphabet moved VirusTotal under the Chronicle brand and the development pace seems to have picked up since.

Sep
27
2018
--

Dropbox overhauls internal search to improve speed and accuracy

Over the last several months, Dropbox has been undertaking an overhaul of its internal search engine for the first time since 2015. Today, the company announced that the new version, dubbed Nautilus, is ready for the world. The latest search tool takes advantage of a new architecture powered by machine learning to help pinpoint the exact piece of content a user is looking for.

While an individual user may have a much smaller body of documents to search across than the World Wide Web, the paradox of enterprise search says that the fewer documents you have, the harder it is to locate the correct one. Yet Dropbox faces a host of additional challenges when it comes to search. It has more than 500 million users and hundreds of billions of documents, making finding the correct piece of content for a particular user even more difficult. The company had to take all of this into consideration when it was rebuilding its internal search engine.

One way for the search team to attack a problem of this scale was to bring machine learning to bear on it, but it required more than an underlying level of intelligence to make this work. It also required completely rethinking the entire search tool at an architectural level.

That meant separating two main pieces of the system, indexing and serving. The indexing piece is crucial of course in any search engine. A system of this size and scope needs a fast indexing engine to cover the number of documents in a whirl of changing content. This is the piece that’s hidden behind the scenes. The serving side of the equation is what end users see when they query the search engine, and the system generates a set of results.

Nautilus Architecture Diagram: Dropbox

Dropbox described the indexing system in a blog post announcing the new search engine: “The role of the indexing pipeline is to process file and user activity, extract content and metadata out of it, and create a search index.” They added that the easiest way to index a corpus of documents would be to just keep checking and iterating, but that couldn’t keep up with a system this large and complex, especially one that is focused on a unique set of content for each user (or group of users in the business tool).

They account for that in a couple of ways. They create offline builds every few days, but they also watch as users interact with their content and try to learn from that. As that happens, Dropbox creates what it calls “index mutations,” which they merge with the running indexes from the offline builds to help provide ever more accurate results.

The indexing process has to take into account the textual content, assuming it’s a document, but it also has to look at the underlying metadata as a clue to the content. They use this information to feed a retrieval engine, whose job is to find as many documents as it can, as fast as it can, and worry about accuracy later.

It has to make sure it checks all of the repositories. For instance, Dropbox Paper is a separate repository, so the answer could be found there. It also has to take into account the access-level security, only displaying content that the person querying has the right to access.

Once it has a set of possible results, it uses machine learning to pinpoint the correct content. “The ranking engine is powered by a [machine learning] model that outputs a score for each document based on a variety of signals. Some signals measure the relevance of the document to the query (e.g., BM25), while others measure the relevance of the document to the user at the current moment in time,” they explained in the blog post.

After the system has a list of potential candidates, it ranks them and displays the results for the end user in the search interface, but a lot of work goes into that from the moment the user types the query until it displays a set of potential files. This new system is designed to make that process as fast and accurate as possible.

Sep
27
2018
--

Percona Server for MongoDB 3.4.17-2.15 Is Now Available

Percona announces the release of Percona Server for MongoDB 3.4.17-2.15 on September 27, 2018. Download the latest version from the Percona website or the Percona Software Repositories.

Percona Server for MongoDB 3.4 is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.4 Community Edition. It supports MongoDB 3.4 protocols and drivers.

Percona Server for MongoDB extends MongoDB Community Edition functionality by including the Percona Memory Engine and MongoRocks storage engines, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.17. There are no additional improvements or new features on top of those upstream fixes.

The post Percona Server for MongoDB 3.4.17-2.15 Is Now Available appeared first on Percona Database Performance Blog.

Sep
27
2018
--

Automating MongoDB Log Rotation

In this blog post, we will look at how to do MongoDB® log rotation in the right—and simplest—way.

Log writing is important for any application to track history. But when the log file size grows larger, it can cause disk space issues. For database servers especially, it may cause performance issues as the database needs to write to a large file continuously. By scheduling regular log rotation, we can avoid such situations proactively and keep the log file size below a predetermined threshold.

MongoDB Log File

In MongoDB, the log is not rotated automatically, so we need to rotate it manually. Usually, the size of the MongoDB server log depends on the configured verbosity level and the slow operation logging settings. By default, commands taking more than 100ms, or whatever value is set for the slowOpThresholdMs parameter, are written into the MongoDB log file. Let’s see how to automate log rotation on Linux-based servers.
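
As an aside, the slow operation threshold can be raised if slow-query entries dominate your log; the 200ms value below is only an example for illustration, not a recommendation:

# Hypothetical example: raise the slow operation threshold to 200ms at runtime.
# Profiling level 0 keeps the profiler off; operations slower than 200ms are still logged.
mongo admin --eval 'db.setProfilingLevel(0, 200)'

# The equivalent persistent setting in mongod.conf (YAML format):
#   operationProfiling:
#     slowOpThresholdMs: 200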

Rotate MongoDB Log Methods

The following two methods could be used to rotate the log in MongoDB. The first uses the command shown below from within mongo shell:

> use admin
> db.adminCommand( { logRotate : 1 } )

The alternative is to use SIGUSR1 signal to rotate the logs for a single process in Linux/Unix-based systems:

# kill -SIGUSR1 $(pidof mongod)

The behaviour of log rotation in MongoDB differs according to the value of the logRotate parameter, which was introduced in version 3.0 (note that this should not be confused with the logRotate command that we’ve seen above). The two values are:

  • rename – renames the log file and creates a new file specified by the logpath parameter to write further logs
  • reopen – closes and reopens the log file following the typical Linux/Unix log rotation behavior. You also need to enable logAppend if you choose reopen. You should use reopen when using the Linux/Unix log rotate utility to avoid log loss.

In versions 2.6 and earlier, the default behavior when issuing the logRotate command is the same as when using rename, i.e. it renames the original log file from mongod.log to mongod.log.xxxx-xx-xxTxx-xx-xx format where x is filled with the rotation date time, and creates a new log file mongod.log to continue to write logs. You can see an example of this below:

mongodb-osx-x86_64-3.2.19/bin/mongod --fork \
  --dbpath=/usr/local/opt/mongo/data6 \
  --logpath=/usr/local/opt/mongo/data6/mongod.log \
  --logappend --logRotate rename
about to fork child process, waiting until server is ready for connections.
forked process: 59324
child process started successfully, parent exiting
ls -ltrh data6/mongod.log*
-rw-r--r-- 1 vinodhkrish admin 4.9K Sep 14 16:57 mongod.log
mongo
MongoDB shell version: 3.2.19
connecting to: test
>
> db.adminCommand( { logRotate : 1 } )
{ "ok" : 1 }
> exit
bye
ls -ltrh data6/mongod.log*
-rw-r--r-- 1 vinodhkrish admin 4.9K Sep 14 16:57 mongod.log.2018-09-14T11-29-54
-rw-r--r-- 1 vinodhkrish admin 1.0K Sep 14 16:59 mongod.log

Using the above method, we need to compress and move the rotated log file manually. You can of course write a script to automate this. But when using the logRotate=reopen option, the mongod.log is just closed and opened again. In this case, you need to use the command alongside Linux’s logrotate utility to avoid losing log writes in the course of the log rotation operation. We will see more about this in the next section.
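
A hedged sketch of such a housekeeping script for the rename method is shown below. The paths, retention period and archive location are assumptions you would adapt, and it only targets the date-stamped files produced by the rename behaviour:

#!/bin/bash
# Hypothetical cleanup for logs rotated with logRotate=rename:
# compress the date-stamped files, move them to an archive directory,
# then delete archives older than 30 days.
LOG_DIR=/var/log/mongodb
ARCHIVE_DIR=/var/log/mongodb/archive
mkdir -p "$ARCHIVE_DIR"

# Rotated files look like mongod.log.2018-09-14T11-29-54
find "$LOG_DIR" -maxdepth 1 -type f -name 'mongod.log.????-??-??T??-??-??' | while read -r f; do
  gzip "$f" && mv "$f.gz" "$ARCHIVE_DIR/"
done

# Drop archives older than 30 days
find "$ARCHIVE_DIR" -name 'mongod.log.*.gz' -mtime +30 -delete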

Automating MongoDB logRotate using the logrotate utility

I wasn’t a fan of this second method for a long time! But MongoDB log rotation seems to work well when using Linux/Unix’s logrotate tool. Now I prefer this approach, since it doesn’t need the complex script writing that’s required for the first log rotation method described above. Let’s see in detail how to configure log rotation with Linux/Unix’s logrotate utility.

MongoDB 3.x versions

Start MongoDB with the following options:

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
  logRotate: reopen
processManagement:
  pidFilePath: /var/run/mongodb/mongod.pid

As mentioned in the section about Linux’s logrotate utility, you need to create a separate config file /etc/logrotate.d/mongod.conf for MongoDB’s log file rotation. Add the content shown below into that config file:

/var/log/mongodb/mongod.log {
  daily
  size 100M
  rotate 10
  missingok
  compress
  delaycompress
  notifempty
  create 640 mongod mongod
  sharedscripts
  postrotate
    /bin/kill -SIGUSR1 `cat /var/run/mongodb/mongod.pid 2>/dev/null` >/dev/null 2>&1
  endscript
}

In this config file, we assume that the log path is set as /var/log/mongodb/mongod.log in the /etc/mongod.conf file, and we instruct Linux’s logrotate utility to do the following:

  • Check the size, and start rotation if the log file is greater than 100M
  • Move the mongod.log file to mongod.log.1
  • Create a new mongod.log file with mongod permissions
  • Compress the files from mongod.log.2 onwards, retaining up to mongod.log.10, as per delaycompress and rotate 10
  • MongoDB continues to write to the old file mongod.log.1 (based on Linux’s inode) – remember that at this point there is no mongod.log file
  • In postrotate, the kill -SIGUSR1 signal is sent to the mongod process identified by the pid file, and so mongod creates a new mongod.log and starts writing to it. So make sure that the pid file path here is set to the same value as pidFilePath in the /etc/mongod.conf file

Please test logrotate manually using the /etc/logrotate.d/mongod.conf file you created, to make sure it is working as expected. Here’s how:

cd /var/log/mongodb/
ls -ltrh
total 4.0K
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:45 mongod.log
logrotate -f /etc/logrotate.d/mongod
ls -ltrh
total 8.0K
-rw-r-----. 1 mongod mongod 1.4K Sep 14 12:58 mongod.log.1
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:58 mongod.log
logrotate -f /etc/logrotate.d/mongod
ls -ltrh
total 12K
-rw-r-----. 1 mongod mongod 491 Sep 14 12:58 mongod.log.2.gz
-rw-r-----. 1 mongod mongod 1.4K Sep 14 12:58 mongod.log.1
-rw-r-----. 1 mongod mongod 1.3K Sep 14 12:58 mongod.log

Adaptations for MongoDB 2.x and earlier, or when using logRotate=rename

Since the introduction of the logRotate parameter in MongoDB 3.0, the log rotation script needs an extra step when you are using logRotate=rename, or when using versions 2.x and earlier.

Start MongoDB with the following options (for versions 2.4 and earlier):

logAppend=true
logpath=/var/log/mongodb/mongod.log
pidfilePath=/var/run/mongodb/mongod.pid

Or start MongoDB with the following options in YAML format (the YAML config format was introduced in version 2.6):

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
processManagement:
  pidFilePath: /var/run/mongodb/mongod.pid

The config file /etc/logrotate.d/mongod.conf for MongoDB’s log file rotation should be set up like this:

/var/log/mongodb/mongod.log {
  daily
  size 100M
  rotate 10
  missingok
  compress
  delaycompress
  notifempty
  create 640 mongod mongod
  sharedscripts
  postrotate
    /bin/kill -SIGUSR1 `cat /var/run/mongodb/mongod.pid 2>/dev/null` >/dev/null 2>&1
    find /var/log/mongodb -type f -size 0 -regextype posix-awk \
-regex "^\/var\/log\/mongodb\/mongod\.log\.[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}-[0-9]{2}-[0-9]{2}$" \
-execdir rm {} \; >/dev/null 2>&1
  endscript
}

In this case, the logrotate utility behaves as follows:

  • Check for the size, and start rotation if the log file size exceeds 100M
  • Move mongod.log file to mongod.log.1
  • Create a new mongod.log file with mongod permissions
  • MongoDB continues to write to the old file, mongod.log.1
  • In postrotate, when the SIGUSR1 signal is sent, mongod rotates the log file. This includes renaming the new mongod.log file (0 bytes) created by logrotate to the mongod.log.xxxx-xx-xxTxx-xx-xx format and creating a new mongod.log file, to which mongod now starts writing the logs.
  • The Linux find command identifies files in the mongod.log.xxxx-xx-xxTxx-xx-xx format that are 0 bytes in size, and these are removed

If you enjoyed this blog…

You might also benefit from this recorded webinar led by my colleague Tim Vaillancourt, MongoDB Backup and Recovery Field Guide, or perhaps the Percona Solution Brief, Security for MongoDB.

References:
https://jira.mongodb.org/browse/SERVER-11087
https://jira.mongodb.org/browse/SERVER-14053
https://jira.mongodb.org/browse/SERVER-16821
https://docs.mongodb.com/v2.6/reference/configuration-options/
https://docs.mongodb.com/v2.4/reference/configuration-options/
https://docs.mongodb.com/manual/reference/configuration-options/

The post Automating MongoDB Log Rotation appeared first on Percona Database Performance Blog.
