May
10
2023
--

Understanding Linux IOWait

IOWait

I have seen many Linux performance engineers looking at the “IOWait” portion of CPU usage as something that indicates whether the system is I/O-bound. In this blog post, I will explain why this approach is unreliable and what better indicators you can use.

Let’s start by running a little experiment – generating heavy I/O usage on the system:

sysbench  --threads=8 --time=0 --max-requests=0  fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run

 

CPU Usage in Percona Monitoring and Management (PMM):

CPU Usage in Percona Monitoring and Management

root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  6      0 7137152  26452 762972    0    0 40500  1714 2519 4693  1  6 55 35  3
 2  8      0 7138100  26476 762964    0    0 344971    17 20059 37865  3 13  7 73  5
 0  8      0 7139160  26500 763016    0    0 347448    37 20599 37935  4 17  5 72  3
 2  7      0 7139736  26524 762968    0    0 334730    14 19190 36256  3 15  4 71  6
 4  4      0 7139484  26536 762900    0    0 253995     6 15230 27934  2 11  6 77  4
 0  7      0 7139484  26536 762900    0    0 350854     6 20777 38345  2 13  3 77  5

So far, so good. We can see that the I/O-intensive workload clearly corresponds to high IOWait (the “wa” column in vmstat).

Let’s continue running our I/O-bound workload and add a heavy CPU-bound load:

sysbench --threads=8 --time=0 cpu run

 

heavy CPU usage

root@iotest:~# vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
12  4      0 7121640  26832 763476    0    0 48034  1460 2895 5443  6  7 47 37  3
13  3      0 7120416  26856 763464    0    0 256464    14 12404 25937 69 15  0  0 16
 8  8      0 7121020  26880 763496    0    0 325789    16 15788 33383 85 15  0  0  0
10  6      0 7121464  26904 763460    0    0 322954    33 16025 33461 83 15  0  0  1
 9  7      0 7123592  26928 763524    0    0 336794    14 16772 34907 85 15  0  0  1
13  3      0 7124132  26940 763556    0    0 386384    10 17704 38679 84 16  0  0  0
 9  7      0 7128252  26964 763604    0    0 356198    13 16303 35275 84 15  0  0  0
 9  7      0 7128052  26988 763584    0    0 324723    14 13905 30898 80 15  0  0  5
10  6      0 7122020  27012 763584    0    0 380429    16 16770 37079 81 18  0  0  1

 

What happened?  IOWait is completely gone and now this system does not look I/O-bound at all!  

In reality, though, of course, nothing changed for our first workload — it continues to be I/O-bound; it just became invisible when we look at “IOWait”!

To understand what is happening, we really need to understand what “IOWait” is and how it is computed.

There is a good article that goes into more detail on the subject, but basically, “IOWait” is a kind of idle CPU time. If a CPU core goes idle because there is no work to do, the time is accounted as “idle.”  If, however, it went idle because a process is waiting on disk I/O, the time is counted towards “IOWait.”

However, if a process is waiting on disk I/O but other processes on the system can use the CPU, the time will be counted towards their CPU usage as user/system time instead. 
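For reference, here is a minimal shell sketch (not from the original post) of where these numbers come from: the kernel exposes per-state CPU time counters on the first line of /proc/stat, and tools like vmstat derive the “wa” percentage from deltas of the iowait counter:

# First line of /proc/stat: cpu user nice system idle iowait irq softirq steal ...
read -r _ user1 nice1 sys1 idle1 iowait1 rest1 < /proc/stat
sleep 10
read -r _ user2 nice2 sys2 idle2 iowait2 rest2 < /proc/stat
# Approximate IOWait percentage over the interval (ignores irq/softirq/steal for brevity)
total1=$((user1+nice1+sys1+idle1+iowait1)); total2=$((user2+nice2+sys2+idle2+iowait2))
echo "iowait%: $(( 100 * (iowait2 - iowait1) / (total2 - total1) ))"

This mirrors what the “wa” column reports: time a CPU would otherwise be idle, but at least one task on it is waiting on disk I/O.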

Because of this accounting, other interesting behaviors are possible.  Now, instead of running eight I/O-bound threads, let’s run just one I/O-bound process on a four-core VM:

sysbench  --threads=1 --time=0 --max-requests=0  fileio --file-num=1 --file-total-size=10G --file-io-mode=sync --file-extra-flags=direct --file-test-mode=rndrd run

 

four core VM CPU usage

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1      0 7130308  27704 763592    0    0 62000    12 4503 8577  3  5 69 20  3
 2  1      0 7127144  27728 763592    0    0 67098    14 4810 9253  2  5 70 20  2
 2  1      0 7128448  27752 763592    0    0 72760    15 5179 9946  2  5 72 20  1
 4  0      0 7133068  27776 763588    0    0 69566    29 4953 9562  2  5 72 21  1
 2  1      0 7131328  27800 763576    0    0 67501    15 4793 9276  2  5 72 20  1
 2  0      0 7128136  27824 763592    0    0 59461    15 4316 8272  2  5 71 20  3
 3  1      0 7129712  27848 763592    0    0 64139    13 4628 8854  2  5 70 20  3
 2  0      0 7128984  27872 763592    0    0 71027    18 5068 9718  2  6 71 20  1
 1  0      0 7128232  27884 763592    0    0 69779    12 4967 9549  2  5 71 20  1
 5  0      0 7128504  27908 763592    0    0 66419    18 4767 9139  2  5 71 20  1

 

Even though this process is completely I/O-bound, we can see IOWait (wa) is not particularly high, less than 25%. On larger systems with 32, 64, or more cores, such completely IO-bottlenecked processes will be all but invisible, generating single-digit IOWait percentages. 

As such, high IOWait shows that many processes in the system are waiting on disk I/O, but even with low IOWait, disk I/O may still be the bottleneck for some processes on the system.

If IOWait is unreliable, what can you use instead to give you better visibility? 

First, look at application-specific observability.  The application, if it is well instrumented, tends to know best whether it is bound by the disk and which particular tasks are I/O-bound. 

If you only have access to Linux metrics, look at the “b” column in vmstat, which corresponds to processes blocked on disk I/O. It will show such processes even when a concurrent CPU-intensive load masks IOWait:

CPU intensive load will mask IOWait

Finally, you can look at per-process statistics to see which processes are waiting for disk I/O. For Percona Monitoring and Management, you can install a plugin as described in the blog post Understanding Processes Running on Linux Host with Percona Monitoring and Management.

With this extension, we can clearly see which processes are runnable (running or blocked on CPU availability) and which are waiting on disk I/O!
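If you only have a shell at hand, a quick way to see the same thing is to list tasks in uninterruptible sleep (state “D”), which is the population the vmstat “b” column counts; a small example:

# Processes currently in uninterruptible sleep - usually waiting on disk I/O
ps -eo state,pid,comm | awk '$1=="D"'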

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

 

Download Percona Monitoring and Management Today

Apr
17
2023
--

MongoDB Best Practices: Security, Data Modeling, & Schema Design

MongoDB best practices

In this blog post, we will discuss best practices for the MongoDB ecosystem, applied at the Operating System (OS) and MongoDB levels. We’ll also go over some best practices for MongoDB security as well as MongoDB data modeling. The main objective of this post is to share my experience over the past years tuning MongoDB and to consolidate the diverse sources I came across along the way into a single place.

Spoiler alert: This post focuses on MongoDB 3.6.X series and higher since previous versions have reached End-of-Life (EOL).

Note that the intent of tuning the settings is not exclusively about improving performance but also enhancing the high availability and resilience of the MongoDB database.

Without further ado, let’s start with the OS settings.

Operating System (OS) settings

Swappiness

Swappiness is a Linux kernel setting, ranging from 0 to 100, that influences the behavior of the Virtual Memory manager when it needs to allocate swap space. A setting of “0” tells the kernel to swap only to avoid out-of-memory problems; a setting of 100 tells it to swap aggressively to disk. The Linux default is usually 60, which is not ideal for database usage.

It is common to see a value of “0” (or sometimes “10”) on database servers, telling the kernel to avoid swapping as much as possible for better response times. However, Ovais Tariq details a known bug (or feature) when using a setting of “0”.

So it is recommended to set it to “1”. To change the swappiness value at runtime:
# Non-persistent - the value will revert on reboot
echo 1 > /proc/sys/vm/swappiness

And to persist it across reboots, add the setting to /etc/sysctl.conf and load it:

# Add this line to /etc/sysctl.conf so the change persists across reboots
vm.swappiness = 1

# Apply the settings from /etc/sysctl.conf without rebooting
sudo sysctl -p

NUMA architecture

Non-uniform memory access (NUMA) is a memory design in which a symmetric multiprocessing (SMP) system processor can access its local memory faster than non-local memory (memory assigned locally to other CPUs).  Here is an example of a system that has NUMA enabled:
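You can inspect the NUMA topology and per-node free memory on your own system with numactl (output varies per system and is omitted here):

# Show NUMA nodes, the CPUs attached to each, and free memory per node
numactl --hardware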

As we can see, node 0 has more free memory than node 1. This imbalance causes an issue where the OS swaps even with memory available. The swap issue is explained in Jeremy Cole’s excellent article, Swap Insanity and NUMA Architecture. The article focuses on MySQL, but it is valid for MongoDB as well.

 

Unfortunately, MongoDB is not NUMA-aware, and because of this, it can allocate memory unevenly, leading to the swap issue even with memory available. To solve this, the mongod process can use interleaved mode (fair memory allocation across all nodes) in one of two ways:

Start the mongod process with numactl --interleave=all :

numactl --interleave=all /usr/bin/mongod -f /etc/mongod.conf

Or if systemd is in use:

# Edit the file
/etc/systemd/system/multi-user.target.wants/mongod.service

If the existing ExecStart statement reads:

ExecStart=/usr/bin/mongod --config /etc/mongod.conf

Update that statement to read:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --config /etc/mongod.conf

Apply the change to systemd:

sudo systemctl daemon-reload

Restart any running mongod instances:

sudo systemctl stop mongod
sudo systemctl start mongod

And to validate the memory usage:

$ sudo numastat -p $(pidof mongod)

Per-node process memory usage (in MBs) for PID 35172 (mongod)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                        19.40           27.36           46.77
Stack                        0.03            0.03            0.05
Private                      1.61           24.23           25.84
----------------  --------------- --------------- ---------------
Total                       21.04           51.62           72.66

zone_reclaim_mode

In some OS versions, the vm.zone_reclaim_mode is enabled. The zone_reclaim_mode parameter allows someone to set more or less aggressive approaches to reclaim memory when a zone runs out of memory. If it is set to zero, then no zone reclaim occurs.

It is necessary to disable vm.zone_reclaim_mode when NUMA is enabled.  To disable it, you can execute the following command:

sudo sysctl -w vm.zone_reclaim_mode=0
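As with swappiness, sysctl -w only changes the running kernel. A small sketch of making the setting persistent (the file name under /etc/sysctl.d is just an example):

# Persist across reboots
echo "vm.zone_reclaim_mode = 0" | sudo tee /etc/sysctl.d/90-zone-reclaim.conf
sudo sysctl -p /etc/sysctl.d/90-zone-reclaim.conf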

IO scheduler

The IO scheduler is the algorithm the kernel uses to commit reads and writes to disk. By default, most Linux installs use the CFQ (Completely Fair Queuing) scheduler. CFQ works well for many general use cases but lacks latency guarantees. Two other schedulers are deadline and noop. The deadline scheduler excels at latency-sensitive use cases (like databases), and noop is closer to no scheduling at all. For bare metal, either deadline or noop (the performance difference between them is imperceptible) will be better than CFQ.


If you are running MongoDB inside a VM (which has its own IO scheduler beneath it), it is best to use “noop” and let the virtualization layer take care of the IO scheduling itself.

To change it, run these commands as root (adjusting for the appropriate disk device):

# Verifying
$ cat /sys/block/xvda/queue/scheduler
noop [deadline] cfq

# Adjusting the value dynamically
$ echo "noop" > /sys/block/xvda/queue/scheduler

To make this change persistent, you must edit the GRUB configuration file (usually /etc/sysconfig/grub ) and add an elevator option to GRUB_CMDLINE_LINUX.  For example, you would change this line:

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200

To this line:

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 elevator=noop"

Note for AWS setups: There are cases where the I/O scheduler has a value of none, most notably in AWS VM instance types where EBS volumes are exposed as NVMe block devices. The setting has no use for modern PCIe/NVMe devices because they have a substantial internal queue and bypass the IO scheduler altogether. In this case, none is the optimal setting for such disks.

 Transparent Huge Pages

Databases use small memory pages, and Transparent Huge Pages tend to become fragmented and impact performance.

To disable it on the runtime for RHEL/CentOS 6 and 7:

$ echo "never" > /sys/kernel/mm/transparent_hugepage/enabled
$ echo "never" > /sys/kernel/mm/transparent_hugepage/defrag

To make this change survive a server restart, you’ll have to add the flag transparent_hugepage=never  to your kernel options (/etc/sysconfig/grub):

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 elevator=noop transparent_hugepage=never"

Rebuild the /boot/grub2/grub.cfg  file by running the grub2-mkconfig -o command. Before rebuilding the GRUB2 configuration file, ensure to take a backup of the existing /boot/grub2/grub.cfg.

On BIOS-based machines

$ grub2-mkconfig -o /boot/grub2/grub.cfg

Troubleshooting

If Transparent Huge Pages (THP) is still not disabled, continue and use the option below:

Disable tuned services

If the tuned service is re-enabling THP, disable it using the commands below:

$ systemctl stop tuned
$ systemctl disable tuned

Dirty ratio

The dirty_ratio is the percentage of total system memory that can hold dirty pages. The default on most Linux hosts is between 20 and 30%. When you exceed the limit, the dirty pages are committed to disk, creating a small pause. To avoid the hard pause, there is a second ratio, dirty_background_ratio (default 10-15%), which tells the kernel to start flushing dirty pages to disk in the background without any pause.

20-30% is a good general default for “dirty_ratio,” but on large-memory database servers, this can be a lot of memory. For example, on a 128GB memory host, this can allow up to 38.4GB of dirty pages. The background ratio won’t kick in until 12.8GB. It is recommended to lower this setting and monitor the impact on query performance and disk IO. The goal is to reduce memory usage without impacting query performance negatively.

A recommended setting for dirty ratios on large-memory (64GB+) database servers is: vm.dirty_ratio = 15 and vm.dirty_background_ratio = 5, or possibly less. (Red Hat recommends lower ratios of 10 and 3 for high-performance/large-memory servers.)

You can set this by adding the following lines to the /etc/sysctl.conf:

vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
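To load the new values from /etc/sysctl.conf without a reboot and confirm they took effect:

$ sudo sysctl -p
$ sysctl vm.dirty_ratio vm.dirty_background_ratio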

Filesystems mount options

MongoDB recommends the XFS filesystem for on-disk database data. Furthermore, proper mount options can improve performance noticeably. Make sure the drives are mounted with the noatime option and, if the drives are behind a RAID controller, that it has an appropriate battery-backed cache. It is possible to remount on the fly; for example, to remount /mnt/db/ with these options:

mount -o remount,rw,noatime /mnt/db

It is necessary to add/edit the corresponding line in /etc/fstab for the option to persist on reboots. For example:

UUID=f41e390f-835b-4223-a9bb-9b45984ddf8d /                       xfs     rw,noatime,attr2,inode64,noquota        0 0
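To confirm which options are actually in effect for the data volume (the /mnt/db path follows the earlier example):

# Show the effective mount options for the mount point
findmnt -no TARGET,OPTIONS /mnt/db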

Unix ulimit settings

Most UNIX-like operating systems, including Linux and macOS, provide ways to limit and control the usage of system resources such as threads, files, and network connections on a per-process and per-user basis. These “ulimits” prevent single users from using too many system resources. Sometimes, these limits have low default values that can cause several issues during regular MongoDB operations. For Linux distributions that use systemd, you can specify limits within the [Service] section of the unit file:

First, open the file for editing:

vi /etc/systemd/system/multi-user.target.wants/mongod.service

And under [Service] add the following:

# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (locked-in-memory size)
LimitMEMLOCK=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000

To adjust for the user:

# Edit the file below
/etc/security/limits.conf

And add to the user that is starting the mongod process:

# In this example, the user is mongo
mongo hard cpu  unlimited
mongo soft cpu  unlimited
mongo hard memlock unlimited
mongo soft memlock unlimited
mongo hard nofile 64000
mongo soft nofile 64000
mongo hard nproc 192276
mongo soft nproc 192276
mongo hard fsize unlimited
mongo soft fsize unlimited
mongo hard as unlimited
mongo soft as unlimited
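To confirm which limits the running mongod process actually received after these changes, you can read them from procfs:

# Effective limits of the running mongod process
cat /proc/$(pidof mongod)/limits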

To improve performance, we can safely set the limit of processes for the super-user root to be unlimited. Edit the .bashrc file and add the following line:

# vi /root/.bashrc
ulimit -u unlimited

Exit and re-login from the terminal for the change to take effect.

Network stack

Several of the Linux kernel’s default network tunings are either not optimal for MongoDB, limit a typical host with 1000mbps (or better) network interfaces, or cause unpredictable behavior with routers and load balancers. I suggest increasing the relatively low throughput settings (net.core.somaxconn and net.ipv4.tcp_max_syn_backlog) and decreasing the keepalive settings, as shown below.

Make these changes permanent by adding the following to /etc/sysctl.conf (or a new file /etc/sysctl.d/mongodb-sysctl.conf – if /etc/sysctl.d exists):

net.core.somaxconn = 4096
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_probes = 6

Note: you must run the command /sbin/sysctl -p as root/sudo (or reboot) to apply this change.

NTP daemon

All of these deeper tunings make it easy to forget about something as simple as your clock source. As MongoDB is typically deployed as a cluster, it relies on consistent time across nodes. Thus, the NTP daemon should run permanently on all MongoDB hosts, mongos and arbiter nodes included.

This is installed on RedHat/CentOS with the following:

sudo yum install ntp
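After installing it, make sure the daemon is enabled and running (the unit name ntpd applies to the ntp package on RedHat/CentOS):

sudo systemctl enable ntpd
sudo systemctl start ntpd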

MongoDB settings

Journal commit interval

Values can range from 1 to 500 milliseconds (default is 200 ms). Lower values increase the durability of the journal at the expense of disk performance. Since MongoDB usually works with a replica set, it is possible to increase this parameter to get better performance:

# edit /etc/mongod.conf
storage:
  journal:
    enabled: true
    commitIntervalMs: 300

WiredTiger cache

For dedicated servers, it is possible to increase the WiredTiger (WT) cache. By default, it uses 50% of (RAM - 1 GB). Set the value to 60-70% and monitor memory usage. For example, to set the WT cache to 50GB:

# edit /etc/mongod.conf
wiredTiger:
   engineConfig:
      cacheSizeGB: 50
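If a restart is not convenient, the cache can also be resized on a running instance through the wiredTigerEngineRuntimeConfig server parameter; a hedged sketch via the mongo shell:

# Resize the WiredTiger cache to 50GB at runtime (no restart required)
mongo admin --eval 'db.adminCommand({setParameter: 1, wiredTigerEngineRuntimeConfig: "cache_size=50G"})'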

If there’s a monitoring tool in place, such as Percona Monitoring and Management (Get Percona Monitoring and Management for MongoDB), it is possible to monitor the memory usage.

Read/Write tickets

WiredTiger uses tickets to control the number of read/write operations simultaneously processed by the storage engine. The default value is 128 and works well for most cases, but in some cases, the number of tickets is not enough. To adjust it:

use admin
db.adminCommand( { setParameter: 1, wiredTigerConcurrentReadTransactions: 256 } )
db.adminCommand( { setParameter: 1, wiredTigerConcurrentWriteTransactions: 256 } )

https://docs.mongodb.com/manual/reference/parameters/#wiredtiger-parameters

To make it persistent, add it to the MongoDB configuration file:

# Two options below can be used for wiredTiger and inMemory storage engines
setParameter:
    wiredTigerConcurrentReadTransactions: 256
    wiredTigerConcurrentWriteTransactions: 256

And to estimate the right value, it is necessary to observe the workload behavior. Again, PMM is suitable for this situation.

Note that sometimes increasing the level of parallelism might have the opposite of the desired effect on an already loaded server. At that point, it might be necessary to reduce the number of tickets to the number of CPUs/vCPUs available (if the server has 16 cores, set the read and write tickets to 16 each). This parameter needs to be tested extensively!

Pitfalls for mongos in containers

The mongos process is not cgroups-aware, which means it can blow up CPU usage by creating tons of TaskExecutor threads. Secondly, grouping containers in Kubernetes Pods creates tons of mongos processes, resulting in additional overhead. The same applies to automation (using Ansible, for example), where DevOps engineers tend to create a pool of mongos processes.

To avoid pool explosion, set the parameter taskExecutorPoolSize in the containerized mongos by running it with the following argument or setting this parameter in a configuration file: --setParameter taskExecutorPoolSize=X, where X is the number of CPU cores you assign to the container (for example, ‘CPU limits’ in Kubernetes or ‘cpuset/cpus’ in Docker). For example:

$ /opt/mongodb/4.0.6/bin/mongos --logpath /home/vinicius.grippa/data/data/mongos.log --port 37017 --configdb configRepl/localhost:37027,localhost:37028,localhost:37029 --keyFile /home/vinicius.grippa/data/data/keyfile --fork --setParameter taskExecutorPoolSize=1

Or using the configuration file:

setParameter:
     taskExecutorPoolSize: 1

Additional MongoDB Best Practices for Security and data modeling

MongoDB security: Enable authorization and authentication on your database from deployment

When MongoDB is initially deployed, it does not enable authentication or authorization by default, making data susceptible to unauthorized access and potential data leaks. However, by activating these security features, users must authenticate themselves before accessing the database, thus reducing the likelihood of unauthorized access and data breaches. Enabling authentication and authorization from the outset is crucial to safeguarding data and can help prevent security vulnerabilities from being exploited.

MongoDB security: Take regular MongoDB backups

Regular backups are essential for MongoDB, as they serve as a safeguard against potential data loss caused by various factors, such as system failures, human errors, or malicious attacks. MongoDB backup methods include full, incremental, and continuous backups. Percona Backup for MongoDB is a custom-built backup utility designed to cater to the needs of users who don’t want to pay for proprietary software but require a fully supported community backup tool capable of performing cluster-wide consistent backups in MongoDB.

Regular backups can significantly reduce the potential downtime and losses resulting from data loss, enabling organizations to restore their data promptly after a failure. Moreover, backups play a crucial role in data security by mitigating the risks of data breaches and other vulnerabilities. By restoring the database to its previous state using backups, organizations can minimize the impact of an attack and quickly resume their operations.

MongoDB security: Monitor MongoDB performance regularly

As with any database, it’s essential to monitor MongoDB performance to ensure it’s running with optimal efficiency and effectiveness. By monitoring MongoDB, database administrators can get insights into potential bottlenecks, pinpoint sluggish queries, and improve application performance, as well as proactively address issues before they become critical, reducing downtime and ensuring a better user experience.

MongoDB databases that are left unmonitored are susceptible to various security threats, including, but not limited to, unauthorized access, data breaches, and data loss. If, for instance, an unsecured MongoDB instance is infiltrated by a malicious attacker, they may be able to steal sensitive information or even completely delete the data. Additionally, unmonitored databases can lead to security vulnerabilities that may be exploited.

In addition, unmonitored databases can also result in non-compliance with regulatory standards such as the General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA). These regulations require organizations to protect user data and take measures to ensure data privacy.

Percona Monitoring and Management (PMM) is an open source database observability, monitoring, and management tool that provides actionable performance data for MongoDB variants, including Percona Server for MongoDB, MongoDB Community, and MongoDB Enterprise. Learn how you can Improve MongoDB Performance with Percona Monitoring and Management.

MongoDB data modeling Best Practices

MongoDB data modeling: Understand schema differences from relational databases

The traditional approach in designing a database schema in relational databases involves creating tables that store data in a tabular format. In this approach, each table represents an entity, and the relationships between the entities are defined through foreign key constraints. This schema design approach ensures data consistency and enables complex queries involving multiple tables, but it can be rigid and may pose scaling challenges as data volume increases.

On the other hand, MongoDB schema design takes a document-oriented approach. Instead of storing data in tables, data is stored in documents that can be nested to represent complex relationships, providing more flexibility in data modeling. This approach enables faster queries, as related data can be retrieved with a single query. However, it can lead to the storage of redundant data and make it more difficult to maintain data consistency.

Deciding between these two schema design approaches depends on the specific requirements of the application. For applications with complex relationships between entities, a relational database may be more suitable. However, for applications that require more flexibility in data modeling and faster query performance, MongoDB’s document-oriented approach may be the better option.

MongoDB data modeling: Understand embedding vs. referencing data

MongoDB offers two strategies for storing related data: embedding and referencing. Embedding involves storing associated data within a single document, while referencing involves storing related data in separate documents and using a unique identifier to link them.

The key advantage of utilizing embedding is that it minimizes the necessity for multiple queries to retrieve associated data. Since all of the related data is stored in one document, a query for that particular document will enable the retrieval of all associated data simultaneously.

Embedding further simplifies database operations, reduces network overhead, and boosts read and query performance. However, it is worth noting that embedding may lead to the creation of large documents, which can negatively impact write performance and consume more memory.

Alternatively, referencing provides greater flexibility in querying and updating related data. It results in the creation of smaller documents, which can benefit write performance and memory usage. Still, it does require multiple queries to retrieve related data, which can affect read performance.

MongoDB data modeling: Use replication or sharding when scaling

Replication and sharding are techniques used to enhance performance and ensure the high availability of MongoDB databases.

Replication in MongoDB involves the utilization of replica sets, with one functioning as the primary and the others as secondary. The primary instance receives write operations, while the secondary instances replicate the data from the primary and can also be used for read operations. In case the primary instance fails, one of the secondary instances becomes the primary instance, providing redundancy and guaranteeing database high availability.

Sharding is the process of partitioning data across multiple servers in a cluster. In MongoDB, sharding is achieved by creating shards, each of which contains a subset of the data, which are then distributed across multiple machines in a cluster, with each machine hosting one or more shards. This allows MongoDB to scale horizontally, handling large datasets and high traffic loads.

Replication and sharding are good options to consider when your MongoDB database becomes slow. Replication is particularly useful for read-heavy workloads, while sharding is more suited to write-heavy workloads. When dealing with both read and write-heavy workloads, a combination of the two may be the best option, depending on the specific needs and demands of your application(s).

 

Ensure data availability for your applications with Percona Distribution for MongoDB

 

Stay on top of MongoDB Best Practices with Percona Monitoring and Management

I tried to cover and summarize the most common questions and incorrect settings that I see in daily activities. Using the recommended settings is not a silver bullet, but it will cover the majority of cases and help provide a better experience for those who use MongoDB. Finally, having a proper monitoring system in place must be a priority so you can adjust settings according to the application workload.

Percona’s open source database monitoring tool monitors the health of your database infrastructure and helps you improve MongoDB performance. Download today.

 

Download Percona Monitoring and Management


May
10
2022
--

Spring Cleaning: Discontinuing RHEL 6/CentOS 6 (glibc 2.12) and 32-bit Binary Builds of Percona Software

Discontinuing RHEL 6/CentOS 6

As you are probably aware, Red Hat Enterprise Linux 6 (RHEL 6 or EL 6 in short) officially reached “End of Life” (EOL) on 2020-11-30 and is now in the so-called Extended Life Phase, which basically means that Red Hat will no longer provide bug fixes or security fixes.

Even though EL 6 and its compatible derivatives like CentOS 6 had reached EOL some time ago already, we continued providing binary builds for selected MySQL-related products for this platform.

However, this became increasingly difficult, as the MySQL code base continued to evolve and now depends on tools and functionality that are no longer provided by the operating system out of the box. This meant we already had to perform several modifications in order to prepare binary builds for this platform, e.g. installing custom compiler versions or newer versions of various system libraries.

As of MySQL 8.0.26, Oracle announced that they deprecated the TLSv1 and TLSv1.1 connection protocols and plan to remove these in a future MySQL version in favor of the more secure TLSv1.2 and TLSv1.3 protocols. TLSv1.3 requires that both the MySQL server and the client application be compiled with OpenSSL 1.1.1 or higher. This version of OpenSSL is not available in binary package format on EL 6 anymore, and manually rebuilding it turned out to be a “yak shaving exercise” due to the countless dependencies.

Our build & release team was able to update the build environments on all of our supported platforms (EL 7, EL 8, supported Debian and Ubuntu versions) for this new requirement. However, we have not been successful in getting all the required components and their dependencies to build on EL 6, as it would have required rebuilding quite a significant amount of core OS packages and libraries to achieve this.

Moreover, switching to this new OpenSSL version would have also required us to include some additional shared libraries in our packages to satisfy the runtime dependencies, adding more complexity and potential security issues.

In general, we believe that running a production system on an OS that is no longer actively supported by a vendor is not a recommended best practice from a security perspective, and we do not want to encourage such practices.

Because of these reasons and to simplify our build/release and QA processes, we decided to drop support for EL 6 for all products now. Percona Server for MySQL 8.0.27 was the last version for which we built binaries for EL 6 against the previous version of OpenSSL.

Going forward, the following products will no longer be built and released on this platform:

  • Percona Server for MySQL 5.7 and 8.0
  • Percona XtraDB Cluster 5.7
  • Percona XtraBackup 2.4 and 8.0
  • Percona Toolkit 3.2

This includes stopping both building RPM packages for EL 6 and providing binary tarballs that are linked against glibc 2.12.

Note that this OS platform was also the last one on which we still provided 32-bit binaries.

Most Enterprise Linux distributions stopped providing 32-bit versions of their operating systems quite some time ago. As an example, Red Hat Enterprise Linux 7 (released in June 2014) was the first release to no longer support installing directly on 32-bit Intel/AMD hardware (i686/x86). Back in 2018, we had already decided that we would no longer offer 32-bit binaries on new platforms or new major releases of our software.

Given today’s database workloads, we also think that 32-bit systems are simply not adequate anymore, and we already stopped building newer versions of our software for this architecture.

The demand for 32-bit downloads has also been declining steadily. A recent analysis of our download statistics revealed that only 2.3% of our total binary downloads were for i386 binaries. Looking at IP addresses, these downloads originated from just 0.4% of the total range of addresses.

This change affects the following products:

  • Percona Server for MySQL 5.7
  • Percona XtraDB Cluster 5.7
  • Percona XtraBackup 2.4
  • Percona Toolkit

We’ve updated the Percona Release Lifecycle Overview web page accordingly to reflect this change. Previously released binaries for these platforms and architectures will of course remain accessible from our repositories.

If you’re still running EL 6 or a 32-bit database or OS, we strongly recommend upgrading to a more modern platform. Our Percona Services team would be happy to help you with that!

Aug
27
2021
--

Linux 5.14 set to boost future enterprise application security

Linux is set for a big release this Sunday August 29, setting the stage for enterprise and cloud applications for months to come. The 5.14 kernel update will include security and performance improvements.

A particular area of interest for both enterprise and cloud users is always security, and to that end, Linux 5.14 will help with several new capabilities. Mike McGrath, vice president of Linux Engineering at Red Hat, told TechCrunch that the kernel update includes a feature known as core scheduling, which is intended to help mitigate processor-level vulnerabilities like Spectre and Meltdown, which first surfaced in 2018. One of the ways that Linux users have had to mitigate those vulnerabilities is by disabling hyper-threading on CPUs and therefore taking a performance hit. 

“More specifically, the feature helps to split trusted and untrusted tasks so that they don’t share a core, limiting the overall threat surface while keeping cloud-scale performance relatively unchanged,” McGrath explained.

Another area of security innovation in Linux 5.14 is a feature that has been in development for over a year and a half and will help protect system memory better than before. Attacks against Linux and other operating systems often target memory as a primary attack surface to exploit. With the new kernel, there is a capability known as memfd_secret() that will enable an application running on a Linux system to create a memory range that is inaccessible to anyone else, including the kernel.

“This means cryptographic keys, sensitive data and other secrets can be stored there to limit exposure to other users or system activities,” McGrath said.

At the heart of the open source Linux operating system that powers much of the cloud and enterprise application delivery is what is known as the Linux kernel. The kernel is the component that provides the core functionality for system operations. 

The Linux 5.14 kernel release has gone through seven release candidates over the last two months and benefits from the contributions of 1,650 different developers. Those who contribute to Linux kernel development include individual contributors as well as large vendors like Intel, AMD, IBM, Oracle and Samsung. One of the largest contributors to any given Linux kernel release is IBM’s Red Hat business unit. IBM acquired Red Hat for $34 billion in a deal that closed in 2019.

“As with pretty much every kernel release, we see some very innovative capabilities in 5.14,” McGrath said.

While Linux 5.14 will be out soon, it often takes time until it is adopted inside of enterprise releases. McGrath said that Linux 5.14 will first appear in Red Hat’s Fedora community Linux distribution and will be a part of the future Red Hat Enterprise Linux 9 release. Gerald Pfeifer, CTO for enterprise Linux vendor SUSE, told TechCrunch that his company’s openSUSE Tumbleweed community release will likely include the Linux 5.14 kernel within ‘days’ of the official release. On the enterprise side, he noted that SUSE Linux Enterprise 15 SP4, due next spring, is scheduled to come with Kernel 5.14. 

The new Linux update follows a major milestone for the open source operating system, as it was 30 years ago this past Wednesday that creator Linus Torvalds (pictured above) first publicly announced the effort. Over that time Linux has gone from being a hobbyist effort to powering the infrastructure of the internet.

McGrath commented that Linux is already the backbone for the modern cloud and Red Hat is also excited about how Linux will be the backbone for edge computing – not just within telecommunications, but broadly across all industries, from manufacturing and healthcare to entertainment and service providers, in the years to come.

The longevity and continued importance of Linux for the next 30 years is assured in Pfeifer’s view.  He noted that over the decades Linux and open source have opened up unprecedented potential for innovation, coupled with openness and independence.

“Will Linux, the kernel, still be the leader in 30 years? I don’t know. Will it be relevant? Absolutely,” he said. “Many of the approaches we have created and developed will still be pillars of technological progress 30 years from now. Of that I am certain.”

 

 

Aug
17
2021
--

First Packages for Debian 11 “bullseye” Now Available

Percona Debian 11 bullseye

Over the weekend, the Debian project announced the availability of their newest major distribution release, Debian 11 (code name “bullseye”). We’d like to congratulate the Debian project and the open source community for achieving this major milestone! Over two years in the making, it contains an impressive amount of new and updated software for a wide range of applications (check out the release notes for details). The project’s emphasis on providing a stable Linux operating system makes Debian Linux a preferred choice for database workloads.

The packaging, release, and QA teams here at Percona have been working on adding support for Debian 11 to our products for quite some time already.

This week, we’ve released Percona Server for MongoDB 4.4.8 and 5.0.2 (Release Candidate) as well as Percona Backup for MongoDB 1.6.0, including packages for Debian 11 as a new supported OS platform. Please follow the installation instructions in the respective product documentation to install these versions.

We’ve also rebuilt a number of previously released products on Debian 11. At this point, the following products are available for download from our “testing” package repositories:

  • Percona Server for MySQL 5.7 and 8.0
  • Percona XtraDB Cluster 5.7 and 8.0
  • Percona XtraBackup 2.4 and 8.0

As usual, you can use the percona-release tool to enable the testing repository for these products. Please follow the installation instructions on how to install the tool and proceed.

As an example, if you’d like to install the latest version of Percona Server for MySQL 8.0 on Debian 11, perform the following steps after completing the installation of the base operating system and installing the percona-release tool:

$ sudo percona-release enable ps-80 testing
$ sudo apt update
$ sudo apt install percona-server-server percona-server-client
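Once the packages are installed, the server can be started and its status checked (on Debian and Ubuntu the packages register the service under the name mysql; verify the unit name on your system):

$ sudo systemctl start mysql
$ sudo systemctl status mysql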

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Aug
10
2021
--

VCs are betting big on Kubernetes: Here are 5 reasons why

I worked at Google for six years. Internally, you have no choice — you must use Kubernetes if you are deploying microservices and containers (it’s actually not called Kubernetes inside of Google; it’s called Borg). But what was once solely an internal project at Google has since been open-sourced and has become one of the most talked about technologies in software development and operations.

For good reason. One person with a laptop can now accomplish what used to take a large team of engineers. At times, Kubernetes can feel like a superpower, but with all of the benefits of scalability and agility comes immense complexity. The truth is, very few software developers truly understand how Kubernetes works under the hood.

I like to use the analogy of a watch. From the user’s perspective, it’s very straightforward until it breaks. To actually fix a broken watch requires expertise most people simply do not have — and I promise you, Kubernetes is much more complex than your watch.

How are most teams solving this problem? The truth is, many of them aren’t. They often adopt Kubernetes as part of their digital transformation only to find out it’s much more complex than they expected. Then they have to hire more engineers and experts to manage it, which in a way defeats its purpose.

Where you see containers, you see Kubernetes to help with orchestration. According to Datadog’s most recent report about container adoption, nearly 90% of all containers are orchestrated.

All of this means there is a great opportunity for DevOps startups to come in and address the different pain points within the Kubernetes ecosystem. This technology isn’t going anywhere, so any platform or tooling that helps make it more secure, simple to use and easy to troubleshoot will be well appreciated by the software development community.

In that sense, there’s never been a better time for VCs to invest in this ecosystem. It’s my belief that Kubernetes is becoming the new Linux: 96.4% of the top million web servers’ operating systems are Linux. Similarly, Kubernetes is trending to become the de facto operating system for modern, cloud-native applications. It is already the most popular open-source project within the Cloud Native Computing Foundation (CNCF), with 91% of respondents using it — a steady increase from 78% in 2019 and 58% in 2018.

While the technology is proven and adoption is skyrocketing, there are still some fundamental challenges that will undoubtedly be solved by third-party solutions. Let’s go deeper and look at five reasons why we’ll see a surge of startups in this space.

 

Containers are the go-to method for building modern apps

Docker revolutionized how developers build and ship applications. Container technology has made it easier to move applications and workloads between clouds. It also provides as much resource isolation as a traditional hypervisor, but with considerable opportunities to improve agility, efficiency and speed.

Jul
01
2021
--

Installing Percona Server for MySQL on Rocky Linux 8

MySQL on Rocky Linux 8

With the CentOS project switching its focus to CentOS Stream, one of the alternatives that aims to function as a downstream build (building and releasing packages after they’re released by Red Hat) is Rocky Linux. This how-to shows how to install Percona Server for MySQL 8.0 on the Rocky Linux distribution.

You can get the information on the distribution release version by checking the /etc/redhat-release file:

[root@rocky ~]# cat /etc/redhat-release
Rocky Linux release 8.4 (Green Obsidian)

Installing and Setting up the Percona Server for MySQL 8.0  Repository

Download and install the percona-release repository package for Red Hat Linux and derivatives:

[root@rocky ~]# yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm

This should result in:

…

Verifying        : percona-release-1.0-26.noarch                                                                                                                                                                                                                                   1/1
Installed:

  percona-release-1.0-26.noarch

Complete!

Once the repository package is installed, you should set up the Percona Server for MySQL 8.0 repository by running:

[root@rocky ~]#  percona-release setup ps80

Please note that you’ll be prompted to disable the mysql module to install Percona Server packages:

* Disabling all Percona Repositories

On RedHat 8 systems it is needed to disable dnf mysql module to install Percona-Server

Do you want to disable it? [y/N] y

Disabling dnf module...

Percona Release release/noarch YUM repository    6.3 kB/s | 1.6 kB     00:00

Dependencies resolved.

=============================================================================

Package        Architecture       Version      Repository           Size

=============================================================================
Disabling modules:

mysql




Transaction Summary

==============================================================================
Complete!

dnf mysql module was disabled

* Enabling the Percona Server 8.0 repository

* Enabling the Percona Tools repository

<*> All done!

Installing and Setting up the Percona Server for MySQL 8.0 Binaries

This part is also covered in the Percona Server for MySQL documentation.

1. Installing the latest Percona Server 8.0 binaries:

[root@rocky ~]# yum -y install percona-server-server

This will also install all the required dependencies:

Installed:

compat-openssl10-1:1.0.2o-3.el8.x86_64           
libaio-0.3.112-1.el8.x86_64           
percona-server-client-8.0.23-14.1.el8.x86_64           
percona-server-server-8.0.23-14.1.el8.x86_64           
percona-server-shared-8.0.23-14.1.el8.x86_64           
percona-server-shared-compat-8.0.23-14.1.el8.x86_64

Complete!

2. After installation is done, you can start the mysqld service:

[root@rocky ~]# systemctl start mysqld

3. Once the service is running you can check the status by running:

[root@rocky ~]# systemctl status mysqld

You should get similar output to:

● mysqld.service - MySQL Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)

 Active: active (running) since Mon 2021-06-28 10:23:22 UTC; 6s ago
 Docs: man:mysqld(8)

http://dev.mysql.com/doc/refman/en/using-systemd.html
Process: 37616 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
Main PID: 37698 (mysqld)
Status: "Server is operational"
Tasks: 39 (limit: 23393)
Memory: 450.7M
CGroup: /system.slice/mysqld.service
└─37698 /usr/sbin/mysqld

Jun 28 10:23:12 rocky systemd[1]: Starting MySQL Server...
Jun 28 10:23:22 rocky systemd[1]: Started MySQL Server
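Before connecting for the first time, retrieve the temporary root password generated during initialization (MySQL 8.0-based builds write it to the error log; the path below assumes the default configuration):

[root@rocky ~]# grep 'temporary password' /var/log/mysqld.log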

From this process, we can see that the installation on Rocky Linux is the same as installing Percona Server for MySQL on CentOS/Red Hat.

Apr
22
2021
--

Understanding Processes Running on Linux Host with Percona Monitoring and Management

processes linux host percona monitoring

A few years ago, I wrote about how to add information about processes to your Percona Monitoring and Management (PMM) instance, as well as some helpful ways you can use this data.

Since that time, PMM has released a new major version (PMM v2) and the Process Exporter went through many changes, so it’s time to provide some updated instructions.

Why Bother?

Why do you need per-process data for your database hosts, to begin with? I find this data very helpful, as it allows us to validate how much activity and load is caused by the database process rather than something else. This “something else” may range from a backup process that takes too much CPU, to a usually benign system process that went crazy today, or even a crypto miner that was “helpfully” installed on your system. Rather than simply assuming that all the load you observe on the system comes from the database process (which may be correct in most cases but can also lead you astray), you need to be able to verify that.

Installation 

Installing this process-monitoring awesomeness consists of two parts.  You install an exporter on every node on which you want to monitor process information, and then you install a dashboard onto your PMM server to visualize this data. External Exporter support was added in PMM 2.15, so you will need at least this version for these commands to work.

Installing The Exporter

The commands below will download and install the Prometheus Process Exporter and configure PMM to consume the data generated from it.

wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.5/process-exporter_0.7.5_linux_amd64.deb
dpkg -i process-exporter_0.7.5_linux_amd64.deb
service process-exporter start
pmm-admin add external --group=processes  --listen-port=9256

Note: Different versions of Process Exporter may also work, but this particular version is what I tested with.
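To confirm the exporter is up before relying on the dashboard, you can query its metrics endpoint directly (9256 is the default port used in the pmm-admin command above):

curl -s http://localhost:9256/metrics | head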

Installing the Dashboard

The easiest way to install a dashboard is from the Grafana.com Dashboard Library. In your Percona Monitoring and Management install, click the “+” sign in the toolbar on the left side and select “Import”.

 

percona monitoring and management grafana dashboard

 

Enter Dashboard ID 14239 and you are good to go.

If you’re looking for ways to automate this import process as you are provisioning PMM automatically, you can do that too. Just follow the instructions in the Automate PMM Dashboard Importing blog post.

Understanding Processes Running on your Linux Machine

Let’s now move to the most fun part, looking at the available dashboards and what they can tell us about the running system and how they can help with diagnostics and troubleshooting. In the new dashboard, which I updated from an older PMMv1 version, I decided to add relevant whole-system metrics which can help us to put the process metrics in proper context.

 

node processes percona monitoring and management

 

The CPU-focused row shows us how system CPU is used overall and to what extent the system or some CPU cores are overloaded, as well as top consumers of the “User” and “System” CPU Modes.

Note, because of additional MetricsQL functionality provided by VictoriaMetrics, we can show [other] as the total resource usage by processes that did not make it to the top.

How do you use this data?  Check if the processes using CPU resources are those which you would expect or if there are any processes that you did not expect to see taking as much CPU as they actually do.

 

pmm memory

 

Memory Utilization does the same, but for memory. There are a number of different memory metrics which can be a bit intimidating.

Resident Memory means the memory the process (or, technically, the group of processes) takes in physical RAM.  “Proportional” refers to the method by which this consumption is counted. A single page in RAM is sometimes shared by multiple processes, and Proportional means it is divided up among all processes sharing it when memory allocation is accounted for, rather than counted as part of every process. This ensures there is no double counting, and you should not see the total size of Resident memory for your processes well in excess of the physical memory you have.


Used Memory means the space the process consumes in RAM plus the space it consumes in swap. Note, this metric is different from Virtual Memory, which also includes virtual space that was assigned to the process but never actually allocated.

I find these two metrics the most practical for understanding how physical and virtual memory is actually used by the processes on the system.

 

resident and used memory

 

Virtual Memory is the virtual address space that was allocated to the process. In some cases, it will be close to the memory used, as with the mysqld process; in other cases, it may be very different. For example, the dockerd process running on this system takes 5GB of virtual memory but less than 70MB of actual memory.

Swapped Memory shows us which processes are swapped out and by how much.  I would pay special attention to this graph because if the Swap Activity panel shows serious IO going on, system performance might be significantly impacted. If unused processes, or even some unused portions of processes, are swapped out, it is not a problem. However, if you have half of MySQL’s buffer pool swapped out and heavy swap IO going on… you have work to do.

 

Process Disk IO Usage

 

Process Disk IO Usage allows us to see IO bandwidth and latency for the system overall, as well as the read and write bandwidth used by different processes.  If there are any unexpected disk IO bandwidth consumers, you will easily spot them using this dashboard.

 

processes Context Switches

 

Context Switches provide more details on what kind of context switches are happening in the system and what processes they correspond to.

A high number of Voluntary Context Switches (hundreds of thousands and millions per second) may indicate heavy contention, or it may just correspond to a high number of requests being served by the process, as in many architectures starting/stopping request handling requires a context switch.

A high number of Non-Voluntary Context Switches, on the other hand, can correspond to not having enough CPU available with processes moved off CPU by the scheduler when they have exceeded their allotted time slice, or for other reasons.

 

global file descriptors

 

File Descriptors show us the global limit of the file descriptors allowed in the operating system as well as for individual processes.  Running out of file descriptors for a whole system is really bad, as you will have many things start failing at random. Although on modern, powerful systems, the limit is so high you rarely hit this problem.

The per-process limit on open files still applies, so it is very helpful to see which processes require a lot of file descriptors (by number) and also how this number compares to the total number of descriptors allowed for the process.  In our case, we can see that no process ever allocated more than 7% of the file descriptors allowed, which is quite healthy.

 

Major and Minor page faults

 

This graph shows Major and Minor page faults for given processes.

Major page faults are relatively expensive, typically causing disk IO when they happen.

Minor page faults are less expensive, corresponding to accessing pages that are not mapped into the given process’s address space but are otherwise in memory; they only require a switch to kernel mode and some housekeeping by the kernel.

See more details on Minor/Major page faults and general Linux Memory Management here.

 

Processes in Linux

 

Processes in Linux can cycle through different statuses; for us, the most important ones to consider are “Active” statuses which are either “Running” or “Waiting on Disk IO”.  These roughly can be seen as using CPU and Disk IO resources.

In this section, we can see an overview of the number of running and waiting processes in the system (basically the same stuff “r” and “b” columns in vmstat show), as well as more detailed stats showing which processes, in particular, were running… or waiting on disk IO.

 

process kernel waits

 

While we can see what is going on with Active Processes by looking at their statuses, this section shows us what is going on with sleeping processes; in particular, which kernel functions they are sleeping in.  We can see data grouped by the name of the function in which the wait happens, or by the function and process name pair.

If you want to focus on what types of kernel functions a given process is waiting on, you can select it in the dashboard dropdown to filter data just by this process. For example, selecting “mysqld”, I see:

 

kernel wait details

 

Finally, we have the panel which shows the processes based on their uptime.

 

processes uptime

 

This can be helpful to spot whether any processes were started recently. Frankly, I do not find this panel to be the most useful, but since Process Exporter captures this data, why not?

Summary

Process Exporter provides great insights on running processes, in addition to what basic PMM installation provides.  Please check it out and let us know how helpful it is in your environment.  Should we consider enabling it by default in Percona Monitoring and Management?

Apr
15
2021
--

Platform End of Support Announcement for Ubuntu 16.04 LTS

EOL Ubuntu 16.04

The End of Support date for Ubuntu 16.04 LTS is coming soon. According to the Ubuntu Release Life Cycle, it will be at the end of April 2021. This announcement has some implications for the support of Percona software running on this operating system.

As a result, we will no longer be producing new packages and binary builds for Ubuntu 16.04.

We generally align our platform end-of-life/support dates with those of the upstream platform vendor. The platform end-of-life/support dates are published in advance on our website on the Percona Software support life cycle page.

According to our policies, Percona will continue to provide operational support for your databases on Ubuntu 16.04. However, we will be unable to provide any bug fixes, builds, or OS-level assistance if you encounter an issue outside the database itself.

Each platform vendor has a supported migration or upgrade path to their next major release. Please reach out to us if you need assistance in migrating your database to your vendor’s supported platform – Percona will be happy to assist you.

Apr
06
2021
--

Esri brings its flagship ArcGIS platform to Kubernetes

Esri, the geographic information system (GIS), mapping and spatial analytics company, is hosting its (virtual) developer summit today. Unsurprisingly, it is making a couple of major announcements at the event that range from a new design system and improved JavaScript APIs to support for running ArcGIS Enterprise in containers on Kubernetes.

The Kubernetes project was a major undertaking for the company, Esri Product Managers Trevor Seaton and Philip Heede told me. Traditionally, like so many similar products, ArcGIS was architected to be installed on physical boxes, virtual machines or cloud-hosted VMs. And while it doesn’t really matter to end-users where the software runs, containerizing the application means that it is far easier for businesses to scale their systems up or down as needed.

Esri ArcGIS Enterprise on Kubernetes deployment

Esri ArcGIS Enterprise on Kubernetes deployment. Image Credits: Esri

“We have a lot of customers — especially some of the larger customers — that run very complex questions,” Seaton explained. “And sometimes it’s unpredictable. They might be responding to seasonal events or business events or economic events, and they need to understand not only what’s going on in the world, but also respond to their many users from outside the organization coming in and asking questions of the systems that they put in place using ArcGIS. And that unpredictable demand is one of the key benefits of Kubernetes.”

Deploying Esri ArcGIS Enterprise on Kubernetes

Deploying Esri ArcGIS Enterprise on Kubernetes. Image Credits: Esri

The team could have chosen to go the easy route and put a wrapper around its existing tools to containerize them and call it a day, but as Seaton noted, Esri used this opportunity to re-architect its tools and break them down into microservices.

“It’s taken us a while because we took three or four big applications that together make up [ArcGIS] Enterprise,” he said. “And we broke those apart into a much larger set of microservices. That allows us to containerize specific services and add a lot of high availability and resilience to the system without adding a lot of complexity for the administrators — in fact, we’re reducing the complexity as we do that and all of that gets installed in one single deployment script.”

While Kubernetes simplifies a lot of the management experience, a lot of companies that use ArcGIS aren’t yet familiar with it. And as Seaton and Heede noted, the company isn’t forcing anyone onto this platform. It will continue to support Windows and Linux just like before. Heede also stressed that it’s still unusual — especially in this industry — to see a complex, fully integrated system like ArcGIS being delivered in the form of microservices and multiple containers that its customers then run on their own infrastructure.

Image Credits: Esri

In addition to the Kubernetes announcement, Esri also today announced new JavaScript APIs that make it easier for developers to create applications that bring together Esri’s server-side technology and the scalability of doing much of the analysis on the client-side. Back in the day, Esri would support tools like Microsoft’s Silverlight and Adobe/Apache Flex for building rich web-based applications. “Now, we’re really focusing on a single web development technology and the toolset around that,” Esri product manager Julie Powell told me.

A bit later this month, Esri also plans to launch its new design system to make it easier and faster for developers to create clean and consistent user interfaces. This design system will launch April 22, but the company already provided a bit of a teaser today. As Powell noted, the challenge for Esri is that its design system has to help the company’s partners put their own style and branding on top of the maps and data they get from the ArcGIS ecosystem.

 
