Can’t Create Thread: Errno 11 (A Tale of Two Ulimits)

Recently some of my fellow Perconians and I have noticed a bit of an uptick in customer cases featuring the following error message:

SQLSTATE[HY000] [1135] Can't create a new thread (errno 11); if you are not 
out of available memory, you can consult the manual for a possible OS-dependent bug.

The canonical solution to this issue, if you do a bit of Googling, is to increase the number of processes / threads available to the MySQL user, typically by adding a line like this to /etc/security/limits.conf:

mysql  soft  nproc  4096

followed up by a restart of MySQL in a fresh user session. If you’re running RHEL/CentOS 5, Ubuntu, or versions of Fedora prior to Fedora 9, this solution will most likely solve your problem. But with RHEL/CentOS 6 (and derivatives) and Fedora 9 and later, you’ll likely find that this tweak no longer works.

In Fedora 9 and RHEL 6, a “bug fix” was introduced which was ostensibly intended to make forkbombing the system more difficult. [If you’re interested in the actual RedHat Bugzilla entry, you can read more about it here.] The result of this so-called “fix” was the introduction of a new PAM limits file, 90-nproc.conf, which appeared in the /etc/security/limits.d folder and contained the following:

# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     1024

Just looking at this file, it seems innocuous enough, except that there’s one ever-so-small problem (aside from even knowing to look for this file in the first place!)

The 90-nproc.conf file OVERRIDES any process/thread limit changes that you may have made to /etc/security/limits.conf!

So even if you’ve properly adjusted /etc/security/limits.conf and set an explicit limit for nproc greater than 1024 for the “mysql” user, those limits will not be honored because of the existence of this file. If you actually want to increase the thread/process limit for the mysqld process under RHEL 6+ / Fedora 9+, you’ll actually need to do one of the following:

  1. Put your nproc changes directly in this file.
  2. Delete the file entirely (although it might get replaced by a future PAM upgrade).
  3. Add a new file in /etc/security/limits.d with a higher number (e.g., 91-mysql.conf) and set your limits in there – this is the solution that I’d recommend.

But wait, as they say during late-night television infomercials, there’s more!

One thing that people often forget when it comes to adjusting nproc/nofile limits in /etc/security/limits.conf or even this new 90-nproc.conf file is that these files are used ONLY to control the behavior of the PAM user limits module, pam_limits.so, which is only applicable to USER sessions. The implication here is that instances of services that are started at boot time by SysV initscripts are not affected by changes to these limits files.[1] Conversely, it also means that once you restart a process from within a user session, that service then becomes subject to any PAM-based limits. The end result is a sort of Heisenbug, wherein the actual nproc limits that apply to your MySQL instance are contingent upon how/when it was launched, and if you don’t restart your MySQL server very often, this issue could go undetected for a long time.

To illustrate, we take a standard CentOS 6.3 server that has had no changes of any kind made to /etc/security/limits.conf or /etc/security/limits.d/90-nproc.conf – it’s just a standard default installation. If we do a fresh reboot of the machine, allow MySQL to start up as a normal part of the boot process, and then check the process limits, we find that “Max processes” appears to have plenty of headroom. If max_connections is set to 1200, we should have no problem accepting 1200 simultaneous connections without running into a “can’t create thread” error at 1024.

(root@revolution 23:53:20)# cat /proc/`pidof mysqld`/limits | egrep "(processes|files)"
Max processes      22888       22888       processes 
Max open files     6000        6000        files

But let’s say that a new version of MySQL comes out and we need to do an upgrade. Odds are that we’re not going to reboot the whole machine to do this upgrade; we’ll just install the fresh RPMs and restart mysqld. Or perhaps we had some other cause to restart the MySQL server; the actual reason doesn’t matter. But now take a look at what’s happened to the process/thread limit:

(root@revolution 23:59:37)# service mysql restart
Shutting down MySQL (Percona Server)... SUCCESS! 
Starting MySQL (Percona Server).. SUCCESS! 
(root@revolution 23:59:45)# cat /proc/`pidof mysqld`/limits | egrep "(processes|files)"
Max processes      1024       unlimited       processes 
Max open files     4206       4206            files

Now, once this server hits 1024 simultaneous connections, we’ll start hitting “can’t create thread.” If this restart came about as part of a MySQL version upgrade, we might incorrectly (but justifiably) think that we’d encountered a bug in the newly-upgraded version, only to be further confounded by the error’s failure to disappear upon rolling back to the previous “error-free” version. What makes it all that much worse is that once we encounter this issue, it would appear that the only way we can actually correct it is to bounce MySQL yet again, since the only way for changes to the PAM limits files to get refreshed are via the start of a new session (such as logging out and logging back in). If we have a high-traffic MySQL server with a very large InnoDB buffer pool, having to bounce mysqld a second time could mean further minutes or hours of unacceptable performance degradation while the buffer pool warms up. If only we had a way to modify these resource limits on the fly, without a MySQL server restart….

As it turns out, we do. On kernel 2.6.32 and later (and luckily for us, 2.6.32 just happens to be the kernel version in use by RHEL 6), there’s actually a very handy way to immediately increase the resource limits for a running process without the need for a restart and all of the potential unpleasantness that comes along with it.[2] We can simply run the following command, as root, replacing SOFT_LIMIT and HARD_LIMIT with our desired process/thread limits. For no limit, just use the word “unlimited”; otherwise these should be numeric values.

echo -n "Max processes=SOFT_LIMIT:HARD_LIMIT" > /proc/`pidof mysqld`/limits

So what have we learned here?

If you’re running a Fedora 9/RHEL 6 derivative, you’re most likely affected by this issue, and the complete fix for it comes in three steps.

  1. Use the aforementioned “echo” trick to ensure that the running server’s limits are sufficient.
  2. Edit 90-nproc.conf (or create 91-mysql.conf) to set the desired process/thread limits to handle any server restarts.
  3. Ensure that you’ve logged out / logged back in before doing a restart; even if you make the changes in steps a and b, you still need to make sure that your “service mysql restart” is executed from a session that has picked up the modified limits.

[1] systemd, which became the Fedora standard a couple of versions ago and will likely be the new way of doing things in RHEL 7, appears to resolve the issue. See here for additional details (Thanks to Raghavendra Prabhu for the link.)
[2] Credit to Ovais Tariq for telling me about this trick.

The post Can’t Create Thread: Errno 11 (A Tale of Two Ulimits) appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com