Mar
06
2015
--

What stopped MySQL? Tracing back signals sent to MySQL

Have you ever had a case where you needed to find a process which sent a HUP/KILL/TERM or other signal to your database? Let me rephrase. Did you ever have to find which process messed up your night? ;) If so, you might want to read on. I’m going to tell you how you can find it.

Granted, on small and/or meticulously managed systems tracking down the culprit is probably not a big deal. You can likely identify your process simply by checking what processes have enough privileges to send mysqld a HUP/KILL/TERM signal. However, frequently we see cases where this may not work or the elimination process would be too tedious to execute.

We recently had a case where a process was frequently sending SIGHUPs to mysqld and the customer asked us to see if we could get rid of his annoyance. This blog is the direct result of a discussion I had with my colleague Francisco Bordenave, on options available to deal with his issue. I’m only going to cover a few of them in this blog but I imagine that most of you will be able to find one that will work for your case. Note that most tracing tools add some overhead to the system being investigated. The tools presented in the following are designed to be lightweight so the impact should be well within acceptable range for most environments.

DISCLAIMER: While writing this blog I discovered that David Busby has also discussed one of the tools that I’m going to cover in his article. For those who have read the article note that I’m going to cover other tools as well and I will also cover a few extra SystemTap details in this blog. For those who haven’t yet had chance to read David’s blog, you can read it here.

All right, let’s see what “low hanging tools” there are available to us to deal with our issue!

Linux

  • SystemTap: widely available on Linux but usually not enabled by default. You need to install debuginfo and devel kernel packages and systemtap itself. Similar to DTrace.
  • Perf: although not quite written for generic tracing, due to its ability to trace system calls we can use it to our advantage if we trace sys_enter_sigkill.
  • Audit: generic system auditing platform. Given its nature, we can use it to track down many things, including rogue processes sending HUP signals to our poor mysqld!
  • Code!: Given that MySQL is opensource, you could customize the signal handler to obtain extra information. See more in sigaction(2) and the SA_SIGINFO flag. I’m not sure if this should be listed as a more efficient solution but it’s an option nevertheless. I guess one could also preload/inject his own singal handler via an LD_PRELOAD trick and a custom library but that’s beyond the scope what I intend to cover. However, for certain signals (most notably, SIGSEGV) you may not need to write your own tools as the OS may already come with libs/tools that can assist you. See Ulrich Drepper’s catchsegv or /usr/lib64/libSegFault.so, for instance.
  • Debuggers: These may be efficient to use in some cases but I won’t cover them this time, either.

FreeBSD/Solaris

  • DTrace: a very decent, stable tracing platform. Included in most recent kernels by default for the mentioned platforms (FreeBSD 9.2+, FreeBSD 10+, Solaris 10+).

In this article I’m going to focus on Linux as that’s what people in the MySQL community seem to care about most nowadays. The tools that I will discuss will be SystemTap, Perf and Audit. If you feel that you would like to read about the rest, let me know and I will cover the rest of the options in a followup article.

SystemTap

I’m going to set up SystemTap on a recent, 64 bit CentOS 7 box. I will only cover basic install, you can find more about how to install SystemTap here.

The strength of SystemTap is definitely its flexibility, potentially the best tool for solving our problem on the Linux platform. It’s been around for some time and is generally regarded mature but I would recommend to test your “tapscripts” in dev/qa before you run them in production.

Installing SystemTap

Follow below steps to install SystemTap:

[root@centos7]~# sed -i 's/enabled=0/enabled=1/' /etc/yum.repos.d/CentOS-Debuginfo.repo
[root@centos7]~# yum repolist
...
base-debuginfo/x86_64                         CentOS-7 - Debuginfo                                          1,688
...

[root@centos7]~# yum install kernel-debuginfo kernel-debuginfo-common kernel-devel
[root@centos7]~# yum install systemtap systemtap-runtime

Tracing with SystemTap

Create a tapscript like the one below:

[root@centos7]~# cat find_sighupper.stp
#!/usr/bin/stap
# Prints information on process which sent HUP signal to mysqld
probe begin {
  printf("%-26s %-8s %-5s %-8s %-5sn", "TIME", "SOURCE", "SPID", "TARGET", "TPID");
}
probe nd_syscall.kill.return {
  sname = @entry(execname());
  spid = @entry(pid());
  sig = @entry(uint_arg(2));
  tpid = @entry(uint_arg(1));
  tname = pid2execname(tpid);
  time = ctime(gettimeofday_s());
  if (sig == 1 && tname == "mysqld")
    printf("%-26s %-8s %-5d %-8s %-5dn", time, sname, spid, tname, tpid);
}

Then run the tap script in a dedicated terminal:

[root@centos7]~# stap find_sighupper.stp
TIME                       SOURCE   SPID  TARGET   TPID

Send your HUP signal to mysqld from another terminal:

[root@centos7]~# kill -1 1984

The culprit should will show up on your first window like so:

[root@centos7]~# stap find_sighupper.stp
TIME                       SOURCE   SPID  TARGET   TPID
Thu Feb 26 21:20:44 2015   kill     6326  mysqld   1984
^C

Note that with this solution I was able to define fairly nice constraints relatively easily. With a single probe (well, quasi, as @entry refers back to the callee) I was able to get all this information and filter out HUP signals sent to mysqld. No other filtering is necessary!

Perf

Perf is another neat tool to have. As its name implies, it was originally developed for lightweight profiling, to use the performance counters subsystem in Linux. It became fairly popular and got extended many times over these past years. Since it happens to have probes we can leverage, we are going to use it!

Installing Perf

As you can see, installing Perf is relatively simple.

# yum install perf

Start perf in a separate terminal window. I’m only going to run it for a minute but I could run it in screen for a longer period of time.

[root@centos7 ~]# perf record -a -e syscalls:sys_enter_kill sleep 60

In a separate terminal window send your test and obtain the results via “perf script”:

[root@centos7 ~]# echo $$
11380
[root@centos7 ~]# pidof mysqld
1984
[root@centos7 ~]# kill -1 1984
[root@centos7 ~]# perf script
# ========
# captured on: Thu Feb 26 14:25:02 2015
# hostname : centos7.local
# os release : 3.10.0-123.20.1.el7.x86_64
# perf version : 3.10.0-123.20.1.el7.x86_64.debug
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
# cpuid : GenuineIntel,6,70,1
# total memory : 1885464 kB
# cmdline : /usr/bin/perf record -a -e syscalls:sys_enter_kill sleep 60
# event : name = syscalls:sys_enter_kill, type = 2, config = 0x9b, config1 = 0x0, config2 = 0x0, excl_usr = 0, exc
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: software = 1, tracepoint = 2, breakpoint = 5
# ========
#
            bash 11380 [000]  6689.348219: syscalls:sys_enter_kill: pid: 0x000007c0, sig: 0x00000001

As you can see in above output process “bash” with pid of 11380 signalled pid 0x07c0 (decimal: 1984) a HUP signal (0x01). Thus, we found our culprit with this method as well.

Audit

You can read more about Audit in the Red Hat Security Guide.

Installing Audit

Depending on your OS installation, it may be already installed.

If case it is not, you can install it as follows:

[root@centos7 ~]# yum install audit

When you are done installing, start your trace and track 64 bit kill system calls that send HUP signals with signal ID of 1:

[root@centos7]~# auditctl -l
No rules
[root@centos7]~# auditctl -a exit,always -F arch=b64 -S kill -F a1=1
[root@centos7]~# auditctl -l
LIST_RULES: exit,always arch=3221225534 (0xc000003e) a1=1 (0x1) syscall=kill
[root@centos7]~# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=7010 rate_limit=0 backlog_limit=320 lost=0 backlog=0
[root@centos7]~# pidof mysqld
1984
[root@centos7]~# kill -1 1984
[root@centos7]~# tail -2 /var/log/audit/audit.log
type=SYSCALL msg=audit(1425007202.384:682): arch=c000003e syscall=62 success=yes exit=0 a0=7c0 a1=1 a2=a a3=7c0 items=0 ppid=11380 pid=3319 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="zsh" exe="/usr/bin/zsh" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=OBJ_PID msg=audit(1425007202.384:682): opid=1984 oauid=-1 ouid=995 oses=-1 obj=system_u:system_r:mysqld_t:s0 ocomm="mysqld"

As you can see from above output, the results showed up nicely in the system audit.log. From the log it’s clear that I sent my SIGHUP to mysqld (pid 1984, “opid” field) from zsh (see the command name in the “comm” field) via the 64 bit kill syscall. Thus, mischief managed, once again!

Summary

In this blog I presented you three different tools to help you trace down sources of signals. The three tools each have their own strengths. SystemTap is abundant of features and really nicely scriptable. The additional features of auditd may make it appealing to deploy to your host. Perf is a great tool for CPU profiling and you might want to install it solely for that reason. On the other hand, your distribution might not have support compiled in its kernel or may make the setup harder for given tool. In my experience most modern distributions support the tools discussed here so the choice comes down to personal preference or convenience.

In case you were wondering, I often pick auditd because it is often already installed. SystemTap might be a bit more complicated to setup but I would likely invest some extra time into the setup if my case is more complex. I primary use perf for CPU tracing and tend to think of the other two tools before I think of perf for tracing signals.

Hope you enjoyed reading! Happy [h/t]racking!

The post What stopped MySQL? Tracing back signals sent to MySQL appeared first on MySQL Performance Blog.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com