Feb
25
2019
--

MySQL Challenge: 100k Connections

thread pools MySQL 100k connections

In this post, I want to explore a way to establish 100,000 connections to MySQL. Not just idle connections, but executing queries.

100,000 connections. Is that really needed for MySQL, you may ask? Although it may seem excessive, I have seen a lot of different setups in customer deployments. Some deploy an application connection pool, with 100 application servers and 1,000 connections in each pool. Some applications use a “re-connect and repeat if the query is too slow” technique, which is a terrible practice. It can lead to a snowball effect, and could establish thousands of connections to MySQL in a matter of seconds.

So now I want to set an overachieving goal and see if we can achieve it.

Setup

For this I will use the following hardware:

Bare metal server provided by packet.net, instance size: c2.medium.x86
Physical Cores @ 2.2 GHz
(1 X AMD EPYC 7401P)
Memory: 64 GB of ECC RAM
Storage : INTEL® SSD DC S4500, 480GB

This is a server grade SATA SSD.

I will use five of these boxes, for the reason explained below. One box for the MySQL server and four boxes for client connections.

For the server I will use Percona  Server for MySQL 8.0.13-4 with the thread pool plugin. The plugin will be required to support the thousands of connections.

Initial server setup

Network settings (Ansible format):

- { name: 'net.core.somaxconn', value: 32768 }
- { name: 'net.core.rmem_max', value: 134217728 }
- { name: 'net.core.wmem_max', value: 134217728 }
- { name: 'net.ipv4.tcp_rmem', value: '4096 87380 134217728' }
- { name: 'net.ipv4.tcp_wmem', value: '4096 87380 134217728' }
- { name: 'net.core.netdev_max_backlog', value: 300000 }
- { name: 'net.ipv4.tcp_moderate_rcvbuf', value: 1 }
- { name: 'net.ipv4.tcp_no_metrics_save', value: 1 }
- { name: 'net.ipv4.tcp_congestion_control', value: 'htcp' }
- { name: 'net.ipv4.tcp_mtu_probing', value: 1 }
- { name: 'net.ipv4.tcp_timestamps', value: 0 }
- { name: 'net.ipv4.tcp_sack', value: 0 }
- { name: 'net.ipv4.tcp_syncookies', value: 1 }
- { name: 'net.ipv4.tcp_max_syn_backlog', value: 4096 }
- { name: 'net.ipv4.tcp_mem', value: '50576   64768 98152' }
- { name: 'net.ipv4.ip_local_port_range', value: '4000 65000' }
- { name: 'net.ipv4.netdev_max_backlog', value: 2500 }
- { name: 'net.ipv4.tcp_tw_reuse', value: 1 }
- { name: 'net.ipv4.tcp_fin_timeout', value: 5 }

These are the typical settings recommended for 10Gb networks and high concurrent workloads.

Limits settings for systemd:

[Service]
LimitNOFILE=1000000
LimitNPROC=500000

And the relevant setting for MySQL in my.cnf:

back_log=3500
max_connections=110000

For the client I will use sysbench version 0.5 and not 1.0.x, for the reasons explained below.

The workload is

sysbench --test=sysbench/tests/db/select.lua --mysql-host=139.178.82.47 --mysql-user=sbtest --mysql-password=sbtest --oltp-tables-count=10 --report-interval=1 --num-threads=10000 --max-time=300 --max-requests=0 --oltp-table-size=10000000 --rand-type=uniform --rand-init=on run

Step 1. 10,000 connections

This one is very easy, as there is not much to do to achieve this. We can do this with only one client. But you may face the following error on the client side:

FATAL: error 2004: Can't create TCP/IP socket (24)

This is caused by the open file limit, which is also a limit of TCP/IP sockets. This can be fixed by setting  

ulimit -n 100000

  on the client.

The performance we observe:

[  26s] threads: 10000, tps: 0.00, reads: 33367.48, writes: 0.00, response time: 3681.42ms (95%), errors: 0.00, reconnects:  0.00
[  27s] threads: 10000, tps: 0.00, reads: 33289.74, writes: 0.00, response time: 3690.25ms (95%), errors: 0.00, reconnects:  0.00

Step 2. 25,000 connections

With 25,000 connections, we hit an error on MySQL side:

Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug

If you try to lookup information on this error you might find the following article:  https://www.percona.com/blog/2013/02/04/cant_create_thread_errno_11/

But it does not help in our case, as we have all limits set high enough:

cat /proc/`pidof mysqld`/limits
Limit                     Soft Limit Hard Limit           Units
Max cpu time              unlimited  unlimited            seconds
Max file size             unlimited  unlimited            bytes
Max data size             unlimited  unlimited            bytes
Max stack size            8388608    unlimited            bytes
Max core file size        0          unlimited            bytes
Max resident set          unlimited  unlimited            bytes
Max processes             500000     500000               processes
Max open files            1000000    1000000              files
Max locked memory         16777216   16777216             bytes
Max address space         unlimited  unlimited            bytes
Max file locks            unlimited  unlimited            locks
Max pending signals       255051     255051               signals
Max msgqueue size         819200     819200               bytes
Max nice priority         0          0
Max realtime priority     0          0
Max realtime timeout      unlimited unlimited            us

This is where we start using the thread pool feature:  https://www.percona.com/doc/percona-server/8.0/performance/threadpool.html

Add:

thread_handling=pool-of-threads

to the my.cnf and restart Percona Server

The results:

[   7s] threads: 25000, tps: 0.00, reads: 33332.57, writes: 0.00, response time: 974.56ms (95%), errors: 0.00, reconnects:  0.00
[   8s] threads: 25000, tps: 0.00, reads: 33187.01, writes: 0.00, response time: 979.24ms (95%), errors: 0.00, reconnects:  0.00

We have the same throughput, but actually the 95% response time has improved (thanks to the thread pool) from 3690 ms to 979 ms.

Step 3. 50,000 connections

This is where we encountered the biggest challenge. At first, trying to get 50,000 connections in sysbench we hit the following error:

FATAL: error 2003: Can't connect to MySQL server on '139.178.82.47' (99)

Error (99) is cryptic and it means: Cannot assign requested address.

It comes from the limit of ports an application can open. By default on my system it is

cat /proc/sys/net/ipv4/ip_local_port_range : 32768   60999

This says there are only 28,231 available ports — 60999 minus 32768 — or the limit of TCP connections you can establish from or to the given IP address.

You can extend this using a wider range, on both the client and the server:

echo 4000 65000 > /proc/sys/net/ipv4/ip_local_port_range

This will give us 61,000 connections, but this is very close to the limit for one IP address (maximal port is 65535). The key takeaway from here is that if we want more connections we need to allocate more IP addresses for MySQL server. In order to achieve 100,000 connections, I will use two IP addresses on the server running MySQL.

After sorting out the port ranges, we hit the following problem with sysbench:

sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 50000
FATAL: pthread_create() for thread #32352 failed. errno = 12 (Cannot allocate memory)

In this case, it’s a problem with sysbench memory allocation (namely lua subsystem). Sysbench can allocate memory for only 32,351 connections. This is a problem which is even more severe in sysbench 1.0.x.

Sysbench 1.0.x limitation

Sysbench 1.0.x uses a different Lua JIT, which hits memory problems even with 4000 connections, so it is impossible to go over 4000 connection in sysbench 1.0.x

So it seems we hit a limit with sysbench sooner than with Percona Server. In order to use more connections, we need to use multiple sysbench clients, and if 32,351 connections is the limit for sysbench, we have to use at least four sysbench clients to get up to 100,000 connections.

For 50,000 connections I will use 2 servers (each running separate sysbench), each running 25,000 threads from sysbench.

The results for each sysbench looks like:

[  29s] threads: 25000, tps: 0.00, reads: 16794.09, writes: 0.00, response time: 1799.63ms (95%), errors: 0.00, reconnects:  0.00
[  30s] threads: 25000, tps: 0.00, reads: 16491.03, writes: 0.00, response time: 1800.70ms (95%), errors: 0.00, reconnects:  0.00

So we have about the same throughput (16794*2 = 33588 tps in total), however the 95% response time doubled. This is to be expected as we are using twice as many connections compared to the 25,000 connections benchmark.

Step 3. 75,000 connections

To achieve 75,000 connections we will use three servers with sysbench, each running 25,000 threads.

The results for each sysbench:

[ 157s] threads: 25000, tps: 0.00, reads: 11633.87, writes: 0.00, response time: 2651.76ms (95%), errors: 0.00, reconnects:  0.00
[ 158s] threads: 25000, tps: 0.00, reads: 10783.09, writes: 0.00, response time: 2601.44ms (95%), errors: 0.00, reconnects:  0.00

Step 4. 100,000 connections

There is nothing eventful to achieve75k and 100k connections. We just spin up an additional server and start sysbench. For 100,000 connections we need four servers for sysbench, each shows:

[ 101s] threads: 25000, tps: 0.00, reads: 8033.83, writes: 0.00, response time: 3320.21ms (95%), errors: 0.00, reconnects:  0.00
[ 102s] threads: 25000, tps: 0.00, reads: 8065.02, writes: 0.00, response time: 3405.77ms (95%), errors: 0.00, reconnects:  0.00

So we have the same throughput (8065*4=32260 tps in total) with 3405ms 95% response time.

A very important takeaway from this: with 100k connections and using a thread pool, the 95% response time is even better than for 10k connections without a thread pool. The thread pool allows Percona Server to manage resources more efficiently and provides better response times.

Conclusions

100k connections is quite achievable for MySQL, and I am sure we could go even further. There are three components to achieve this:

  • Thread pool in Percona Server
  • Proper tuning of network limits
  • Using multiple IP addresses on the server box (one IP address per approximately 60k connections)

Appendix: full my.cnf

[mysqld]
datadir {{ mysqldir }}
ssl=0
skip-log-bin
log-error=error.log
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
character_set_server=latin1
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
innodb_undo_log_truncate=off
# general
table_open_cache = 200000
table_open_cache_instances=64
back_log=3500
max_connections=110000
# files
innodb_file_per_table
innodb_log_file_size=15G
innodb_log_files_in_group=2
innodb_open_files=4000
# buffers
innodb_buffer_pool_size= 40G
innodb_buffer_pool_instances=8
innodb_log_buffer_size=64M
# tune
innodb_doublewrite= 1
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit= 0
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=2048
innodb_page_cleaners=4
join_buffer_size=256K
sort_buffer_size=256K
innodb_use_native_aio=1
innodb_stats_persistent = 1
#innodb_spin_wait_delay=96
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_io_capacity=1500
innodb_io_capacity_max=2500
innodb_purge_threads=4
innodb_adaptive_hash_index=0
max_prepared_stmt_count=1000000
innodb_monitor_enable = '%'
performance_schema = ON

Jan
29
2014
--

Percona Server: Thread Pool Improvements for Transactional Workloads

In a previous thread pool post, I mentioned that in Percona Server we used an open source implementation of MariaDB’s thread pool, and enhanced/improved it further. Below I would like to describe some of these improvements for transactional workloads.

When we were evaluating MariaDB’s thread pool implementation, we observed that it improves scalability for AUTOCOMMIT statements. However, it does not scale well with multi-statement transactions. The UPDATE_NO_KEY test which was run as an AUTOCOMMIT statement and inside a transaction gave the following results:

thread_pool.p2.mariadb.v3

After analysis, we identified the major cause of that inefficiency: High latency between individual statements inside transactions. This looked pretty similar to the case when transactions are executed in parallel without thread pool. Latency there is high as well, though the cause of high latency in these two cases is different.

  • In the “one-thread-per-connection” case, with 1000 connections, higher latency is caused by increased contention on accessing MySQL server shared resources like structures/locks/etc.
  • In the case of “pool-of-threads”, 1000 client connections will be organized into thread_pool_size groups (or to be more specific into thread_pool_size queues), and latency here comes not from contention as we have much smaller number of parallel threads. It comes from the execution order of individual statements in transactions. Suppose that you have 100 identical transactions (each with 4 statements in it) in the thread group queue. As transactions are processed and executed sequentially, statements of transaction T1 will be placed at the following positions in the thread pool queue: 1…101…201…301…401. So in case of a uniform workload distances between statements in the transaction will be 100. This way transaction T1 may hold server resources during execution of all statements in the thread pool queue between 1 and 401, and that has a negative impact on performance.

In an ideal world, the number of concurrent transactions does not matter, as long as we keep the number of concurrent statements sufficiently low. Reality is different though. An open transaction which is not currently executing a statement may still block other connections by holding metadata or row-level locks. On top of that, any MVCC implementation should examine states of all open transactions and thus may perform less efficiently with large numbers of transactions (we blogged about InnoDB-specific problems here and here).

In order to help execute transactions as fast as possible we introduced high and low priority queues for thread pool. Now with default thread pool settings, we check every incoming statement, and if it is from an already started transaction we put it into the the high priority queue, otherwise it will go in the low priority queue.

high_priority_diagram.v2

Such reordering allowed to notably reduce latency and resulted in very good scalability up to a very high number of threads. You can find details about this optimization in our documentation.

Now let’s check how these introduced changes will affect the workload we mentioned earlier in this article.

The next graphs show the results for the UPDATE_NO_KEY test that was run as an AUTOCOMMIT statement, and inside a transaction for MariaDB with thread_pool and Percona Server with the thread_pool_high_priority mode=statements – which is very similar to behavior of thread_pool in MariaDB and Percona Server with thread_pool_high_priority mode=transactions – optimization that performs statements reordering of the transactions in the thread pool queue.

thread_pool.p2.mariadb.percona.v2
This works even more efficiently on larger transactions like OLTP_RW from sysbench. See the graphs below for the same set of servers:

IO bound: sysbench dataset 32 tables/12M rows each (~100GB), InnoDB buffer pool=25GB
thread_pool.p2.io_bound

As seen, we get nearly flat throughput with thread_pool_high_prio_mode=transactions even with very high numbers of users connections.

The post Percona Server: Thread Pool Improvements for Transactional Workloads appeared first on MySQL Performance Blog.

Mar
16
2013
--

SimCity outages, traffic control and Thread Pool for MySQL

For this post I’m going to shamelessly exploit the litany of technical problems SimCity players encountered earlier this month and a few examples of how Thread Pool for MySQL and Percona Server for MySQL can help to prevent such incidents.

Users of SimCity, a city-building and urban planning simulation video game, encountered network outages, issues with saving progress and problems connecting to the game’s servers following a new release a couple of weeks ago featuring a new engine allowing for more detailed simulation than previous games. During this same time, we happened to be testing the Thread Pool feature in Percona Server for MySQL, and I saw a connection of how Thread Pool is supposed to help in such cases.

Basically SimCity users faced the same situation San Francisco Bay Area commuters face every day — traffic gridlock, probably worst in the United States.
4649108931_f672383ec1_o
(the image is taken from izahorsky flickr)

There is also a well-known dependency from the queuing theory that shows the same thing: Sketch15521334-1
The more users who are trying to access a shared resource, the worse the response time is, growing exponentially.

To make things even worse, very often we see the following pattern with throughput.
Sketch15521334-2
Basically the more users who are coming, the less requests/sec the system is able to handle. The reason is that working processes require internal communication and synchronization. At some point the system becomes unresponsive and practically stuck. Well-known DDoS attacks are based on this behavior — to load a system with so many requests that it is not able to proceed.

What is the solution here? If you allow me again an analogy with traffic: you have probably at some point encountered “on-ramp metering” solutions, traffic lights that throttle how many vehicles can enter a highway from an on-ramp during peak traffic times:
Ramp Metering
The intention of this solution is to provide smooth throughput when you are already on the highway.

The similar idea is behind Thread Pool for MySQL. We want to provide this kind of throughput:
Sketch15521334-4

This is achieved by limiting the amount of working threads inside MySQL. So Thread Pool for MySQL does not improve the performance magically by itself, but it is rather a protection for cases when MySQL may become suddenly overloaded.

The Thread Pool feature is available in Percona Server 5.5.30 and Percona Server 5.6.10-alpha and there are some reports from users with visible improvements, i.e:
“Percona thread pool in the real world, tremendously improves lock convoy issues: http://buff.ly/12FUiYf , Activated on 7th around noon”
MySQL Performance

While a similar plugin is available from Oracle by MySQL Enterprise subscription, you can get it and try it with Percona Server for MySQL right now for free.

UPDATE from 3/21/2013 on Credits: The current implementation of Thread Pool in Percona Server is taken from MariaDB and was developed by Vladislav Vaintroub. The commercial implementation is developed in Oracle/MySQL and I do not know who original author is.

The post SimCity outages, traffic control and Thread Pool for MySQL appeared first on MySQL Performance Blog.

Oct
27
2010
--

MySQL Limitations Part 4: One thread per connection

This is the third in a series on what’s seriously limiting MySQL in core use cases (links: part 1, 2, 3). This post is about the way MySQL handles connections, allocating one thread per connection to the server.

MySQL is a single process with multiple threads. Not all databases are architected this way; some have multiple processes that communicate through shared memory or other means. It’s cheap to create a connection to MySQL, because it just requires creating a thread (or taking one from a cache). This is generally so fast that there isn’t really the need for connection pools as there is with other databases, at least not in the same way. Windows in particular has had excellent threading support practically forever; Linux has very good threading now, but that wasn’t always the case.

However, many development environments and programming languages really want a connection pool. They’re just built that way (I’m looking at you, Java). And many others use persistent connections by default, so that a connection isn’t really closed when it’s closed; it’s kind of like a connection pool, except that the connection is persisted from request to request within the same process, rather than being shared with whichever request needs a connection.

Connection pools and persistent connections combined with a large number of application servers can lead to a situation where the database server has a very large number of connections open to it, most of which are doing nothing. It’s not uncommon for me to see a server with 1000 to 5000 connections open, and maybe one to three are actually running queries on average. These connections originate from dozens to hundreds of application server instances. When you have a heavily sharded or otherwise horizontally scaled application, it’s not only easy to get into this pickle, it’s really hard or impossible to avoid it.

And with 5000 connections open, you get 5000 threads in the server. That increases the overhead from thread scheduling, and potentially memory usage as well. I feel like I’m forgetting some reasons that this matters — please fill in whatever’s missing in the comments.

There can be more than one solution to this problem, but the one that’s actually partially implemented is a pool of threads, which was originally coded for MySQL 6.0, but is available now in MariaDB.

Unfortunately it isn’t a full solution, because it can cause undesirable lock-out or waiting, and the specific implementation has a scalability bottleneck on multicore servers. Mark Callaghan has done much more investigation of the pool of threads than I have. There are more details in this blog post by Mark, and two followup blog posts from Tim Cook (1, 2).

Thanks for the great comments on the last post. Some of them were good guesses. Remember that the context for this series isn’t micro-limitations or edge-case badness (even if they are serious in some cases), but rather a focus on shortcomings in the main use cases for the server. There are a lot of things MySQL doesn’t do well, but it doesn’t matter that much, because that’s not what it’s designed for. Wrong tool, wrong use, NotABug. I’m thinking of the lack of sort-merge joins or intra-query parallelism, for example. It would be lovely to have those things, if you’re running a data warehouse on MySQL, and in some cases for other uses too (note that most databases that do have these query plans usually try to use nested-loop joins whenever possible, because of things like the lower startup cost for the query). But MySQL isn’t a data warehouse DBMS first and foremost. It’s a general-purpose OLTP database server that runs well on affordable hardware and is great for Web usage. It’s so good, in fact, that it can be used for tons of other things such as… data warehousing. But it isn’t a Netezza or Paraccel, and if it were, it wouldn’t be a great OLTP web database too.

MySQL replication is one of the core, fundamental features — and it’s single-threaded and relies on the binary log, which are two major limitations. And it has subqueries, which are a core, fundamental part of SQL — but it’s bad at certain kinds of them. That’s why I listed those as major limitations. And because MySQL is a multi-threaded database for Web usage that tends to be used in sharded environments with tons of application servers, which creates a situation with many thousands of connections to the database, and because it doesn’t handle that very well, I list its one-thread-per-connection design as a serious limitation.


Entry posted by Baron Schwartz |
13 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com