A few days ago I came across the announcement that MaxScale 1.3 finally supports Persistent Connections.

ProxySQL has supported Persistent Connections since it was a prototype (sometime back in 2013). I am therefore very happy that the MaxScale Team has finally introduced this long-awaited feature.

However, MaxScale’s implementation has a serious drawback that I consider a bug (more details towards the end of the article), one so serious that it shouldn’t reach any GA release.

I like running benchmarks, and given the new feature in MaxScale, I thought it a good time to compare ProxySQL vs MaxScale again. It’s been around 6 months since the previous benchmark.

ProxySQL vs MaxScale

ProxySQL vs MaxScale: The Benchmark

Benchmark setup

The benchmark setup is very simple:

  • a physical server (20 CPU cores) running sysbench, proxysql and maxscale
  • a physical server (20 CPU cores) running 3 mysqld instances (1 master and 2 slaves) launched using MySQL Sandbox

Software versions used:

  • MySQL 5.6.28
  • SysBench 0.4.12
  • ProxySQL v1.1.0 stable
  • MaxScale 1.3.0-beta

Configuration files can be found at the following URLs:

ProxySQL configuration was completed by running these commands through the admin interface:

INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES(1,1,'^SELECT.*FOR UPDATE', 1, 1);
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, apply) VALUES(2,1,'^SELECT', 2, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
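
For completeness, the two destination hostgroups referenced by these rules (1 for writes, 2 for reads) point to the master and the two slaves defined in the mysql_servers table. A minimal sketch, with placeholder hostnames and ports (the column layout follows ProxySQL's current admin schema and may differ slightly in v1.1.0):

INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (1, '<backend-ip>', <master-port>);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (2, '<backend-ip>', <slave1-port>);
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (2, '<backend-ip>', <slave2-port>);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;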

Preliminary benchmarks confirm the conclusion of my previous blog post: MaxScale is very CPU intensive. Therefore, to perform a fair comparison between the two proxies, I ran both of them with only 1 worker thread.
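
For reference, limiting each proxy to a single worker thread is a one-line setting in their respective configuration files. A sketch (in ProxySQL the same variable is exposed through the admin interface as mysql-threads and requires a restart to take effect; the exact syntax may differ slightly between versions):

# maxscale.cnf
[maxscale]
threads=1

# proxysql.cnf
mysql_variables =
{
    threads=1
}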

Benchmarks were executed by running sysbench against the local proxy (either ProxySQL or MaxScale), using this command:

sysbench --max-requests=0 --test=oltp --mysql-user=rcannao --mysql-password=rcannao \
--mysql-db=test --oltp-table-size=1000000 --oltp-read-only=on --oltp-point-selects=1 \
--oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 \
--oltp-skip-trx=off --db-ps-mode=disable --max-time=60 \
--oltp-reconnect-mode=transaction --mysql-host=10.1.1.164 \
--num-threads=$thr  --mysql-port=$port run

What is important to note is that the workload is read-only: it performs only point selects, and it reconnects at the end of each transaction. This workload is meant to measure the performance benefit of Persistent Connections.

The benchmark will compare:

  • ProxySQL configured with read/write split (RW)
  • MaxScale with readwritesplit module (RW)
  • MaxScale with readconnroute module (RR)
  • MaxScale with readwritesplit module and persistent connections (RW-PC)
  • MaxScale with readconnroute module and persistent connections (RR-PC); for these two, persistent connections are enabled per backend server, as sketched below
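
A minimal sketch of the MaxScale side (service and server names, credentials, addresses, and ports below are placeholders, not the actual configuration used in the benchmark):

# maxscale.cnf (sketch): readwritesplit service, its listener, and one backend server
[RW Split Router]
type=service
router=readwritesplit
# the RR configurations use router=readconnroute (with router_options=slave) instead
servers=server1,server2,server3
user=maxuser
passwd=maxpwd

[RW Split Listener]
type=listener
service=RW Split Router
protocol=MySQLClient
port=3307

[server1]
type=server
address=<backend-ip>
port=<master-port>
protocol=MySQLBackend
# the two settings below enable the persistent connection pool (the -PC variants)
persistpoolmax=250
persistmaxtime=3600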

Benchmark results

The benchmark result on throughput:

Benchmark results

This gives us a lot of information, but leaves some questions unanswered as well.

  • At very low concurrency, ProxySQL is slightly slower.
  • At 64 connections, ProxySQL and MaxScale RW-PC have very similar throughput. This is great since these two configurations have similar behaviour.
  • However, at 64 connections it seems that MaxScale without Persistent Connections has reached its maximum throughput. Seeing as throughput with Persistent Connections is higher, we can already conclude that this feature is indeed useful and improves performance. MaxScale RR-PC continues to give more throughput than the others, but this is expected, as this routing module is very simple.
  • At 256 connections, ProxySQL and MaxScale RR-PC are the only two whose throughput continues to grow. This means that the other configurations have saturated 1 core and are unable to scale any further. Meanwhile, ProxySQL continues to provide all of its features and scales on just 1 core.
  • At 1024 connections, all proxy configurations show a drop in performance. While the drop in ProxySQL is marginal, the drop in MaxScale is severe.

This confirms ProxySQL’s ability to scale.

What about response time?

Response Time

From this graph we notice that ProxySQL provides the best response time at high concurrency. Let’s remove the response time for 1024 connections and compare at lower concurrency:

Response time without 1024 connections

But what about maximum response time?

This is really interesting:


Maximum response time
No blue columns… did I forget to add ProxySQL’s response time? No, I didn’t: MaxScale’s max response time is so high that ProxySQL’s bars are barely visible on the same scale.

Starting at 64 connections, MaxScale’s maximum response time becomes so high that it reaches 60 seconds, which matches the max time configured in sysbench. The conclusion is that at least one connection (perhaps more) created by sysbench was unable to complete a transaction until the end of the benchmark.

This needs to be further validated by its developers, but it seems that at high concurrency (even though 64 connections shouldn’t be considered “high”) MaxScale processes only a subset of connections while completely ignoring others. If that’s correct, this should be considered a serious bug.

For further testing, I reran sysbench with 256 connections (not a lot at all!) against MaxScale for 600 seconds, and the max response time was 600 seconds. That means at least one transaction (maybe more) wasn’t processed until all the other transactions were completed.

I am sure nobody wants a transaction stuck for an undefined amount of time because the proxy is busy processing others.

For reference, check out the graph of max response times without the off-the-charts values:


Maximum response time without the off-the-charts values

Severe bugs in Persistent Connection

The release notes of MaxScale 1.3.0 hide a very important detail that is available only in the Administration Tutorial. I report it here for reference:

Please note that because persistent connections have previously been in use, they may give a different environment from a fresh connection. For example, if the previous use of the connection issued “use mydatabase” then this setting will be carried over into the reuse of the same connection. […] In exceptional cases this feature could be a problem.

If I read it correctly, this is not a feature but a series of severe bugs.

1. MaxScale doesn’t track current schema

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

I assume nobody wants this to happen: the second client believes it is connected to schema mysql, while in reality it is connected to schema test. Unless your application uses only one schema, I strongly discourage the use of persistent connections.

2. MaxScale doesn’t track charset and returns incorrect encoding

Let’s try to identify another possible problem:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" \
    mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT DATABASE(), @@session.character_set_client" \
    mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | latin1                         |
+------------+--------------------------------+

In this example the current schema, as already pointed out, cannot be trusted, but the character set is also compromised: the client requested utf8, yet its session uses latin1 carried over from a previous client. This can be a serious issue for many applications, as MaxScale ignores the charset specified by the client.

That is, MaxScale ignores both the schema name and the charset specified during the initial handshake.

3. MaxScale doesn’t track autocommit

The same applies to autocommit…

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+

Here we can see another major bug: an application could issue statements assuming autocommit=ON (the default), while in reality another client has changed it on the reused connection.

4. MaxScale doesn’t track transactions

I think this is perhaps the most serious bug in how Persistent Connections are implemented in MaxScale.

Without MaxScale, when a client disconnects, its transaction is rolled back. Let’s see what happens with MaxScale and Persistent Connections.

First, we create a transaction the way many applications do: SET autocommit=0, followed by any DML:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+

So far, we have modified autocommit in a persistent connection. Next, let’s run some SELECT statements:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM sbtest" test
ERROR 1046 (3D000) at line 1: No database selected

Oops, error… I forgot that MaxScale ignores my request for a default schema. Now I must specify it in the query itself!

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest" test
+----------+
| COUNT(*) |
+----------+
|  1000000 |
+----------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SELECT COUNT(*) FROM test.sbtest WHERE id < 1000" test
+----------+
| COUNT(*) |
+----------+
|      999 |
+----------+

All looks good so far. Let me check if there are active transactions:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
$

No active transactions, that’s good. Now, let’s run a DML statement:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "DELETE FROM test.sbtest WHERE id < 1000"
$

Let me check again if there are any active transactions:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253315, ACTIVE 29 sec
$

So that’s where the bug is! The client that issued the DML statement and started the transaction has disconnected, yet MaxScale is still holding the transaction open.

The bad news is that MaxScale doesn’t track transactions at all, no matter whether they were started implicitly (autocommit=0 followed by a DML) or with an explicit START TRANSACTION. Here’s an example (after restarting MaxScale):

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P3307 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE
---TRANSACTION 2253317, ACTIVE 2 sec

$

ProxySQL in comparison

A safe environment with Persistent Connections

By comparison, ProxySQL has a more mature implementation of Persistent Connections and keeps track of the environment set by the client, ensuring that the frontend and backend environments match.

1. ProxySQL tracks current schema

ProxySQL isn’t affected by the same bug as MaxScale. It correctly tracks the schema specified by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" test
+------------+
| DATABASE() |
+------------+
| test       |
+------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE()" mysql
+------------+
| DATABASE() |
+------------+
| mysql      |
+------------+

2. ProxySQL tracks character set

Unlike MaxScale, ProxySQL also tracks the character set specified by the client:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" \
    mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET NAMES latin1" mysql

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SELECT DATABASE(), @@session.character_set_client" \
    mysql --default-character-set=utf8
+------------+--------------------------------+
| DATABASE() | @@session.character_set_client |
+------------+--------------------------------+
| mysql      | utf8                           |
+------------+--------------------------------+

3. ProxySQL tracks autocommit

Once again, ProxySQL does not have MaxScale’s bug and correctly tracks the value of autocommit set by each client: the SET autocommit=0 issued by one client does not leak into the session of a new client, which still sees the default:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SET autocommit=0"

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW VARIABLES LIKE 'autocommit'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+

4. ProxySQL tracks transactions

In this case as well, MaxScale’s bug does not affect ProxySQL. It correctly tracks transactions, terminating them if required:

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "START TRANSACTION; SELECT 1 FROM test.sbtest LIMIT 1" test
+---+
| 1 |
+---+
| 1 |
+---+

$ mysql -u rcannao -prcannao -h 10.1.1.164 -P6033 -e "SHOW ENGINE INNODB STATUS\G" test | grep ACTIVE

$

The Bottom Line

So why is ProxySQL’s implementation of Persistent Connections better?

Since its initial implementation 2 years ago, ProxySQL has been designed to handle frontends (clients) and backends (servers) as separate entities: it links them only when needed and removes the link as soon as possible.

When a client connects to ProxySQL, no connection to any backend is established. When the client issues a request, ProxySQL determines whether the request needs a backend connection at all, and only then forwards it to a backend. As soon as the request completes, ProxySQL checks whether the connection is still required; if not, it is returned to a connection pool. In other words, the backend connection is returned to the pool not when the client disconnects, but as soon as the request completes and the connection can be reused by other clients (for example, when there is no active transaction).

Similarly, when a backend connection is linked to a client connection, ProxySQL will ensure that the environment is set correctly: schema, charset, autocommit, etc.

In other words, ProxySQL doesn’t just implement Persistent Connections, but also Connection Multiplexing. In fact, ProxySQL can handle hundreds of thousands of clients, yet forward all of their traffic through just a few connections to the backends.
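
One way to observe this multiplexing in practice (a sketch: 127.0.0.1:6032 and admin/admin are ProxySQL’s default admin interface and credentials, and the exact layout of the stats table may differ between versions) is to check the connection pool statistics while sysbench keeps hundreds of client connections open:

$ mysql -u admin -padmin -h 127.0.0.1 -P6032 \
  -e "SELECT hostgroup, srv_host, srv_port, ConnUsed, ConnFree FROM stats.stats_mysql_connection_pool"

The sum of ConnUsed and ConnFree per backend typically stays far below the number of connected clients, which is precisely the multiplexing described above.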

I invite everybody to try ProxySQL, now GA, and feel free to contact me with any questions.