ProxySQL’s new routing algorithm to handle hundreds of thousands of schemas/shards
ProxySQL is a service designed to scale and handle traffic in very large setups.
In the past we already showed examples of how ProxySQL can handle thousands of MySQL servers, and millions (yes, millions!) of distinct users.
Although, recently a new challenge raised: can ProxySQL perform routing based on MySQL schemaname for a vary large number of schemas?
We know that in production setups ProxySQL is already performing routing to MySQL servers based on schemaname for few hundreds schemas, but can it handle thousands or more?
Technically there is no limit on how many shards ProxySQL can handle, as long as there are rules for them.
Tibor Korocz already pointed in a blog post that more query rules can affect performance.
In case of query routing based on schemaname
no regular expressions are involved (therefore each rule is not expensive), but the same principle applies: the more rules need to be evaluated, longer it takes to make a decision (in this case to perform the routing).
Scope of this blog post is to understand the performance implication of performing routing based on the number of rules.
To simplify the setup to run some benchmark, we used only 1 MySQL server as backend, and we created 50000 schemas:
for i in `seq 10001 60000` ; do
echo "CREATE DATABASE IF NOT EXISTS shard_$i;"
done | mysql -u root
To run the benchmark we used a various number of rules.
For every test, we configured ProxySQL setting the right number of rules: for 100 schemas we created 100 rules, for 1000 schema 1000 rules, etc.
For example, for 200 rules we used this configuration:
(
echo "DELETE FROM mysql_query_rules; INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 10200` ; do
echo "INSERT INTO mysql_query_rules (active,username,schemaname,destination_hostgroup,apply) VALUES (1,\"sbtest\",\"shard_$i\",1,1);"
done
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
We then ran queries as the following:
for j in `seq 1 50000` ; do
echo "USE shard_$(($RANDOM%200+10001))" ; echo "SELECT 1;"
done | mysql -u sbtest -psbtest -h 127.0.0.1 -P6033 --ssl-mode=disabled -NB > /dev/null
It is relevant to note that we aren’t interested in the total execution time (this is why we used this simple one-liner to generate traffic), but in the time spent inside the Query Processor.
Because the execution time of the queries is not relevant, we also configured the query cache.
To compute the time spent in the Query Processor we set variable mysql-stats_time_query_processor='true'
and computed the value of status variable Query_Processor_time_nsec
. This status variable measures the time spent inside the Query Processor, in nanoseconds.
As we expected, the more query rules ProxySQL needs to evaluate, the longer it takes to compute the destination hostgroup for that schemaname. This results in latency before executing the query:
The graph above shows that for 10 rules the average latency is around 1us (microsecond), for 100 rules the average latency is 2us , and that it gradually grows to 6us for 500 rules and 256us for 20000 rules.
6us average latency for 500 rules doesn’t seem to be an issue, but 256us (0.256ms) average latency for 20000 does seem like an issue.
What is also important to note is that these values are the average values for all the shards. Routing decision for some shards are faster than others, depending in which order the rules are written.
In fact, with 20000 rules the average latency for shard_10001
is 0.89us , while the average latency for shard_30000
is 581.1us !
Surely, this doesn’t seem a scalable solution.
MySQL Query Rules Fast Routing
ProxySQL 1.4.7 introduces a new routing capability that enhances what already configurable in mysql_query_rules
.
The new capability is configurable using table mysql_query_rules_fast_routing
. This table is simpler compared to mysql_query_rules
, and its purpose is simple: given username
, schemaname
and flagIN
, finds the destination_hostgroup
.
For reference, this is the table definition of mysql_query_rules_fast_routing
:
CREATE TABLE mysql_query_rules_fast_routing (
username VARCHAR NOT NULL,
schemaname VARCHAR NOT NULL,
flagIN INT NOT NULL DEFAULT 0,
destination_hostgroup INT CHECK (destination_hostgroup >= 0) NOT NULL,
comment VARCHAR NOT NULL,
PRIMARY KEY (username, schemaname, flagIN) )
Table mysql_query_rules_fast_routing
should be considered as an extension of mysql_query_rules
. After processing the rules in mysql_query_rules
, the rules in mysql_query_rules_fast_routing
will be evaluated, unless the last matching rule defined in mysql_query_rules
sets apply=1
.
Because of the nature of the rules in mysql_query_rules_fast_routing
, one and only one rule will be evaluated.
The rules defined in mysql_query_rules_fast_routing
are loaded in a hash table, where the key is username
, schemaname
and FlagIN
, and the value is the destination_hostgroup
.
New routing algorithm in action
This is the command we used to configure mysql_query_rules_fast_routing
with a different number of rules, up to 50000 rules in this example:
(
echo "DELETE FROM mysql_query_rules; DELETE FROM mysql_query_rules_fast_routing;"
echo "INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 60000` ; do
echo "INSERT INTO mysql_query_rules_fast_routing (username,schemaname,flagIN,destination_hostgroup,comment) VALUES (\"sbtest\",\"shard_$i\",0,1,\"\");"
done
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
The results are really interesting:
In other words:
- between 10 rules to 10k rules, the average latency is between 1.7us to 1.9us
- at 20k rules the average latency is 2.2us
- at 50k rules the average latency is 2.4us
What is really interesting is that this new routing algorithm allows ProxySQL to perform schema routing for up to 10000 shards introducing less latency the old routing algorithm was introducing for 100 shards, and it scales very well to 50k rules wth very small overhead compared to 100 shards in the old routing algorithm.
Disabling mysql-stats_time_query_processor by default
In ProxySQL 1.4.4 the new variable mysql-stats_time_query_processor
was introduced to dynamically enable or disable measuring the time spent in the Query Processor. Although timing is very important to understand the impact of having too many rules, measuring time elapse with good precision has some performance impact.
In fact, on this same server where we ran these benchmark, measuring time elapse with very good precision has a cost (latency) of 0.3us , mostly spent in kernel space (system calls).
That also means that from the metrics above you should remove 0.3us to determine the latency when measuring time is not enabled. For example, “at 50k rules the average latency is 2.4us” should become “at 50k rules the average latency is 2.1us” .
For this reason, since ProxySQL 1.4.4 , mysql-stats_time_query_processor
is disabled by default.
Why flagIN is relevant?
The new routing algorithm performs routing based on username
, schemaname
, and flagIN
.
I think some are asking why flagIN
is relevant.
flagIN
in mysql_query_rules_fast_routing
allows to set flagOUT
in mysql_query_rules
based on other criteria later relevant for routing.
For example, assume that you don’t want to only perform routing based on username
+schemaname
, but also read/write split.
You can use mysql_query_rules
to:
- set
flagOUT = 1
for all queries to be sent to the server that is the master, no matter in which cluster - set
flagOUT = 2
for all queries to be sent to the server that is the slave, no matter in which cluster
The value of flagOUT
resulting from mysql_query_rules
will become the flagIN
in mysql_query_rules_fast_routing
. This will allow to combine read/write split to routing based on username
+schemaname
.
For benchmark purpose, let’s create 200.000 rules:
(
echo "DELETE FROM mysql_query_rules; DELETE FROM mysql_query_rules_fast_routing; INSERT INTO mysql_query_rules (active,username,cache_ttl) VALUES (1,\"sbtest\",120000);"
for i in `seq 10001 60000` ; do
echo "INSERT INTO mysql_query_rules_fast_routing (username,schemaname,flagIN,destination_hostgroup,comment) VALUES (\"sbtest\",\"shard_$i\",0,1,\"\");"
done
echo "INSERT INTO mysql_query_rules_fast_routing SELECT username, schemaname, FlagIN+1, destination_hostgroup, comment FROM mysql_query_rules_fast_routing;"
echo "INSERT INTO mysql_query_rules_fast_routing SELECT username, schemaname, FlagIN+2, destination_hostgroup, comment FROM mysql_query_rules_fast_routing;"
echo "LOAD MYSQL QUERY RULES TO RUNTIME;"
) | mysql -u admin -padmin -h 127.0.0.1 -P6032
mysql -u admin -padmin -h 127.0.0.1 -P6032 -e "SELECT COUNT(*) FROM mysql_query_rules_fast_routing"
+----------+
| COUNT(*) |
+----------+
| 200000 |
+----------+
Updating the graph after testing routing based on 200k rules:
As you can see from the graph above, ProxySQL is now able to perform routing decision based on 200k rules with almost no extra impact compared to 100 rules in the legacy algorithm.
Drawback of the new algorithm
This new routing algorithm seems to have a minimum overhead of 1.7us (or 1.4us with mysql-stats_time_query_processor
disabled).
The minimum overhead is not present in the old/legacy algorithm, therefore it makes sense to use the new routing algorithm only if you have more than 100 routing rules.
Furthermore, ProxySQL Cluster capability doesn’t support mysql_query_rules_fast_routing
yet.
As of v2.0.0 mysql_query_rules_fast_routing
is supported also in ProxySQL Cluster mode.
Memory usage
To reduce contention between worker threads in ProxSQL, threads maintain their own copy of query rules. This means that the more rules you have and the more threads you have configured, the more memory is needed. In this example, loading 200k rules and using 4 threads lead to a memory usage of 280MB.
In future we plan to make configurable if, for query processing, ProxySQL should use more memory and be lock free, or use less memory and not be lock free, or a mix of the two options:
mysql_query_rules
to be lock freemysql_query_rules_fast_routing
to be shared between threads
These are implementation details we will be looking into when planning to optimize ProxySQL even further.
Conclusion
ProxySQL 1.4.7 introduces a new routing algorithm that doesn’t replace the legacy algorithm, but enhances it.
Thank to the new routing algorithm, now ProxySQL can easily handle routing based on hundreds of thousands of schemas/shards with almost no impact (few microseconds).