MySQL Prepared Statements Architecture

This page describes how ProxySQL manages MySQL prepared statements internally. It covers the core challenge that makes transparent proxying of prepared statements non-trivial, the two-tier identifier mapping that solves it, global deduplication, and reference counting.

The Proxy Challenge

Prepared statements present a unique problem for a proxy because they are stateful: a client prepares a statement once and receives a numeric identifier, then references that identifier in subsequent execute and close messages. A simple pass-through proxy cannot work here for several interconnected reasons.

Independent ID namespaces

Every party in the chain — the client, ProxySQL, and each backend MySQL server — assigns its own numeric identifier to a prepared statement. The identifier a MySQL server assigns when it processes COM_STMT_PREPARE has meaning only on that specific backend connection. If the same query is prepared on two different backend connections, each connection will assign a completely independent numeric ID. A client’s identifier has no meaning to a backend, and vice versa.

Connection pooling breaks direct forwarding

ProxySQL multiplexes many client connections onto a shared pool of backend connections. A client that prepares a statement on one backend connection may later execute that statement on a different backend connection — one that has never seen the prepare message. If ProxySQL simply forwarded the client’s numeric identifier to whichever backend is available, execution would fail or produce incorrect results.

Deduplication opportunity

In practice, many clients connecting to the same database will prepare the same queries. Without deduplication, each client connection would independently prepare the same statement on each backend, multiplying work on the MySQL servers. A central registry allows ProxySQL to recognise when a new prepare request is identical to one already known, and reuse the existing metadata.

Transparent proxying requirement

The client should not need to know it is talking to a proxy. It sends standard MySQL protocol messages and receives standard responses. All identifier translation and lazy re-preparation on backends must happen invisibly inside ProxySQL.

Two-Tier ID Mapping

ProxySQL resolves these challenges through a two-tier mapping that introduces three distinct levels of identifier.

Client-side IDs

Each client connection has its own independent namespace. When a client prepares a statement, ProxySQL assigns a numeric identifier that is meaningful only within that client’s session. Two different clients may hold identifiers with the same numeric value that refer to entirely different statements, or to the same underlying statement — the client cannot tell the difference.

Global IDs

A global identifier is process-wide and represents a unique statement. There is one global ID per unique statement, where uniqueness is determined by a content hash (described in the next section). Global IDs are the pivot point of the mapping: client-side IDs map to global IDs, and global IDs map to backend-side IDs.

The global registry is the source of truth for statement metadata — the SQL text and the protocol-level parameter information needed to prepare the statement on a backend that has not yet seen it.

Backend-side IDs

Each backend connection maintains its own numeric namespace. When ProxySQL prepares a statement on a backend by sending COM_STMT_PREPARE, the MySQL server assigns a numeric statement ID scoped to that connection. ProxySQL records the mapping from the global ID to the backend-specific numeric ID for each backend connection independently.

What this mapping enables

Capability	How the mapping provides it
Connection pooling	A client’s global ID can be translated to whichever backend connection handles the next execute, each with its own backend numeric ID
Deduplication	Many client-side IDs from many clients all resolve to a single global ID when the underlying query is the same
Lazy preparation	A statement can exist in the global registry before it has been prepared on a given backend; ProxySQL sends `COM_STMT_PREPARE` on demand when an execute is routed there
Transparent proxying	Clients and backends each see only their own numeric identifiers; all translation is internal

Global Deduplication

When a client prepares a statement, ProxySQL computes a content hash before doing anything else. The hash is derived from:

The SQL query text
The username of the connected client
The current schema (database)

If the computed hash matches an entry already in the global registry, ProxySQL reuses that entry and returns the cached metadata to the client without sending a new COM_STMT_PREPARE to any backend. The client that triggered the first prepare for a given hash is the one whose request actually travels to the backend; every subsequent client with the same hash benefits from the already-prepared statement.

Why user and schema are included

Including the username and schema in the hash ensures that two clients who happen to use the same SQL text but connect under different security contexts or to different databases do not share a global entry. A query that is identical in text but runs in a different schema might resolve table references differently, or be subject to different access controls. Treating it as a separate statement avoids incorrect result sharing.

Reference Counting

Each entry in the global registry maintains two independent reference counters.

Client reference count

The client reference count tracks how many client-side IDs currently map to this global entry — in other words, how many client connections consider this statement “open.” When a client sends COM_STMT_CLOSE for a prepared statement, its client-side mapping is removed and the client reference count for the corresponding global entry is decremented.

Backend reference count

The backend reference count tracks how many backend connections have this statement prepared — that is, how many entries exist in backend connection maps pointing to this global ID. When a backend connection is destroyed (for any reason: idle timeout, server failure, or normal shutdown), all of its backend-side mappings are removed and the backend reference count for each referenced global entry is decremented.

Eviction eligibility

A global entry is eligible for eviction only when both reference counts have reached zero. An entry with active client references cannot be evicted because live client connections still expect to be able to execute it. An entry with active backend references cannot be evicted because those backend connections have resources allocated for it. Only when no client holds the statement open and no backend has it prepared is it safe to remove the global entry entirely.

Routing Note (v2.1.0+)

Before v2.1.0, routing metadata for a prepared statement was cached at prepare time and reused at execute time. This caused a statement to always execute in the hostgroup where it was originally prepared, which could produce unexpected behaviour — for example, a statement prepared against a reader hostgroup would continue to execute there even when called inside a write transaction.

Since v2.1.0, ProxySQL no longer caches routing information at prepare time. Instead, query rules are re-evaluated at COM_STMT_EXECUTE time using the statement’s SQL text and the current session context. This ensures that routing decisions reflect the actual state of the session at execution time rather than the state at preparation time.

This behaviour was further refined in v2.2.0 to correctly handle query annotations such as hostgroup hints: the first_comment of a prepared statement is now included when computing routing for COM_STMT_EXECUTE, so statements that carry an explicit hostgroup annotation are routed to the requested hostgroup during execution as well as during preparation.