PostgreSQL Extended Query Architecture

This page describes how ProxySQL manages PostgreSQL prepared statements internally. It covers the core challenge that makes transparent proxying non-trivial, the two-tier identifier mapping that solves it, global deduplication via Parse messages, and reference counting. All content on this page is specific to the PostgreSQL Extended Query Protocol as supported in ProxySQL v3.0.3+.

The Proxy Challenge

PostgreSQL prepared statements are stateful: a client sends a Parse message with a statement name and query text, then references that name in subsequent Bind, Execute, Describe, and Close messages. A simple pass-through proxy cannot work here for several interconnected reasons.

Independent statement name namespaces

Unlike MySQL’s numeric statement IDs, PostgreSQL uses string-based statement names chosen by the client (or an empty string for unnamed statements). Every party in the chain — the client, ProxySQL, and each backend server — operates its own independent namespace. A name that is meaningful on one backend connection has no meaning on any other. Two backend connections that have both prepared the same query will each have their own locally scoped name for it.

Connection pooling breaks direct forwarding

ProxySQL multiplexes many client connections onto a shared pool of backend connections. A client that sends a Parse message targeting one backend connection may later send a Bind message that ProxySQL routes to a completely different backend — one that has never processed that Parse. If ProxySQL simply forwarded the client’s statement name unmodified, the backend would return an error because the name is unknown on that connection.

Deduplication opportunity

In workloads where many clients connect to the same PostgreSQL database, the same query text is frequently prepared by multiple sessions independently. Without central deduplication, each client connection would trigger a separate Parse on one or more backends, multiplying parse overhead on the database servers. A shared global registry lets ProxySQL recognise identical statements and serve cached metadata without touching any backend.

Transparent proxying requirement

The client must not need to know it is talking to a proxy. It sends standard PostgreSQL protocol messages and receives standard responses — ParseComplete, BindComplete, ReadyForQuery. All name translation and lazy re-preparation happen invisibly inside ProxySQL.

Two-Tier ID Mapping

ProxySQL resolves these challenges through a two-tier mapping that introduces three distinct levels of identifier.

Client-side statement names

Each client connection has its own independent namespace. When a client sends a Parse message, it chooses a statement name (or uses the empty string for an unnamed statement). ProxySQL records a mapping from that client-supplied name to a process-wide global ID. Two different clients may use the same statement name to refer to entirely different underlying queries; the mapping is per-connection and there is no collision.

Global IDs

A global ID is a process-wide numeric identifier representing one unique statement. There is one global ID per unique statement, where uniqueness is determined by a content hash (described in the next section). Global IDs are the pivot point of the two-tier mapping: client-side names map to global IDs, and global IDs map to backend-side names.

The global registry is the authoritative source of statement metadata — the SQL text and protocol-level parameter type information needed to prepare the statement on a backend that has not yet seen it.

Backend-side statement names

Each backend connection maintains its own namespace. When ProxySQL prepares a statement on a backend connection, it assigns the statement a name of the form proxysql_ps_{n} — where n is unique within that connection but may repeat across different connections. ProxySQL records the mapping from global ID to proxysql_ps_{n} for each backend connection independently, so the same global statement can exist on multiple backend connections simultaneously under different local names.

What this mapping enables

Capability	How the mapping provides it
Connection pooling	A client’s global ID can be translated to whichever backend connection handles the next Bind/Execute, each with its own `proxysql_ps_{n}` name
Deduplication	Many client-side names from many clients all resolve to a single global ID when the underlying query is the same
Lazy preparation	A statement can exist in the global registry before it has been prepared on a given backend; ProxySQL sends a Parse on demand when an Execute is routed there
Transparent proxying	Clients and backends each see only their own identifiers; all translation is internal

Global Deduplication

When a client sends a Parse message, ProxySQL computes a content hash before doing anything else. The hash is derived from:

The SQL query text
The username of the connected client
The current schema (database)
The protocol-level parameter types declared in the Parse message

If the computed hash matches an entry already in the global registry, ProxySQL reuses that entry and returns the cached metadata to the client without sending a new Parse to any backend. The client that triggered the first prepare for a given hash is the one whose request actually travels to the backend; every subsequent client with the same hash receives a ParseComplete response sourced entirely from the cache.

Why user and schema are included

Including the username and schema ensures that two clients who use the same SQL text but connect under different security contexts or to different databases do not share a global entry. A query that is identical in text may resolve table references differently depending on the search_path, or may be subject to different access controls. Treating such cases as separate statements avoids incorrect result sharing.

Why parameter types are included

The PostgreSQL Extended Query Protocol allows clients to declare the data types of bind parameters explicitly in the Parse message. The same SQL text prepared with int4 parameters and again with text parameters will produce statements with different type-checking semantics and potentially different execution plans on the backend. Including parameter types in the hash ensures these are treated as separate global entries.

This means that two clients sending identical SQL with different declared parameter types will each get their own global entry, and the backend will be asked to prepare each independently.

Reference Counting

Each entry in the global registry maintains two independent reference counters.

Client reference count

The client reference count tracks how many client-side name mappings currently point to this global entry — in other words, how many client connections consider this statement open. When a client sends a Close message targeting a statement name, the per-connection client map entry is removed and the client reference count on the corresponding global entry is decremented.

Backend reference count

The backend reference count tracks how many backend connections have this statement prepared — that is, how many entries exist in backend connection maps pointing to this global ID. When a backend connection is destroyed for any reason (idle timeout, server failure, normal shutdown), all of its backend-side mappings are removed and the backend reference count for each referenced global entry is decremented.

Eviction eligibility

A global entry is eligible for eviction only when both reference counts have reached zero. An entry with active client references cannot be evicted because live client connections still expect to be able to execute it. An entry with active backend references cannot be evicted because those backend connections have allocated resources for it. Only when no client holds the statement open and no backend has it prepared is it safe to remove the global entry entirely.

Eviction is performed by a background purge cycle rather than inline with connection teardown, so an entry that becomes eligible may persist for a short period before being reclaimed. See PostgreSQL Connection Pooling, Multiplexing & Caching for details on eviction heuristics and cache size controls.