RAG Tools (/mcp/rag)
See RAG Overview for an introduction to ProxySQL’s RAG capabilities, architecture, and use cases.
The tools available at the /mcp/rag endpoint support Retrieval-Augmented Generation (RAG) workflows: searching indexed documents, retrieving specific chunks and documents, and re-fetching original source data.
Search Tools
| Tool | Description |
|---|---|
| rag.search_fts | Performs a full-text search across indexed documents. |
| rag.search_vector | Executes a vector-based semantic search. |
| rag.search_hybrid | Combines full-text and vector search for more relevant results. |
rag.search_fts
Performs a full-text search across all indexed document chunks using SQLite FTS5.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | yes | The search query string. Standard FTS5 query syntax is supported (e.g., phrase matching with "...", prefix matching with *). |
| limit | int | no | Maximum number of chunks to return. Defaults to 10. |
Returns: An array of matching document chunks ranked by BM25 relevance score, each including the chunk text, source document identifier, and score.
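As a rough sketch of how a client might invoke this tool: MCP tool calls are JSON-RPC 2.0 requests with the method `tools/call`, carrying the tool name and its arguments. The helper names below (`fts5_phrase`, `build_tool_call`) are illustrative and not part of ProxySQL. How the request is transported to /mcp/rag (URL, authentication, session setup) depends on the deployment and is not shown.

```python
import json


def fts5_phrase(text: str) -> str:
    """Quote text as an FTS5 phrase; embedded double quotes are doubled."""
    return '"' + text.replace('"', '""') + '"'


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Phrase match on "connection pool" combined with a prefix match on multiplex*.
query = fts5_phrase("connection pool") + " multiplex*"
request_body = build_tool_call("rag.search_fts", {"query": query, "limit": 5})
print(request_body)
```

The same request shape applies to the other tools on this endpoint; only the `name` and `arguments` values change.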
rag.search_vector
Executes a semantic similarity search using pre-computed vector embeddings. Useful for finding conceptually related content even when exact keywords do not match.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | yes | Natural-language query. The query is embedded on the fly and compared against the stored embeddings. |
| limit | int | no | Maximum number of chunks to return. Defaults to 10. |
Returns: An array of semantically similar document chunks ordered by cosine similarity, each including the chunk text, source document identifier, and similarity score.
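To make the ordering concrete, here is a minimal sketch of ranking by cosine similarity. It is a plain-Python illustration of the metric, not ProxySQL's implementation. The function names, the toy 3-dimensional vectors, and the choice to score zero vectors as 0.0 are all assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs, limit=10):
    """Return (chunk_id, score) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy embeddings; real embedding vectors have far more dimensions.
chunks = {"c1": [1.0, 0.0, 0.0], "c2": [0.7, 0.7, 0.0], "c3": [0.0, 0.0, 1.0]}
top = rank_chunks([1.0, 0.1, 0.0], chunks, limit=2)
print(top)
```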
rag.search_hybrid
Combines full-text and vector search results using Reciprocal Rank Fusion (RRF), so that chunks ranked highly by either lexical or semantic retrieval can surface in the final list.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | yes | The search query. Applied to both FTS and vector search internally. |
| limit | int | no | Maximum number of chunks to return after re-ranking. Defaults to 10. |
Returns: An array of re-ranked document chunks with a fused relevance score.
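For readers unfamiliar with RRF, the following is a minimal sketch of the standard formulation, in which each chunk's fused score is the sum of 1 / (k + rank) over every ranked list it appears in. The constant k = 60 is the value commonly used in the literature; whether ProxySQL uses the same constant or applies any weighting is not stated here, so treat those details as assumptions. The function name and chunk IDs are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60, limit=10):
    """Fuse several ranked lists of chunk IDs into one scored list.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused[:limit]


# Hypothetical result lists from the two underlying searches.
fts_ranking = ["c1", "c2", "c3"]
vector_ranking = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([fts_ranking, vector_ranking])
print([chunk_id for chunk_id, _ in fused])
```

Because RRF uses only rank positions, it does not need BM25 scores and cosine similarities to be on a comparable scale.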
Data Retrieval Tools
| Tool | Description |
|---|---|
| rag.get_chunks | Retrieves specific chunks of text from indexed documents. |
| rag.get_docs | Fetches metadata or full content for specified documents. |
| rag.fetch_from_source | Retrieves the original source data for a given document or chunk. |
rag.get_chunks
Retrieves one or more specific chunks by their identifier, without performing a search.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| chunk_ids | array of strings | yes | One or more chunk IDs to fetch. |
Returns: An array of chunk objects, each containing the chunk text, position within the parent document, and source document metadata.
rag.get_docs
Fetches metadata or the full reconstructed content for one or more documents.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| doc_ids | array of strings | yes | One or more document IDs to retrieve. |
| include_chunks | bool | no | If true, includes all chunks for each document in the response. Defaults to false. |
Returns: An array of document objects with title, source URL, ingestion timestamp, and optionally all chunks.
rag.fetch_from_source
Re-fetches the original source content for a document (e.g., re-downloads a URL or re-reads a file), bypassing the local index. Useful for verifying that indexed content is still current.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| doc_id | string | yes | The document ID whose original source should be fetched. |
Returns: The raw source content as a string and metadata about the source location.
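One way a client might use this for a freshness check is to compare a fingerprint of the indexed text with a fingerprint of the freshly fetched source. The sketch below assumes the caller has already obtained both texts as strings; the response field names are not specified here, so none are used. The function names and the whitespace normalization step are illustrative choices, not ProxySQL behavior.

```python
import hashlib


def content_fingerprint(text: str) -> str:
    """SHA-256 of the text after collapsing runs of whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_stale(indexed_text: str, source_text: str) -> bool:
    """True when the indexed copy no longer matches the source."""
    return content_fingerprint(indexed_text) != content_fingerprint(source_text)


# Placeholder strings standing in for indexed content and re-fetched source.
indexed = "ProxySQL routes queries\nto backend servers."
current = "ProxySQL routes queries to backend servers."
print(is_stale(indexed, current))
```

Normalizing whitespace avoids flagging a document as stale when only line wrapping differs; drop that step if exact byte-level equality matters.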
Management & Monitoring
| Tool | Description |
|---|---|
| rag.admin.stats | Provides detailed statistics on RAG index usage and performance. |
rag.admin.stats
Returns operational statistics for the RAG index.
Parameters: None.
Returns: A JSON object containing: total document count, total chunk count, index size on disk, number of searches performed (broken down by FTS, vector, and hybrid), average query latency, and cache hit rate.