Skip to content
Navigation

Vector stores persist document chunks alongside their embedding vectors and support similarity search. All implementations conform to the VectorStore ABC.

python
from exo.retrieval import VectorStore, InMemoryVectorStore
from exo.retrieval.backends.chroma import ChromaVectorStore
from exo.retrieval.backends.pgvector import PgVectorStore

VectorStore

python
class VectorStore(abc.ABC)

Abstract base class for vector stores. Subclasses must implement add, search, delete, and clear.

Abstract methods

add

python
async def add(
    self,
    chunks: list[Chunk],
    embeddings: list[list[float]],
) -> None

Add chunks with their embedding vectors.

ParameterTypeDefaultDescription
chunkslist[Chunk](required)The document chunks to store
embeddingslist[list[float]](required)Corresponding embedding vectors (one per chunk)

Raises: ValueError — If the number of chunks and embeddings differ.

python
async def search(
    self,
    query_embedding: list[float],
    top_k: int = 5,
    filter: dict[str, Any] | None = None,
) -> list[RetrievalResult]

Search for the most similar chunks to a query embedding.

ParameterTypeDefaultDescription
query_embeddinglist[float](required)The query vector to compare against
top_kint5Maximum number of results to return
filterdict[str, Any] | NoneNoneOptional metadata filter (exact match on each key)

Returns: A list of RetrievalResult objects ranked by similarity (highest score first).

delete

python
async def delete(self, document_id: str) -> None

Delete all chunks belonging to a document.

ParameterTypeDefaultDescription
document_idstr(required)The ID of the document whose chunks to remove

clear

python
async def clear(self) -> None

Remove all stored chunks and embeddings.


InMemoryVectorStore

python
class InMemoryVectorStore(VectorStore)

In-memory vector store using cosine similarity. Stores chunks and embeddings in plain Python dicts. Suitable for development, testing, and small datasets.

Constructor

python
InMemoryVectorStore()

Takes no parameters.

Methods

Implements all VectorStore abstract methods. Search uses cosine similarity, metadata filters use exact key-value matching.

Example

python
from exo.retrieval import InMemoryVectorStore, Chunk

store = InMemoryVectorStore()

chunks = [
    Chunk(document_id="doc1", index=0, content="Hello world", start=0, end=11),
]
embeddings = [[0.1, 0.2, 0.3]]

await store.add(chunks, embeddings)

results = await store.search([0.1, 0.2, 0.3], top_k=3)
print(results[0].chunk.content)  # "Hello world"

await store.delete("doc1")
await store.clear()

ChromaVectorStore

python
class ChromaVectorStore(VectorStore)

ChromaDB vector store for local persistent or ephemeral vector search. Wraps the ChromaDB Collection API for similarity search using cosine distance.

Requires the chromadb package, included via the chroma extra:

bash
# From the exo-ai monorepo root or packages/exo-retrieval directory
uv sync --extra chroma

Constructor parameters

ParameterTypeDefaultDescription
collection_namestr"exo_vectors"Name of the ChromaDB collection
pathstr | NoneNoneDirectory path for persistent storage. When None, an ephemeral (in-memory) client is used
clientchromadb.ClientAPI | NoneNoneOptional pre-existing ChromaDB client instance

path and client are keyword-only.

Methods

Implements all VectorStore abstract methods. Search converts ChromaDB cosine distances to similarity scores via 1 - distance. Metadata filters are serialized and matched via ChromaDB’s $contains operator.

Example

python
from exo.retrieval.backends.chroma import ChromaVectorStore

# Ephemeral (in-memory)
store = ChromaVectorStore()

# Persistent storage
store = ChromaVectorStore(path="/tmp/chroma_data")

# Custom collection name
store = ChromaVectorStore("my_collection", path="/tmp/chroma_data")

PgVectorStore

python
class PgVectorStore(VectorStore)

PostgreSQL vector store using the pgvector extension. Uses asyncpg for async PostgreSQL access and the <=> cosine distance operator for similarity search.

Requires the asyncpg package, included via the pgvector extra:

bash
# From the exo-ai monorepo root or packages/exo-retrieval directory
uv sync --extra pgvector

Constructor parameters

ParameterTypeDefaultDescription
dsnstr""PostgreSQL connection string (e.g. postgresql://user:pass@host/db)
tablestr"exo_vectors"Name of the table to store vectors in
dimensionsint1536Dimensionality of embedding vectors
poolasyncpg.Pool | NoneNoneOptional pre-existing connection pool

table, dimensions, and pool are keyword-only.

Methods

initialize

python
async def initialize(self) -> None

Create the pgvector extension and table if they don’t exist. Call this once before using the store.

close

python
async def close(self) -> None

Close the connection pool if the store owns it (i.e. no external pool was provided).

Also implements all VectorStore abstract methods (add, search, delete, clear). Search converts pgvector cosine distance to similarity via 1 - distance. Metadata filters use JSONB $>> exact matching.

Example

python
from exo.retrieval.backends.pgvector import PgVectorStore

store = PgVectorStore(
    dsn="postgresql://user:pass@localhost/mydb",
    dimensions=1536,
)
await store.initialize()

# ... add chunks, search, etc.

await store.close()