API reference

Everything on this page is part of Khora’s stable public API, pinned by __all__ in khora/__init__.py. Additive changes land in minor releases; breaking changes require a major bump. Private imports (khora.engines.*, khora.query.engine, khora.pipelines.flows) are not stable.

from khora import (
    Khora, KhoraConfig, KhoraError, SearchMode,
    RememberResult, RecallResult, BatchResult, BatchHandle, DocumentResult, Stats, LLMUsage,
    DocumentSource, EventType, SemanticFilter, context_text,
    create_engine, list_engines, register_engine,
    ExpertiseConfig, EntityTypeConfig, RelationshipTypeConfig,
)

`Khora`

The primary facade. Delegates to a pluggable engine (default vectorcypher).

Khora(
    database_url: str | KhoraConfig | None = None,
    *,
    engine: str = "vectorcypher",
    graph_url: str | None = None,
    embedding_model: str = "text-embedding-3-small",
    engine_kwargs: dict[str, Any] | None = None,
    run_migrations: bool = False,
)

Pass a PostgreSQL URL, a full KhoraConfig, or nothing; with no argument, Khora reads KHORA_DATABASE_URL / KHORA_NEO4J_URL. run_migrations=True runs Alembic under an advisory lock on connect. Credential fields are pydantic.SecretStr: they render as '**********', and .get_secret_value() returns the underlying value.

async with Khora(...) as kb:      # connect() / disconnect() automatic
    ...

# or manually:
kb = Khora(...)
await kb.connect()
try:
    ...
finally:
    await kb.disconnect()

await Khora.shared() returns a process-wide, already-connected singleton (created and connected on first call, reused after). Callers must not disconnect() it: its lifetime is the process. await Khora.shared.clear() drops the cached instance and is test-only.

Namespaces

ns     = await kb.create_namespace(*, config_overrides=None, metadata=None)   # MemoryNamespace
ns     = await kb.get_namespace(namespace_id: UUID)                     # | None
ns     = await kb.get_namespace_by_stable_id(namespace_id: str | UUID)  # stable-id lookup
result = await kb.delete_namespace(namespace: str | UUID)               # NamespaceDeletionResult

create_namespace is keyword-only and takes no positional name. Use ns.namespace_id (the stable public id) everywhere below, not the row-level ns.id. See Namespaces & isolation.

delete_namespace is the inverse: it cascade-removes the namespace’s documents, chunks, entities, relationships, graph nodes, and vector rows from every backend, then drops the namespace row(s) and frees the storage. namespace can be the stable namespace_id or any version’s row id; either way, all versions under that stable id are deleted together, whether the namespace is active or deactivated. Afterwards the namespace no longer lists or resolves, and recall / stats / forget on it raise ValueError.

A cross-backend delete can partially fail (for example, when the graph backend is unreachable). Rather than raising, delete_namespace records the failing backend as a degradation on the result and still runs the remaining backends, so check result.partial_failure to catch a half-deleted namespace. The call is safe to re-run.
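Because a partial failure is reported on the result rather than raised, the caller has to check for it. The following is a minimal sketch of a retry wrapper. It assumes only what is stated above: delete_namespace is safe to re-run, and its result exposes partial_failure. The helper name delete_namespace_fully and the attempts / delay_s knobs are illustrative and not part of Khora.

```python
import asyncio
from typing import Any


async def delete_namespace_fully(
    kb: Any, namespace: Any, *, attempts: int = 3, delay_s: float = 1.0
) -> Any:
    """Re-run kb.delete_namespace until no backend reports a partial failure.

    Hypothetical helper: `attempts` and `delay_s` are illustrative knobs.
    Returns the last NamespaceDeletionResult, which may still be partial.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    result = None
    for attempt in range(attempts):
        result = await kb.delete_namespace(namespace)
        if not result.partial_failure:
            return result
        # Wait before retrying, but not after the final attempt.
        if attempt < attempts - 1:
            await asyncio.sleep(delay_s)
    return result
```

If the returned result still has partial_failure set, result.degradations is the place to look for which backend kept failing.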

Writing

`remember`

result: RememberResult = await kb.remember(
    content: str,
    *,
    namespace: str | UUID,
    title: str = "", source: str = "", source_type: str = "library",
    source_name: str | None = None, source_url: str | None = None,
    source_timestamp: datetime | None = None,   # event time → occurred_at, feeds recency
    metadata: dict | None = None,
    entity_types: list[str],            # required
    relationship_types: list[str],      # required
    expertise: ExpertiseConfig | str | None = None,   # config, or a registered name / YAML path
    chunk_strategy: ChunkStrategy | None = None,   # "fixed" | "semantic" | "recursive" | "conversation"
    chunk_size: int | None = None,      # None → KHORA_PIPELINES_CHUNK_SIZE
    extraction_config_hash: str | None = None,   # cache key to reuse a prior extraction config
    external_id: str | None = None,     # None or non-blank ≤512 chars
    session_id: UUID | None = None,
)

Ingests through the three-phase pipeline (see Ingestion). metadata and the provenance kwargs (source_name, source_type, …) are denormalized onto chunks and become filterable at recall (see Recall filters). session_id propagates to the document and its chunks for session-scoped recall and forget_session.

`remember_batch`

result: BatchResult = await kb.remember_batch(
    documents: list[dict],
    *,
    namespace: str | UUID,
    max_concurrent: int = 10,
    deduplicate: bool = True,
    infer_relationships: bool = True,
    on_progress: Callable[[int, int], None] | None = None,
    entity_types: list[str], relationship_types: list[str],
    expertise: ExpertiseConfig | str | None = None,
    chunk_strategy: ChunkStrategy | None = None,
    chunk_size: int | None = None,
    extraction_batch_size: int | None = None,   # chunks per extraction LLM call
    extraction_max_tokens: int | None = None,    # cap output tokens per extraction call
    # ... same provenance kwargs as remember() (source_timestamp, extraction_config_hash, ...)
)

Concurrent ingestion with cross-document dedup. Each dict accepts the same per-document fields as remember(), and per-doc values override the top-level kwargs.
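The override rule can be pictured as a dictionary merge. The sketch below only illustrates the stated precedence (per-document values win over top-level kwargs); it is not Khora's implementation. The helper name effective_fields and the sample values ("ticket", "wiki") are made up for the example.

```python
from typing import Any


def effective_fields(doc: dict[str, Any], **batch_kwargs: Any) -> dict[str, Any]:
    """Return the fields that apply to one document.

    In a dict merge the later mapping wins, so per-document values
    override the batch-level defaults.
    """
    return {**batch_kwargs, **doc}


documents = [
    {"content": "Q3 planning notes", "title": "Planning"},
    {"content": "Incident report", "source_type": "ticket"},  # per-doc override
]
resolved = [
    effective_fields(d, source_type="library", source_name="wiki") for d in documents
]
```

Here the first document inherits source_type="library" from the batch-level kwargs, while the second keeps its own source_type.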

`submit_batch`

handle: BatchHandle = await kb.submit_batch(
    documents: list[dict],
    *,
    on_result: Callable[[int, int, DocumentResult], None],
    namespace: str | UUID,
    entity_types: list[str], relationship_types: list[str],
    expertise: ExpertiseConfig | None = None,
    chunk_strategy: ChunkStrategy | None = None,
    max_chunks_in_flight: int | None = None,   # cap chunks buffered across the batch
    extraction_config_hash: str | None = None,
    max_concurrent: int = 20,
    reprocess_archived: bool = False,
    session_id: UUID | None = None,
    # ... shared provenance kwargs (source_timestamp, source_name, ...)
)

Deferred ingestion: stages every doc as PENDING and returns immediately. It requires a running pending processor: call kb.start_pending_processor() after connect(), or submit_batch raises. await handle.wait() blocks until every document’s on_result has fired.
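Per-document outcomes surface through on_result, so a small collector is often handy. This sketch uses only the DocumentResult fields listed under Result types (success, error, external_id, document_id). The meaning of the two int arguments is not specified on this page, so the collector accepts them without interpreting them. ResultCollector is a hypothetical helper, not a Khora class.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ResultCollector:
    """Collects per-document outcomes from an on_result callback (hypothetical)."""

    succeeded: list[Any] = field(default_factory=list)
    failed: dict[Any, str] = field(default_factory=dict)

    def __call__(self, first: int, second: int, result: Any) -> None:
        # The two ints are accepted but unused; this page does not define them.
        # Prefer the caller's external_id so failures map back to source rows.
        key = result.external_id if result.external_id is not None else result.document_id
        if result.success:
            self.succeeded.append(key)
        else:
            self.failed[key] = str(result.error)
```

Pass an instance as on_result= and inspect collector.failed once await handle.wait() has returned.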

Reading

`recall`

result: RecallResult = await kb.recall(
    query: str,
    *,
    namespace: str | UUID,
    limit: int = 10,
    mode: SearchMode = SearchMode.HYBRID,
    min_similarity: float = 0.0,
    start_time: datetime | None = None,
    end_time: datetime | None = None,
    filter: RecallFilter | dict[str, Any] | None = None,
)

filter= is a deterministic RecallFilter (or its dict form) applied as a hard predicate alongside the ranking. start_time / end_time are deprecated in favor of filter={"occurred_at": {...}} and cannot be combined with it. Fusion weights, reranking, HyDE, and recency are global (KhoraConfig.query / KHORA_QUERY_*). There’s no config= kwarg. See Retrieval and Recall filters.

`context_text`

from khora import context_text
text: str = context_text(result: RecallResult, *, max_chunks: int = 5)

Renders a RecallResult into a flat LLM-context string (chunks grouped by document title, then --- Entities --- and --- Relationships --- sections).

Entity & document reads

entity    = await kb.get_entity(entity_id, namespace=ns.namespace_id, include_sources=False)  # Entity | None
document  = await kb.get_document(document_id, namespace=ns.namespace_id)     # Document | None
entities  = await kb.list_entities(namespace=ns.namespace_id, entity_type=None, limit=100, include_sources=False)
documents = await kb.list_documents(namespace=ns.namespace_id, limit=100)     # list[Document]
matches   = await kb.search_entities(query, namespace=ns.namespace_id, limit=10, include_sources=False)  # list[Entity]
related   = await kb.find_related_entities(entity_id, namespace=ns.namespace_id, max_depth=2, limit=20, include_sources=False)
stats     = await kb.stats(namespace=ns.namespace_id)                        # Stats

namespace is required on these (accepts str | UUID). Cross-namespace ids resolve to None / empty rather than leaking the foreign row. The isolation contract holds at every layer (see Namespaces & isolation). search_entities ranks entities by embedding similarity to query. On the reads that take include_sources, pass True to populate each entity’s source-document metadata (an extra lookup, off by default).

Community reads

communities = await kb.get_communities(namespace=ns.namespace_id, limit=100, offset=0)
communities = await kb.get_entity_communities(entity_ids, namespace=ns.namespace_id)

Both return list[CommunityNode] (a summary string plus member_ids), the community summaries the dream phase materializes into the graph. get_communities lists a namespace’s communities; get_entity_communities returns the ones a given set of entities belong to. Read-only, and empty on a stack without a graph backend or without materialized communities.

Deleting

removed: bool = await kb.forget(document_id: UUID, *, namespace: str | UUID)
deleted: int  = await kb.forget_session(namespace_id: UUID, session_id: UUID)

forget_session cascade-deletes every document tagged with session_id (chunks via FK cascade, graph cleanup via the engine). For TTL cleanup, the opt-in helper khora.gc.expire_sessions(*, kb, before, namespace_id=None) calls forget_session for each session whose newest document predates before. Khora runs no scheduler. Call it from your own loop.
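Since Khora runs no scheduler, TTL cleanup has to be driven by the host application. Below is a minimal sketch of such a loop. It assumes that expire_sessions is awaitable (it wraps the async forget_session) and that before accepts a timezone-aware datetime; neither detail is confirmed on this page. The function is injected rather than imported so the loop can be exercised without a database, and run_session_gc, ttl, interval_s, and max_runs are illustrative names.

```python
import asyncio
from datetime import datetime, timedelta, timezone
from typing import Any, Awaitable, Callable, Optional


async def run_session_gc(
    expire_sessions: Callable[..., Awaitable[Any]],
    kb: Any,
    *,
    ttl: timedelta,
    interval_s: float,
    max_runs: Optional[int] = None,
) -> int:
    """Call expire_sessions(kb=..., before=now - ttl) every interval_s seconds.

    Returns the number of sweeps performed. With max_runs=None the loop
    runs until the surrounding task is cancelled.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        cutoff = datetime.now(timezone.utc) - ttl
        await expire_sessions(kb=kb, before=cutoff)
        runs += 1
        # Sleep between sweeps, but not after the last one.
        if max_runs is None or runs < max_runs:
            await asyncio.sleep(interval_s)
    return runs
```

In a service this would typically run as a background task, with khora.gc.expire_sessions passed as the first argument.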

Background processing & health

kb.start_pending_processor()          # start the PENDING-document worker (idempotent)
await kb.stop_pending_processor()     # cancel the worker; restartable
status = await kb.health_check()      # component-health dict

submit_batch needs the pending processor running: call start_pending_processor() after connect() on services that write documents (read-only services skip it), and stop_pending_processor() to cancel the background worker (it can be restarted). health_check() returns a per-component health dict, or {"status": "disconnected"} before connect().
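Pairing the two calls in a context manager ties the worker's lifetime to the service's. The sketch below uses contextlib.asynccontextmanager and relies on the call shapes shown above: start_pending_processor() is a plain call and stop_pending_processor() is awaited. The pending_processor helper is hypothetical and assumes kb is already connected.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator


@asynccontextmanager
async def pending_processor(kb: Any) -> AsyncIterator[Any]:
    """Run the PENDING-document worker for the duration of the block."""
    kb.start_pending_processor()  # documented as idempotent
    try:
        yield kb
    finally:
        # Always cancel the worker, even if the body raised.
        await kb.stop_pending_processor()
```

Inside the block, submit_batch can be called safely; on exit the worker is cancelled and can be restarted later.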

Result types

All result types are frozen, slotted dataclasses.

RememberResult: document_id, namespace_id, chunks_created, entities_extracted, relationships_created, relationships_skipped (un-remappable edges the ingest pipeline dropped; always 0 outside the shared pipeline), metadata, llm_usage.

BatchResult: total / processed / skipped / failed, chunks / entities / relationships, metadata, llm_usage, per_document (one entry per submitted document in input order, mapping each back to its stored document_id; populated by VectorCypher, may be empty on other engines).

BatchHandle: batch_id, total, the read-only properties completed / failed / is_done, and await handle.wait().

DocumentResult (the per-doc on_result payload): document_id, namespace_id, success, error, per-doc counts, llm_usage, skipped, external_id (the caller’s Document.external_id, for mapping a result back to its source row).

Stats: documents / chunks / entities / relationships, last_activity_at.

NamespaceDeletionResult (from delete_namespace): namespace_id, removed_row_ids, the removed counts namespaces_removed / documents_removed / chunks_removed / vector_rows_removed / graph_nodes_removed, degradations, and the partial_failure property (True when any backend purge failed).

CommunityNode (from get_communities / get_entity_communities), a materialized dream community: id, namespace_id, summary, member_ids, summary_depth, and an optional embedding.

UsageSummary (from khora import UsageSummary), an aggregate over a list of LLMUsage: total_prompt_tokens, total_completion_tokens, total_tokens, total_cost_usd, total_latency_ms, and the by_operation / by_model breakdowns. Build one with UsageSummary.from_usage(result.llm_usage).

`RecallResult`

Field	Type	Notes
`query`	`str`	The original query
`namespace_id`	`UUID`	Namespace searched
`chunks`	`list[RecallChunk]`	Scored chunks (`score` is a typed field)
`entities`	`list[RecallEntity]`	Scored entities with provenance ids
`relationships`	`list[RecallRelationship]`	Connections between entities from VectorCypher’s graph traversal
`documents`	`list[DocumentProjection]`	Deduplicated source docs
`engine_info`	`dict`	Engine telemetry: always carries `"engine"`, plus `max_raw_vector_score`. On a filtered recall it also carries `"filter"`, a pushdown report.
`llm_usage`	`list[LLMUsage]`	Token usage incurred during the recall.
`communities`	`list[CommunityNode]`	Dream-phase community summaries the matched entities belong to. Empty on stacks without materialized communities.

Producer invariant: every chunks[i].document_id and every id in entities[i].source_document_ids / relationships[i].source_document_ids appears in documents[]. RecallChunk carries id, document_id, content, score, created_at, occurred_at, connected_entity_ids, chunker_info.
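The invariant can be asserted in tests or at a service boundary. This is a sketch that assumes each entry in documents exposes its identifier as .id; the DocumentProjection field name is not given on this page, so adjust it if it differs. missing_document_ids is a hypothetical helper.

```python
from typing import Any, Hashable


def missing_document_ids(result: Any) -> set[Hashable]:
    """Return document ids referenced by chunks, entities, or relationships
    that are absent from result.documents (empty when the invariant holds)."""
    known = {doc.id for doc in result.documents}  # `.id` is an assumption
    referenced = {chunk.document_id for chunk in result.chunks}
    for item in (*result.entities, *result.relationships):
        referenced.update(item.source_document_ids)
    return referenced - known
```

An empty return value means every referenced document is present in documents.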

`engine_info["filter"]`

On a recall(filter=...), engine_info["filter"] carries a FilterPushdownReport: an honest account of how the filter was handled. Both it and the per-channel FilterChannelReport are public (exported from khora).

Field	Type	Notes
`pushed_down`	`bool`	`True` only when the filter is fully pushed: every constraint leaf landed in the backend query and nothing was re-checked in memory.
`post_filtered`	`bool`	`True` when any leaf was re-checked in memory, or a channel ran a defensive full-predicate re-check.
`pushed_keys`	`list[str]`	Dotted leaf keys pushed into the backend query on every gating channel (sorted).
`post_filtered_keys`	`list[str]`	Dotted leaf keys re-checked in memory on at least one gating channel (sorted).
`unenforced_keys`	`list[str]`	Dotted leaf keys no channel enforced. Empty on a correct recall; a non-empty value signals silent under-enforcement.
`channels`	`dict[str, FilterChannelReport]`	Per-channel breakdown; each entry carries that channel’s own `pushed_keys` / `post_filtered_keys`.

pushed_keys, post_filtered_keys, and unenforced_keys form a total, disjoint partition of the filter’s constraint leaves. See Recall filters for the full filter grammar.
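The partition property makes the report easy to audit after a filtered recall. This sketch flags overlapping buckets and any unenforced keys. Whether engine_info["filter"] arrives as an object or in dict form is not stated here, so the helper accepts both; audit_filter_report and _field are hypothetical names.

```python
from typing import Any, Iterable


def _field(report: Any, name: str) -> Iterable[str]:
    # Accept either an object with attributes or its dict form.
    if isinstance(report, dict):
        return report.get(name, [])
    return getattr(report, name, [])


def audit_filter_report(report: Any) -> list[str]:
    """Return human-readable problems found in a FilterPushdownReport-like value."""
    pushed = set(_field(report, "pushed_keys"))
    post = set(_field(report, "post_filtered_keys"))
    unenforced = set(_field(report, "unenforced_keys"))

    problems: list[str] = []
    # The three buckets are documented as disjoint.
    overlap = (pushed & post) | (pushed & unenforced) | (post & unenforced)
    if overlap:
        problems.append(f"keys in more than one bucket: {sorted(overlap)}")
    # Non-empty unenforced_keys signals silent under-enforcement.
    if unenforced:
        problems.append(f"unenforced filter keys: {sorted(unenforced)}")
    return problems
```

An empty list means the report is consistent and every constraint leaf was enforced somewhere.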

`SearchMode`

SearchMode.VECTOR    # pgvector / HNSW only
SearchMode.GRAPH     # Cypher / graph traversal only
SearchMode.KEYWORD   # BM25 / full-text only (always runs the BM25 channel)
SearchMode.HYBRID    # vector + graph, fused via RRF (default); BM25 joins only when enable_bm25_channel is on
SearchMode.ALL       # like HYBRID, but weights BM25 equally once the keyword channel is enabled

Engines

from khora import create_engine, list_engines, register_engine

list_engines()                                # registered engine names
register_engine("my_engine", "my.module", "MyEngineClass")   # lazy registration

vectorcypher is the default and the engine these docs cover. Prefer the engine= argument to Khora(...) over create_engine directly. Custom engines must implement the full MemoryEngineProtocol. See VectorCypher.

Expertise

ExpertiseConfig (a stable public API) defines a domain ontology, with entity/relationship types plus a system prompt, correlation rules, and inference rules. See Expertise & ontologies for the full guide:

from khora import ExpertiseConfig, EntityTypeConfig, RelationshipTypeConfig

expertise = ExpertiseConfig(
    name="medical_research",
    entity_types=[EntityTypeConfig(name="DRUG", description="..."), ...],
    relationship_types=[RelationshipTypeConfig(name="TREATS", description="..."), ...],
)
await kb.remember(content, namespace=ns.namespace_id, expertise=expertise,
                  entity_types=expertise.get_entity_type_names(),
                  relationship_types=expertise.get_relationship_type_names())

Hooks & errors

The hooks surface is kb.subscribe(event_type, callback, filter=None) / kb.unsubscribe(id) / kb.hooks. For delivery that survives a restart, kb.subscribe_persistent(event_type, delivery, *, filter=None, namespace_id=None) / kb.unsubscribe_persistent(id) record a webhook or queue target to PostgreSQL. See Semantic hooks. All domain errors subclass KhoraError; catch it at system boundaries.

from khora import KhoraError
try:
    await kb.remember(...)
except KhoraError as exc:
    ...

Ingestion

What remember() / remember_batch() / submit_batch() do under the hood.

Retrieval

What recall() does and how to read a RecallResult.
