A namespace is Khora’s sole unit of isolation. Every document, chunk, entity,
relationship, and event belongs to exactly one namespace. There’s no organization or
workspace hierarchy inside Khora. Higher-level grouping (orgs, teams, users) is the
consuming application’s job, usually one namespace per tenant.
Namespaces are required
Every write and every read takes a namespace. There is no default. Omitting it
raises ValueError.
async with Khora() as kb:
    ns = await kb.create_namespace()
    await kb.remember(
        "Important content…",
        namespace=ns.namespace_id,
        entity_types=["PERSON", "ORG"],
        relationship_types=["WORKS_AT"],
    )
    results = await kb.recall("what's in there?", namespace=ns.namespace_id)
The dual-ID scheme
A MemoryNamespace carries two UUIDs. Knowing which is which is the key to using the
API correctly:
| ID | Purpose | Changes? |
|---|---|---|
| namespace_id | Stable identifier across all versions, hold this in your application | Never |
| id | Row-level primary key (the “version handle”) | Per version (v1, v2, …) |
The high-level facade (kb.remember, kb.recall, kb.list_entities) takes the
stable namespace_id and resolves it to the active version’s id automatically
(one indexed, sub-millisecond lookup). The storage layer (kb.storage.*) takes a row
id because each call is scoped to a specific version. Child rows (documents,
chunks, entities) foreign-key to id, not namespace_id.
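
The relationship between the two ids can be modelled in a few lines. This is a toy in-memory sketch, not Khora's implementation: NamespaceRow, is_active, resolve_active, and documents_by_row_id are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class NamespaceRow:
    """Hypothetical stand-in for one version of a namespace."""

    namespace_id: UUID  # stable id, shared by every version
    version: int
    is_active: bool
    id: UUID = field(default_factory=uuid4)  # row-level id, unique per version


def resolve_active(rows: list[NamespaceRow], stable_id: UUID) -> UUID:
    """Map a stable namespace_id to the row id of its active version."""
    for row in rows:
        if row.namespace_id == stable_id and row.is_active:
            return row.id
    raise LookupError(f"no active version for namespace {stable_id}")


stable_id = uuid4()
v1 = NamespaceRow(namespace_id=stable_id, version=1, is_active=False)
v2 = NamespaceRow(namespace_id=stable_id, version=2, is_active=True)
rows = [v1, v2]

# Child rows are keyed by the row id, so each version keeps its own data.
documents_by_row_id = {v1.id: ["old corpus"], v2.id: ["re-ingested corpus"]}

# A facade-style call starts from the stable id and lands on the active version.
active_row_id = resolve_active(rows, stable_id)
facade_view = documents_by_row_id[active_row_id]

# A storage-style call names one specific version directly.
v1_view = documents_by_row_id[v1.id]
```

The application only ever needs to persist stable_id. The row ids matter when you deliberately want to address one version, for example to compare v1 against v2.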
The isolation contract
Isolation isn’t post-filtering in Python. It’s enforced in the query layer and
baked into the storage Protocol. Every read, exists-check, and mutation on every
backend declares *, namespace_id: UUID as a required keyword-only argument and
filters at the SQL WHERE / Cypher MATCH {namespace_id} layer.
Looking up an id that belongs to a different namespace returns None / False /
an empty result straight from the query. Existence never leaks as a timing oracle:
doc = await kb.get_document(document_id, namespace=caller_ns)
# None if the id doesn't exist OR belongs to another namespace.
neighbourhood = await kb.find_related_entities(entity_id, namespace=caller_ns, max_depth=2)
# Traversal never crosses namespaces — a seed outside caller_ns yields nothing.
The namespaces quickstart example demonstrates this with a
cross-tenant “needle” leak test.
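
The query-layer filtering described above can be illustrated with the standard-library sqlite3 module. This is a sketch under assumptions, not Khora's storage code: the documents table, its columns, and this get_document helper are hypothetical and only mirror the shape of the contract (a required keyword-only namespace_id that ends up in the WHERE clause).

```python
import sqlite3
from uuid import uuid4

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    " id TEXT PRIMARY KEY,"
    " namespace_id TEXT NOT NULL,"
    " body TEXT NOT NULL)"
)


def get_document(conn: sqlite3.Connection, document_id: str, *, namespace_id: str):
    """Fetch a document only if it lives in the caller's namespace."""
    # The namespace filter is part of the query itself, so a row owned by
    # another namespace is never loaded into Python in the first place.
    return conn.execute(
        "SELECT id, body FROM documents WHERE id = ? AND namespace_id = ?",
        (document_id, namespace_id),
    ).fetchone()


tenant_a, tenant_b = str(uuid4()), str(uuid4())
doc_id = str(uuid4())
conn.execute("INSERT INTO documents VALUES (?, ?, ?)", (doc_id, tenant_a, "needle"))

own = get_document(conn, doc_id, namespace_id=tenant_a)         # the owner sees it
foreign = get_document(conn, doc_id, namespace_id=tenant_b)     # another tenant does not
missing = get_document(conn, str(uuid4()), namespace_id=tenant_b)  # unknown id
```

In this sketch the foreign lookup and the missing lookup produce the same value, so the caller cannot tell the two cases apart from the result. Because namespace_id is keyword-only and has no default, leaving it out fails at call time instead of silently querying across tenants.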
Finding namespaces
ns = await kb.get_namespace_by_stable_id(stable_namespace_id) # by stable id (recommended)
ns = await kb.storage.get_namespace(row_id) # by row-level id
page = await kb.storage.list_namespaces() # active only by default
namespaces = page.items
Versioning
A namespace can have multiple versions under one stable namespace_id, useful when
you want to re-ingest a corpus with a better extractor, A/B-test a chunking change, or
snapshot a graph before rebuilding it.
kb.storage.create_namespace_version(previous_version=…) increments the version,
deactivates the previous version, and creates the new version as active. The swap
is a single atomic step:
v1 = await kb.create_namespace()
stable_id = v1.namespace_id
await kb.remember(text, namespace=stable_id, ...) # lands in v1 (the active version)
# Cut v2. v1 is deactivated; v2 becomes the active version under the same stable id.
v2 = await kb.storage.create_namespace_version(previous_version=v1)
await kb.remember(new_text, namespace=stable_id, ...) # facade now resolves to v2
# v1's data is untouched and still readable via the storage layer by its row id:
v1_entities = await kb.storage.list_entities(namespace_id=v1.id)
The facade always serves whichever version is active. Older versions are read-only and
addressable only through kb.storage.* by their row id. Keep them around for
comparison or rollback until you’re confident in the new one.
Cutting a new version activates it immediately and deactivates the old one. The
facade starts resolving to the new (initially empty) version right away. So
re-ingestion is not invisible to readers: populate the new version promptly, or
build it out of band and switch traffic at your own routing layer if you need
continuity. To roll back, reactivate the prior version through the storage layer
(update_namespace + deactivate_namespace).
See Workloads → Namespace versioning for a runnable
end-to-end walkthrough of the dual-UUID model.
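
One way to picture the atomic swap and the rollback is a single database transaction that flips the active flag. The sketch below uses sqlite3 as a stand-in. How Khora implements the swap internally is not shown here, and the namespaces table, the is_active column, and the cut_version, active_row_id, and roll_back helpers are all hypothetical.

```python
import sqlite3
from uuid import uuid4

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE namespaces ("
    " id TEXT PRIMARY KEY,"          # row-level id, one per version
    " namespace_id TEXT NOT NULL,"   # stable id, shared by all versions
    " version INTEGER NOT NULL,"
    " is_active INTEGER NOT NULL)"
)


def create_first_version(conn: sqlite3.Connection) -> tuple[str, str]:
    row_id, stable_id = str(uuid4()), str(uuid4())
    with conn:
        conn.execute("INSERT INTO namespaces VALUES (?, ?, 1, 1)", (row_id, stable_id))
    return row_id, stable_id


def cut_version(conn: sqlite3.Connection, previous_row_id: str) -> str:
    """Deactivate the previous version and insert the next one as active."""
    new_row_id = str(uuid4())
    with conn:  # both writes commit together, or both roll back on error
        stable_id, version = conn.execute(
            "SELECT namespace_id, version FROM namespaces WHERE id = ?",
            (previous_row_id,),
        ).fetchone()
        conn.execute("UPDATE namespaces SET is_active = 0 WHERE id = ?", (previous_row_id,))
        conn.execute(
            "INSERT INTO namespaces VALUES (?, ?, ?, 1)",
            (new_row_id, stable_id, version + 1),
        )
    return new_row_id


def active_row_id(conn: sqlite3.Connection, stable_id: str):
    row = conn.execute(
        "SELECT id FROM namespaces WHERE namespace_id = ? AND is_active = 1",
        (stable_id,),
    ).fetchone()
    return row[0] if row else None


def roll_back(conn: sqlite3.Connection, current_row_id: str, prior_row_id: str) -> None:
    """Reactivate the prior version and deactivate the current one."""
    with conn:
        conn.execute("UPDATE namespaces SET is_active = 0 WHERE id = ?", (current_row_id,))
        conn.execute("UPDATE namespaces SET is_active = 1 WHERE id = ?", (prior_row_id,))


v1_id, stable_id = create_first_version(conn)
v2_id = cut_version(conn, v1_id)
active_after_cut = active_row_id(conn, stable_id)
roll_back(conn, v2_id, v1_id)
active_after_rollback = active_row_id(conn, stable_id)
```

Readers that resolve through the stable id see v2 as soon as the transaction commits, which is why an empty v2 is immediately visible, and they see v1 again after the rollback.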
Per-namespace configuration
A namespace can override global settings, handy when one dataset needs a different
embedding model or stricter thresholds. Overrides take priority over global config:
namespace = await kb.create_namespace(config_overrides={
    "embedding_model": "text-embedding-3-large",
    "embedding_dimension": 3072,
    "min_chunk_similarity": 0.5,
})
Sync checkpoints
Each namespace tracks where it left off syncing from external sources, which is what
makes incremental ingestion stateful:
namespace.sync_checkpoints = {
    "slack": "1706140800",            # Unix timestamp
    "linear": "2024-01-25T00:00:00Z", # ISO 8601
}
Read the last checkpoint, fetch only newer records, ingest them, then advance the
checkpoint via kb.storage.set_sync_checkpoint(namespace_id, source, value).
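
The read, fetch, ingest, advance loop can be sketched without any external service. Everything below is illustrative: sync_source, the record shape, and the ingest callback are hypothetical, and a plain dict stands in for the namespace's checkpoints. In real use the final step would go through kb.storage.set_sync_checkpoint as described above.

```python
from typing import Callable


def sync_source(
    source: str,
    records: list[dict],
    checkpoints: dict[str, str],
    ingest: Callable[[dict], None],
) -> int:
    """Ingest only records newer than the stored checkpoint, then advance it."""
    last_seen = int(checkpoints.get(source, "0"))
    newer = sorted(
        (r for r in records if r["ts"] > last_seen),
        key=lambda r: r["ts"],
    )
    for record in newer:
        ingest(record)
    if newer:
        # Advance only after ingestion, so a failure re-fetches instead of skipping.
        checkpoints[source] = str(newer[-1]["ts"])
    return len(newer)


checkpoints = {"slack": "1706140800"}  # Unix timestamp stored as a string
records = [
    {"ts": 1706140700, "text": "already ingested"},
    {"ts": 1706140900, "text": "new message"},
    {"ts": 1706141000, "text": "newer message"},
]
ingested: list[str] = []

first_run = sync_source("slack", records, checkpoints, lambda r: ingested.append(r["text"]))
second_run = sync_source("slack", records, checkpoints, lambda r: ingested.append(r["text"]))
```

The second run finds nothing to do because the checkpoint already points at the newest record, which is the property that makes repeated syncs cheap.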
Storage backends
How the namespace-scoped rows are physically stored and routed.
Data model
What lives inside a namespace: documents, chunks, entities, events.