Namespaces

A namespace is Khora’s sole unit of isolation. Every document, chunk, entity, relationship, and event belongs to exactly one namespace. There’s no organization or workspace hierarchy inside Khora. Higher-level grouping (orgs, teams, users) is the consuming application’s job; the usual mapping is one namespace per tenant.
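Since the tenant-to-namespace mapping lives in the application, a small registry is usually enough. The sketch below is one way to hold it, not part of Khora: `TenantNamespaces`, `register`, and `namespace_for` are hypothetical names, and the UUIDs stand in for the stable `namespace_id` values returned when namespaces are created.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TenantNamespaces:
    """Hypothetical app-side registry: one stable namespace_id per tenant."""

    _by_tenant: dict = field(default_factory=dict)

    def register(self, tenant_id: str, namespace_id: uuid.UUID) -> None:
        # Refuse to silently re-point a tenant at a different namespace.
        existing = self._by_tenant.get(tenant_id)
        if existing is not None and existing != namespace_id:
            raise ValueError(f"tenant {tenant_id!r} already has a namespace")
        self._by_tenant[tenant_id] = namespace_id

    def namespace_for(self, tenant_id: str) -> uuid.UUID:
        # Fail loudly on unknown tenants rather than falling back to a default,
        # mirroring the "no default namespace" rule.
        try:
            return self._by_tenant[tenant_id]
        except KeyError:
            raise LookupError(f"no namespace for tenant {tenant_id!r}") from None
```

In a real application this mapping would typically be persisted alongside the tenant record rather than kept in memory.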

Namespaces are required

Every write and every read takes a namespace. There is no default. Omitting it raises ValueError.

async with Khora() as kb:
    ns = await kb.create_namespace()

    await kb.remember(
        "Important content…",
        namespace=ns.namespace_id,
        entity_types=["PERSON", "ORG"],
        relationship_types=["WORKS_AT"],
    )
    results = await kb.recall("what's in there?", namespace=ns.namespace_id)

The dual-ID scheme

A MemoryNamespace carries two UUIDs. Knowing which is which is the key to using the API correctly:

ID	Purpose	Changes?
`namespace_id`	Stable identifier across all versions; hold this in your application	Never
`id`	Row-level primary key (the “version handle”)	Per version (v1, v2, …)

The high-level facade (kb.remember, kb.recall, kb.list_entities) takes the stable namespace_id and resolves it to the active version’s id automatically (one indexed, sub-millisecond lookup). The storage layer (kb.storage.*) takes a row id because each call is scoped to a specific version. Child rows (documents, chunks, entities) foreign-key to id, not namespace_id.
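To make the resolution step concrete, here is a minimal in-memory model of it. It is a sketch under assumptions, not Khora's implementation: `NamespaceRow`, its `version` and `is_active` fields, and `resolve_active_row_id` are illustrative names, and a real backend would do this as an indexed query rather than a list scan.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceRow:
    id: uuid.UUID            # row-level key, new for every version
    namespace_id: uuid.UUID  # stable id, shared by all versions
    version: int             # assumed field name
    is_active: bool          # assumed field name


def resolve_active_row_id(rows: list, namespace_id: uuid.UUID) -> uuid.UUID:
    """Map a stable namespace_id to the row id of its active version."""
    active = [r for r in rows if r.namespace_id == namespace_id and r.is_active]
    if len(active) != 1:
        raise LookupError(f"expected exactly one active version, found {len(active)}")
    return active[0].id


stable = uuid.uuid4()
v1 = NamespaceRow(id=uuid.uuid4(), namespace_id=stable, version=1, is_active=False)
v2 = NamespaceRow(id=uuid.uuid4(), namespace_id=stable, version=2, is_active=True)

# The facade-style call takes the stable id and lands on v2's row id;
# v1 stays reachable only by passing v1.id directly.
row_id = resolve_active_row_id([v1, v2], stable)
```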

The isolation contract

Isolation isn’t post-filtering in Python. It’s enforced in the query layer and baked into the storage Protocol. Every read, exists-check, and mutation on every backend declares *, namespace_id: UUID as a required keyword-only argument and filters at the SQL WHERE / Cypher MATCH {namespace_id} layer. Looking up an id that belongs to a different namespace returns None / False / an empty result straight from the query. A foreign id is indistinguishable from a missing one, so existence never leaks, including as a timing oracle:

doc = await kb.get_document(document_id, namespace=caller_ns)
# None if the id doesn't exist OR belongs to another namespace.

neighbourhood = await kb.find_related_entities(entity_id, namespace=caller_ns, max_depth=2)
# Traversal never crosses namespaces — a seed outside caller_ns yields nothing.

The namespaces quickstart example demonstrates this with a cross-tenant “needle” leak test.
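The same idea can be shown with nothing but the standard library. The sketch below uses SQLite and a simplified one-table schema, both chosen for illustration and not taken from Khora; `make_db`, `put_document`, and `get_document` are hypothetical helpers. The point is that the namespace predicate sits inside the WHERE clause and the parameter is keyword-only, so it cannot be forgotten.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id TEXT PRIMARY KEY, namespace_id TEXT NOT NULL, body TEXT)"
    )
    return conn


def put_document(conn, *, namespace_id: uuid.UUID, body: str) -> uuid.UUID:
    doc_id = uuid.uuid4()
    conn.execute(
        "INSERT INTO documents (id, namespace_id, body) VALUES (?, ?, ?)",
        (str(doc_id), str(namespace_id), body),
    )
    return doc_id


def get_document(conn, doc_id: uuid.UUID, *, namespace_id: uuid.UUID):
    # The namespace filter is part of the query itself, so a row owned by
    # another namespace is never fetched and the caller just sees None.
    row = conn.execute(
        "SELECT body FROM documents WHERE id = ? AND namespace_id = ?",
        (str(doc_id), str(namespace_id)),
    ).fetchone()
    return row[0] if row else None


conn = make_db()
tenant_a, tenant_b = uuid.uuid4(), uuid.uuid4()
needle = put_document(conn, namespace_id=tenant_a, body="needle")

own = get_document(conn, needle, namespace_id=tenant_a)      # the owner sees it
foreign = get_document(conn, needle, namespace_id=tenant_b)  # other tenant gets None
```

Calling `get_document(conn, needle)` without `namespace_id` fails with a `TypeError` before any query runs, which is what a required keyword-only argument buys you.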

Finding namespaces

ns = await kb.get_namespace_by_stable_id(stable_namespace_id)  # by stable id (recommended)
ns = await kb.storage.get_namespace(row_id)                    # by row-level id
page = await kb.storage.list_namespaces()                      # active only by default
namespaces = page.items

Versioning

A namespace can have multiple versions under one stable namespace_id, useful when you want to re-ingest a corpus with a better extractor, A/B-test a chunking change, or snapshot a graph before rebuilding it. kb.storage.create_namespace_version(previous_version=…) increments the version, deactivates the previous version, and creates the new version as active. The swap is a single atomic step:

v1 = await kb.create_namespace()
stable_id = v1.namespace_id

await kb.remember(text, namespace=stable_id, ...)   # lands in v1 (the active version)

# Cut v2. v1 is deactivated; v2 becomes the active version under the same stable id.
v2 = await kb.storage.create_namespace_version(previous_version=v1)
await kb.remember(new_text, namespace=stable_id, ...)   # facade now resolves to v2

# v1's data is untouched and still readable via the storage layer by its row id:
v1_entities = await kb.storage.list_entities(v1.id)

The facade always serves whichever version is active. Older versions are read-only and addressable only through kb.storage.* by their row id. Keep them around for comparison or rollback until you’re confident in the new one.

Because the new version is active (and initially empty) from the moment it is cut, the facade starts resolving to it right away, and re-ingestion is not invisible to readers. Populate the new version promptly, or build it out of band and switch traffic at your own routing layer if you need continuity. To roll back, reactivate the prior version through the storage layer (update_namespace + deactivate_namespace).
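One way to picture the atomic swap and a later rollback is a pair of small transactions. The following is a sketch over SQLite with an assumed `namespaces` table; `cut_version`, `activate`, and `active_row_id` are hypothetical helpers that only mimic the behaviour described here and are not Khora's storage methods.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE namespaces ("
        " id TEXT PRIMARY KEY, namespace_id TEXT NOT NULL,"
        " version INTEGER NOT NULL, is_active INTEGER NOT NULL)"
    )
    return conn


def create_namespace(conn):
    row_id, stable_id = str(uuid.uuid4()), str(uuid.uuid4())
    conn.execute("INSERT INTO namespaces VALUES (?, ?, 1, 1)", (row_id, stable_id))
    return row_id, stable_id


def cut_version(conn, stable_id: str) -> str:
    """Deactivate the current version and insert the next one in one transaction."""
    new_row_id = str(uuid.uuid4())
    conn.execute("BEGIN IMMEDIATE")
    try:
        (latest,) = conn.execute(
            "SELECT MAX(version) FROM namespaces WHERE namespace_id = ?", (stable_id,)
        ).fetchone()
        conn.execute(
            "UPDATE namespaces SET is_active = 0 WHERE namespace_id = ?", (stable_id,)
        )
        conn.execute(
            "INSERT INTO namespaces VALUES (?, ?, ?, 1)",
            (new_row_id, stable_id, latest + 1),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return new_row_id


def activate(conn, stable_id: str, row_id: str) -> None:
    """Roll back (or forward) by making row_id the only active version."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        # (id = ?) evaluates to 1 for the chosen row and 0 for its siblings.
        conn.execute(
            "UPDATE namespaces SET is_active = (id = ?) WHERE namespace_id = ?",
            (row_id, stable_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def active_row_id(conn, stable_id: str) -> str:
    (row_id,) = conn.execute(
        "SELECT id FROM namespaces WHERE namespace_id = ? AND is_active = 1",
        (stable_id,),
    ).fetchone()
    return row_id
```

Readers that resolve through `active_row_id` see either the old version or the new one, never a state with zero or two active versions, because both statements commit together.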

See Workloads → Namespace versioning for a runnable end-to-end walkthrough of the dual-UUID model.

Per-namespace configuration

A namespace can override global settings, handy when one dataset needs a different embedding model or stricter thresholds. Overrides take priority over global config:

namespace = await kb.create_namespace(config_overrides={
    "embedding_model": "text-embedding-3-large",
    "embedding_dimension": 3072,
    "min_chunk_similarity": 0.5,
})
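"Overrides take priority" can be read as a layered lookup. The sketch below assumes a shallow, key-by-key merge, which is an assumption about the semantics and not confirmed by this page; `GLOBAL_CONFIG` holds illustrative values that are not Khora's defaults, and `effective_config` is a hypothetical helper.

```python
from typing import Optional

# Illustrative global settings only; these are not Khora's actual defaults.
GLOBAL_CONFIG = {
    "embedding_model": "text-embedding-3-small",
    "embedding_dimension": 1536,
    "min_chunk_similarity": 0.3,
}


def effective_config(global_config: dict, overrides: Optional[dict]) -> dict:
    """Return global settings with per-namespace overrides layered on top."""
    # Later keys win, so an override replaces the global value for the same key.
    # A new dict is returned; neither input is mutated.
    return {**global_config, **(overrides or {})}


cfg = effective_config(
    GLOBAL_CONFIG,
    {"embedding_model": "text-embedding-3-large", "embedding_dimension": 3072},
)
```

Here `cfg` uses the overridden model and dimension while `min_chunk_similarity` falls through to the global value.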

Sync checkpoints

Each namespace tracks where it left off syncing from external sources, which is what makes incremental ingestion stateful:

namespace.sync_checkpoints = {
    "slack":  "1706140800",            # Unix timestamp
    "linear": "2024-01-25T00:00:00Z",  # ISO 8601
}

Read the last checkpoint, fetch only newer records, ingest them, then advance the checkpoint via kb.storage.set_sync_checkpoint(namespace_id, source, value).
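The read, fetch, ingest, advance loop might look like the following. This is a self-contained sketch: a plain dict stands in for the namespace's checkpoint store, and `Record`, `sync_once`, `fetch_since`, and `ingest` are hypothetical names. It assumes the source returns records in ascending cursor order. With Khora, the final write would go through kb.storage.set_sync_checkpoint instead of the dict assignment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Record:
    cursor: str  # source-specific position, e.g. an ISO 8601 timestamp
    text: str


def sync_once(
    checkpoints: dict,
    source: str,
    fetch_since: Callable[[Optional[str]], Iterable[Record]],
    ingest: Callable[[Record], None],
) -> int:
    """Ingest records newer than the stored checkpoint, then advance it."""
    last = checkpoints.get(source)  # None on the very first sync
    newest = last
    count = 0
    for record in fetch_since(last):
        ingest(record)
        newest = record.cursor
        count += 1
    # Advance only after every record was ingested. If ingest() raises, the
    # checkpoint stays put and the next run re-fetches the same batch, so
    # ingestion should be idempotent.
    if newest != last:
        checkpoints[source] = newest
    return count
```

Running `sync_once` twice in a row against an unchanged source ingests nothing the second time, which is the property that makes the ingestion incremental.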

Storage backends

How the namespace-scoped rows are physically stored and routed.

Data model

What lives inside a namespace: documents, chunks, entities, events.
