The examples/00_quickstart/ directory in the Khora repo holds four self-contained tutorials. They’re the shortest path from “installed” to “I understand the core loop.” Read them in order. Each is under ~150 lines and teaches one idea.

Before you run

Every quickstart runs without Postgres, Neo4j, or any external service. The embedded backend (sqlite_lance: SQLite + LanceDB in-process) handles relational, vector, and graph storage in a single local directory, configured by examples/khora.embedded.yaml. You only need the embedded extra and an OpenAI key for the real extraction and embedding calls:
pip install "khora[sqlite-lance]"
export OPENAI_API_KEY=sk-...

python examples/00_quickstart/01_remember_recall.py
The shared config selects gpt-4o-mini for extraction and text-embedding-3-small for embeddings. Override any field with the matching env var (e.g. KHORA_LLM_MODEL=gpt-4o) without editing the YAML.
Because everything runs in-process, the quickstarts stay fast and inexpensive. The same core loop runs unchanged on the production PostgreSQL + pgvector + Neo4j stack.

Remember & recall

The smallest viable program: create a namespace, remember a few facts, recall them. The point is the contrast with the straw-man everyone reaches for first, a keyword scan over a Python list. Ask “What food allergies should I know about?” against the fact “Alice can’t eat peanuts, anaphylaxis” and the keyword scan returns nothing (no shared words). Semantic recall returns the right chunk.
async with Khora(config, run_migrations=True) as kb:   # default engine: VectorCypher
    namespace = await kb.create_namespace()
    ns_id = namespace.namespace_id

    for fact in facts:
        await kb.remember(
            fact,
            namespace=ns_id,
            entity_types=["PERSON", "CONCEPT", "LOCATION"],
            relationship_types=["RELATES_TO"],
        )

    result = await kb.recall("What food allergies should I know about?", namespace=ns_id)
    for chunk in result.chunks[:3]:
        print(f"[{chunk.score:.2f}] {chunk.content}")
Takeaway: recall() returns a RecallResult. Iterate result.chunks, each carrying .content and .score. Rephrasing the query (“any food-related medical issues?”) hits the same chunk. That’s semantic retrieval, not string matching.
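For contrast, the keyword-scan straw-man described above can be sketched in a few lines of plain Python. This is an illustration only, not code from the tutorial: the helper names tokenize and keyword_scan, the second fact, and the second query are assumptions made up for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_scan(query: str, facts: list[str]) -> list[str]:
    """Return every fact that shares at least one word with the query."""
    query_words = tokenize(query)
    return [fact for fact in facts if query_words & tokenize(fact)]


# Illustrative corpus: only the first fact comes from the text above,
# the second is a made-up filler fact.
facts = [
    "Alice can't eat peanuts, anaphylaxis",
    "Bob prefers window seats on flights",
]

# No word overlaps with the allergy fact, so the scan finds nothing.
misses = keyword_scan("What food allergies should I know about?", facts)

# The scan only works when the query happens to reuse a stored word.
hits = keyword_scan("Who reacts to peanuts?", facts)

print(misses)
print(hits)
```

The scan succeeds only when the question reuses a literal word from the stored fact, which is exactly the gap semantic recall closes.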

Grounded answers & abstention

The most common production-RAG failure isn’t a bad answer. It’s confidently answering when the corpus has nothing to say. Khora surfaces abstention signals so your app can refuse without rolling its own confidence threshold. The displayed chunk.score is a normalized rank within a result. It is not the right signal for “is this corpus relevant?”. That signal lives in result.engine_info:
result = await kb.recall(question, namespace=ns_id, limit=3)

# Raw pre-rerank cosine of the strongest semantic hit — the right abstention input.
raw_top = result.engine_info.get("max_raw_vector_score", 0.0)

if not result.chunks or raw_top < 0.45:   # tune the floor to your corpus
    print("→ I don't know.")              # abstain
    return
print(f"[raw_top {raw_top:.2f}] {result.chunks[0].content}")
The tutorial asks four questions: two in-corpus, one adjacent, and one off-topic (“Who won the World Cup in 2022?”). The off-topic question falls below the floor, so the app abstains. Takeaway: compare engine_info["max_raw_vector_score"] against a floor (below ~0.3 means nothing on-topic; above ~0.5 is a confident match), or use the precomputed engine_info["abstention_signals"] dict (chunks_empty, top_score_low, should_abstain, …) directly. Abstention signals are emitted by the VectorCypher engine.
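One way to fold both signals into a single decision is a small pure function over the engine_info mapping. The sketch below is an assumption-laden illustration: the function name should_abstain, the choice to prefer the precomputed verdict over the local floor, and the hand-built dicts are all hypothetical. Only the key names max_raw_vector_score, abstention_signals, and should_abstain come from the text above.

```python
from typing import Any, Mapping


def should_abstain(
    engine_info: Mapping[str, Any],
    has_chunks: bool,
    floor: float = 0.45,  # tune the floor to your corpus
) -> bool:
    """Decide whether to answer "I don't know" for one recall result."""
    # Nothing retrieved at all: always abstain.
    if not has_chunks:
        return True

    # Assumption: when a precomputed verdict is present, trust it.
    signals = engine_info.get("abstention_signals")
    if isinstance(signals, Mapping) and "should_abstain" in signals:
        return bool(signals["should_abstain"])

    # Otherwise compare the raw top score against the local floor.
    # A missing score is treated as 0.0, which abstains.
    return engine_info.get("max_raw_vector_score", 0.0) < floor


# Hand-built mappings standing in for result.engine_info.
confident = should_abstain({"max_raw_vector_score": 0.62}, has_chunks=True)
off_topic = should_abstain({"max_raw_vector_score": 0.21}, has_chunks=True)
print(confident, off_topic)
```

Keeping the decision in one function means the floor is tuned in a single place, and an engine that does not report these keys falls back to abstaining rather than answering.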

Forget what was wrong

Memories ingested under bad assumptions don’t fix themselves. Re-ingesting a corrected fact doesn’t erase the old one. Both stay retrievable. The supported “I had it wrong” workflow is forget, then re-remember:
wrong = await kb.remember(
    "The team standup is at 9:00 AM every weekday.",
    namespace=ns_id,
    title="standup time (wrong)",
    entity_types=["EVENT", "CONCEPT"],
    relationship_types=["RELATES_TO"],
)

# forget() takes the document_id from remember()'s return value — nothing else.
ok = await kb.forget(wrong.document_id, namespace=ns_id)

# Re-ingest the correction as a *separate* document.
await kb.remember("The team standup is at 10:00 AM every weekday.", namespace=ns_id, ...)
Takeaway: forget() accepts only a document_id (no fuzzy content match: that’s racy and overshoots). Store the id returned by remember() alongside the business object the memory represents. Memories are write-once. There’s no in-place mutation, which preserves the audit trail.
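The advice to store the id alongside the business object can be sketched as a small registry that maps a business key to the current document_id. Everything here is hypothetical scaffolding: FakeMemory is an in-memory stand-in written for this example, not the Khora client, and the helper correct, the registry dict, and the key "standup-time" are made-up names. The sketch also assumes forget() returns a truthy value on success, as the ok variable above suggests.

```python
import asyncio
import itertools
from dataclasses import dataclass


@dataclass
class StoredDoc:
    document_id: str


class FakeMemory:
    """Hypothetical in-memory stand-in for the client; NOT the real library."""

    def __init__(self) -> None:
        self.docs: dict[tuple[str, str], str] = {}
        self._ids = itertools.count(1)

    async def remember(self, text: str, *, namespace: str) -> StoredDoc:
        doc_id = f"doc-{next(self._ids)}"
        self.docs[(namespace, doc_id)] = text
        return StoredDoc(doc_id)

    async def forget(self, document_id: str, *, namespace: str) -> bool:
        # True only if the document existed in this namespace.
        return self.docs.pop((namespace, document_id), None) is not None


async def correct(kb, registry: dict[str, str], key: str, new_text: str, namespace: str) -> str:
    """Forget the document tracked under `key`, then ingest the correction."""
    old_id = registry.get(key)
    if old_id is not None:
        forgotten = await kb.forget(old_id, namespace=namespace)
        if not forgotten:
            raise RuntimeError(f"could not forget {old_id!r}")
    doc = await kb.remember(new_text, namespace=namespace)
    registry[key] = doc.document_id  # track the replacement document
    return doc.document_id


async def demo() -> tuple[dict[str, str], list[str]]:
    kb, registry, ns = FakeMemory(), {}, "ns-demo"
    first = await kb.remember("The team standup is at 9:00 AM every weekday.", namespace=ns)
    registry["standup-time"] = first.document_id
    await correct(kb, registry, "standup-time", "The team standup is at 10:00 AM every weekday.", ns)
    return registry, list(kb.docs.values())


registry, remaining = asyncio.run(demo())
print(registry, remaining)
```

Because the registry always points at the latest document, a later correction never has to search by content to find what to forget.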

Namespaces for users

A namespace is Khora’s only tenancy boundary. Every write and every recall is scoped to a namespace. Omit namespace= and the call errors. This tutorial proves isolation the way a real cross-tenant leak audit would: bury a unique secret string (the needle) in Alice’s namespace, then try to extract it from Bob’s by exact string, by anchor noun, and by confidentiality cue. All three must miss.
alice_ns = (await kb.create_namespace()).namespace_id
bob_ns = (await kb.create_namespace()).namespace_id

await remember_each(kb, alice_ns, alice_facts)   # one fact contains the needle
await remember_each(kb, bob_ns, bob_facts)

# Isolation holds iff the needle never surfaces from Bob's namespace.
for query in [NEEDLE, "internal project codename", "what's confidential about me?"]:
    result = await kb.recall(query, namespace=bob_ns, limit=10)
    assert not any(NEEDLE in c.content for c in result.chunks)
Takeaway: scope every call with namespace=. Isolation is enforced at the storage layer, not by post-filtering in Python. So a query against one namespace can never return another’s chunks, no matter how it’s phrased.
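The same audit can be packaged as a helper that reports which probe queries leaked instead of asserting inline. This is a sketch under stated assumptions: FakeScopedStore is a hypothetical stand-in that returns every fact in the requested namespace regardless of the query (a deliberately leak-prone retriever), and leaking_queries, the needle value, and the sample facts are invented for illustration. Only the recall(query, namespace=..., limit=...) call shape and the result.chunks / .content attributes mirror the snippet above.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Chunk:
    content: str


@dataclass
class Result:
    chunks: list[Chunk] = field(default_factory=list)


class FakeScopedStore:
    """Hypothetical stand-in with one fact list per namespace; NOT the real library."""

    def __init__(self, facts_by_ns: dict[str, list[str]]) -> None:
        self._facts = facts_by_ns

    async def recall(self, query: str, *, namespace: str, limit: int = 10) -> Result:
        # Only the requested namespace is ever read, so nothing needs
        # post-filtering. The query is ignored: every fact is returned.
        scoped = self._facts.get(namespace, [])
        return Result([Chunk(text) for text in scoped[:limit]])


async def leaking_queries(kb, needle: str, namespace: str, queries: list[str]) -> list[str]:
    """Return the probe queries whose results contain the needle."""
    leaks = []
    for query in queries:
        result = await kb.recall(query, namespace=namespace, limit=10)
        if any(needle in chunk.content for chunk in result.chunks):
            leaks.append(query)
    return leaks


NEEDLE = "ZEPHYR-7741"  # made-up secret string
queries = [NEEDLE, "internal project codename", "what's confidential about me?"]
kb = FakeScopedStore({
    "alice": [f"Alice's internal project codename is {NEEDLE}.", "Alice likes tea."],
    "bob": ["Bob plays chess on Fridays."],
})

bob_leaks = asyncio.run(leaking_queries(kb, NEEDLE, "bob", queries))
alice_leaks = asyncio.run(leaking_queries(kb, NEEDLE, "alice", queries))
print(bob_leaks, alice_leaks)
```

Running the probe against Alice's own namespace as well is a useful sanity check: if the needle is not found there either, the audit is not actually able to detect a leak.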

Next steps


Core APIs

The next tier: batch ingest, recall filters, ontology config, and reading the entity graph back.

VectorCypher

The default engine behind these demos: hybrid vector + graph + keyword retrieval.