The examples/00_quickstart/
directory in the Khora repo holds four self-contained tutorials. They’re the
shortest path from “installed” to “I understand the core loop.” Read them in
order. Each is under ~150 lines and teaches one idea.
Before you run
Every quickstart runs without Postgres, Neo4j, or any external service. The embedded backend (sqlite_lance: SQLite + LanceDB in-process) handles
relational, vector, and graph storage in a single local directory, configured by
examples/khora.embedded.yaml.
You only need the embedded extra and an OpenAI key for the real extraction and
embedding calls. The defaults are gpt-4o-mini for extraction and
text-embedding-3-small for embeddings. Override any field with the matching
env var (e.g. KHORA_LLM_MODEL=gpt-4o) without editing the YAML.
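The env-over-YAML precedence can be pictured with a small sketch. This is not Khora's config loader: the dictionary keys (llm_model, embedding_model) and the resolve helper are hypothetical names chosen for illustration, and only KHORA_LLM_MODEL is taken from the text above.

```python
import os

# Hypothetical stand-in for values read from examples/khora.embedded.yaml.
YAML_DEFAULTS = {
    "llm_model": "gpt-4o-mini",
    "embedding_model": "text-embedding-3-small",
}


def resolve(field, env=None, prefix="KHORA_"):
    """Return the env override for `field` if one is set, else the YAML default."""
    env = os.environ if env is None else env
    return env.get(prefix + field.upper(), YAML_DEFAULTS[field])


# No override set: the YAML default wins.
print(resolve("llm_model", env={}))
# Override set: the env var wins and the YAML stays untouched.
print(resolve("llm_model", env={"KHORA_LLM_MODEL": "gpt-4o"}))
```

Passing env explicitly keeps the sketch deterministic. In a real shell you would export the variable before launching a quickstart.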
These quickstarts run on the embedded
sqlite_lance backend (SQLite + LanceDB
in-process), so they need no external services and stay fast and inexpensive. The
same core loop runs unchanged on the production PostgreSQL + pgvector + Neo4j stack.
Remember & recall
The smallest viable program: create a namespace, remember a few facts, recall them. The point is the contrast with the straw man everyone reaches for first, a keyword scan over a Python list. Ask “What food allergies should I know about?” against the fact “Alice can’t eat peanuts, anaphylaxis” and the keyword scan returns nothing (no shared words). Semantic recall returns the right chunk.
recall() returns a RecallResult. Iterate result.chunks, each
carrying .content and .score. Rephrasing the query (“any food-related
medical issues?”) hits the same chunk. That’s semantic retrieval, not string
matching.
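To make the straw man concrete, here is a minimal sketch of the keyword scan described above. It uses no Khora calls. The second fact is made-up filler, and tokenize and keyword_scan are illustrative helper names.

```python
import re

FACTS = [
    "Alice can't eat peanuts, anaphylaxis",
    "Bob prefers window seats on long flights",  # made-up filler fact
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def keyword_scan(query, facts):
    """Return the facts that share at least one word with the query."""
    query_words = tokenize(query)
    return [fact for fact in facts if query_words & tokenize(fact)]


# The allergy question shares no words with the allergy fact, so the scan misses it.
print(keyword_scan("What food allergies should I know about?", FACTS))
# A query that happens to reuse the fact's wording does match.
print(keyword_scan("Who can't eat peanuts?", FACTS))
```

The scan only succeeds when the query repeats the stored wording, which is the gap semantic recall is meant to close.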
Grounded answers & abstention
The most common production-RAG failure isn’t a bad answer. It’s confidently answering when the corpus has nothing to say. Khora surfaces abstention signals so your app can refuse without rolling its own confidence threshold.
The displayed chunk.score is a normalized rank within a result. It is not
the right signal for “is this corpus relevant?”. That signal lives in
result.engine_info. Compare
engine_info["max_raw_vector_score"] against a floor (below
~0.3 means nothing on-topic; above ~0.5 is a confident match). Or use the
precomputed engine_info["abstention_signals"] dict (chunks_empty,
top_score_low, should_abstain, …) directly. Abstention signals are emitted by
the VectorCypher engine.
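One way an app might turn these signals into a refuse-or-answer decision is sketched below. It operates on a plain dict shaped like the engine_info keys named above. The helper name should_abstain, the fallback order, and the sample values are assumptions for illustration, not Khora output.

```python
def should_abstain(engine_info, floor=0.3):
    """Decide whether to refuse, given a dict shaped like result.engine_info."""
    # Prefer the precomputed verdict when the engine supplies one.
    signals = engine_info.get("abstention_signals")
    if signals and "should_abstain" in signals:
        return bool(signals["should_abstain"])
    # Otherwise fall back to comparing the raw vector score with a floor.
    score = engine_info.get("max_raw_vector_score")
    if score is None:
        return True  # assumption: no score at all means nothing to ground on
    return score < floor


# Made-up sample payloads, not real engine output.
on_topic = {"max_raw_vector_score": 0.62}
off_topic = {
    "max_raw_vector_score": 0.12,
    "abstention_signals": {
        "chunks_empty": False,
        "top_score_low": True,
        "should_abstain": True,
    },
}

print(should_abstain(on_topic))
print(should_abstain(off_topic))
```

The 0.3 floor mirrors the rough threshold quoted above. Treat it as a starting point to tune against your own corpus.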
Forget what was wrong
Memories ingested under bad assumptions don’t fix themselves. Re-ingesting a corrected fact doesn’t erase the old one. Both stay retrievable. The supported “I had it wrong” workflow is forget, then re-remember.
forget() accepts only a document_id (no fuzzy content match: that’s
racy and overshoots). Store the id returned by remember() alongside the business
object the memory represents. Memories are write-once. There’s no in-place mutation,
which preserves the audit trail.
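The forget-then-re-remember pattern can be sketched with an in-memory stand-in. ToyMemory is not Khora: its method signatures are simplified assumptions (synchronous, no namespace argument), and the correct helper and the sample facts are hypothetical. The point is only the bookkeeping: keep the id next to the business object so the wrong memory can be removed precisely.

```python
import uuid


class ToyMemory:
    """In-memory stand-in, NOT Khora: write-once documents keyed by id."""

    def __init__(self):
        self._docs = {}

    def remember(self, text):
        """Store the text and return a fresh document id."""
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = text
        return doc_id

    def forget(self, document_id):
        """Delete exactly one document, addressed by id only."""
        del self._docs[document_id]

    def contents(self):
        return list(self._docs.values())


def correct(memory, record, new_text):
    """Forget the wrong memory by id, then remember the corrected one."""
    memory.forget(record["document_id"])
    record["document_id"] = memory.remember(new_text)


memory = ToyMemory()
# Business object that carries the id of the memory it produced.
profile = {
    "user": "alice",
    "document_id": memory.remember("Alice is allergic to shellfish"),
}
correct(memory, profile, "Alice is allergic to peanuts")
print(memory.contents())
```

Because the record's document_id is replaced rather than edited in place, the stored text itself is never mutated, matching the write-once model described above.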
Namespaces for users
A namespace is Khora’s only tenancy boundary. Every write and every recall is scoped to a namespace. Omit namespace= and the call errors. This tutorial
proves isolation the way a real cross-tenant leak audit would: bury a unique
secret string (the needle) in Alice’s namespace, then try to extract it from
Bob’s by exact string, by anchor noun, and by confidentiality cue. All three must
miss.
Isolation is enforced at the storage layer, not by post-filtering in Python,
so a query against one namespace can never return another’s chunks, no matter
how it’s phrased.
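The shape of such a needle audit can be sketched against a toy store. ToyStore is an in-memory stand-in, not Khora: it uses substring matching instead of semantic recall, and the needle string, the probes, and the leaks helper are all made up for illustration.

```python
class ToyStore:
    """In-memory stand-in, NOT Khora: chunks are partitioned by namespace."""

    def __init__(self):
        self._by_namespace = {}

    def remember(self, text, *, namespace):
        self._by_namespace.setdefault(namespace, []).append(text)

    def recall(self, query, *, namespace):
        # Only the caller's partition is ever read, so nothing is post-filtered.
        words = query.lower().split()
        return [
            chunk
            for chunk in self._by_namespace.get(namespace, [])
            if any(word in chunk.lower() for word in words)
        ]


NEEDLE = "ZEBRA-7731-QUARTZ"  # made-up secret string


def leaks(store, namespace, probes, needle):
    """Return the probes whose results in `namespace` contain the needle."""
    return [
        probe
        for probe in probes
        if any(needle in chunk for chunk in store.recall(probe, namespace=namespace))
    ]


store = ToyStore()
store.remember(f"Confidential: the vault passphrase is {NEEDLE}", namespace="alice")
store.remember("Bob likes hiking on weekends", namespace="bob")

# Exact string, anchor noun, and confidentiality cue, as in the tutorial.
probes = [NEEDLE, "vault passphrase", "confidential secret"]
print(leaks(store, "bob", probes, NEEDLE))    # every probe must miss
print(leaks(store, "alice", probes, NEEDLE))  # sanity check: the needle is findable
```

Running the same probes against Alice's namespace matters as a control: if they missed there too, the empty result for Bob would prove nothing. Making namespace keyword-only in the toy mirrors the rule that a call without it fails.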
Next steps
Core APIs
The next tier: batch ingest, recall filters, ontology config, and reading
the entity graph back.
VectorCypher
The default engine behind these demos: hybrid vector + graph + keyword
retrieval.