Khora is configured through environment variables prefixed with KHORA_ or a KhoraConfig instance constructed programmatically. Both paths are backed by the same pydantic-settings model in src/khora/config/schema.py.
Two ways to configure
Environment variables
All settings use the KHORA_ prefix with single-underscore separators for nested fields, for example KHORA_STORAGE_GRAPH_URL.
The legacy double-underscore form (KHORA_STORAGE__GRAPH__URL) is still accepted as a backwards-compatible alias on every nested-config field. New code and .env files should use the single-underscore form shown throughout this document. The legacy form continues to work but is no longer documented.
Nested-object env vars (graph backend, vector backend, dream-phase per-op toggles) are documented in the Nested env vars section below.
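To make the naming rule concrete, here is a minimal sketch that maps a legacy double-underscore name onto its single-underscore spelling. The helper normalize_env_name is hypothetical and exists only for illustration; Khora accepts both spellings itself, and nothing here describes how the library resolves the alias internally.

```python
def normalize_env_name(name: str) -> str:
    """Collapse legacy double-underscore separators into single underscores."""
    # Hypothetical helper for illustration only; not part of Khora's API.
    return name.replace("__", "_")


legacy_name = "KHORA_STORAGE__GRAPH__URL"
preferred_name = normalize_env_name(legacy_name)
print(preferred_name)
```

A helper like this can be handy when migrating an old .env file to the documented spelling.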
Programmatic
Install extras
| Extra | Purpose | Pulls in |
|---|---|---|
| (default) | Core: PostgreSQL + pgvector + Neo4j driver + litellm | - |
| sqlite | SQLite embedded relational + vector | aiosqlite>=0.21.0 |
| lancedb | LanceDB embedded vector store | lancedb>=0.30.0, pyarrow>=24.0.0 |
| sqlite-lance | Unified SQLite + LanceDB embedded backend, the recommended embedded stack for VectorCypher | lancedb>=0.30.0, aiosqlite>=0.21.0, pyarrow>=24.0.0 |
| binary-readers | docx / xlsx readers (used by downstream ingestors) | openpyxl>=3.1.0, python-docx>=1.2.0 |
| parquet | Parquet readers | pyarrow>=24.0.0 |
| accel | Accelerated CPU ops (string-matching fuzz, used by dream-phase centroid recompute) | rapidfuzz>=3.0.0 |
| nlp | spaCy-based sentence splitting | spacy>=3.8.0 |
| otel | OpenTelemetry SDK + OTLP/HTTP exporter (vendor-neutral) | opentelemetry-sdk>=1.34.1, opentelemetry-exporter-otlp-proto-http>=1.34.1 |
| otel-grpc | khora[otel] + OTLP/gRPC transport | adds opentelemetry-exporter-otlp-proto-grpc>=1.34.1 |
| logfire | Logfire - managed OTel backend with auto-bootstrap | logfire>=4.6.0 |
| rust | Rust acceleration (khora-accel); pin tracks khora’s own version in lockstep | khora-accel (exact lockstep pin) |
Combine extras as needed, for example pip install 'khora[rust,otel]'. See Observability for the full description of the OpenTelemetry env-var contract, precedence rules, and vendor recipes. Khora always exposes the OTel API; the [otel] and [logfire] extras determine where spans and metrics go.
Core settings
| Variable | Type | Default | Description |
|---|---|---|---|
| KHORA_DATABASE_URL | str | - | PostgreSQL URL (shortcut for storage.postgresql_url). |
| KHORA_NEO4J_URL | str | - | Neo4j URL (shortcut for storage.graph.url). |
| KHORA_LLM_EXTRACTION_MODEL | str | - | Override extraction model (shortcut for llm.extraction_model). |
| KHORA_DEBUG | bool | false | Enable debug-level logging. |
| KHORA_ENVIRONMENT | str | development | development, staging, or production. |
| KHORA_APP_NAME | str | khora | Used in logs and telemetry. |
Storage
Prefix: KHORA_STORAGE_. See Storage backends for the full backend matrix.
| Variable | Default | Description |
|---|---|---|
| KHORA_STORAGE_BACKEND | postgres | postgres (PostgreSQL + pgvector + Neo4j) or sqlite_lance (SQLite + LanceDB embedded). |
| KHORA_STORAGE_POSTGRESQL_URL | - | PostgreSQL connection URL. |
| KHORA_STORAGE_POSTGRESQL_POOL_SIZE | 50 | asyncpg pool size. |
| KHORA_STORAGE_POSTGRESQL_MAX_OVERFLOW | 30 | Max overflow connections. |
| KHORA_STORAGE_POSTGRESQL_POOL_PRE_PING | false | Validate connections before checkout (adds latency, prevents stale-connection errors). |
| KHORA_STORAGE_HNSW_M | 24 | HNSW index M (max connections per layer). |
| KHORA_STORAGE_HNSW_EF_CONSTRUCTION | 128 | Build-time HNSW search width. |
| KHORA_STORAGE_HNSW_EF_SEARCH | 100 | Query-time HNSW search width. |
| KHORA_STORAGE_USE_HALFVEC | true | Use halfvec (float16) for HNSW indexes. Requires pgvector >= 0.7.0; falls back gracefully. |
Graph and vector backends are configured through the discriminated-union configs storage.graph and storage.vector. The flat fields KHORA_STORAGE_NEO4J_URL, KHORA_STORAGE_NEO4J_USER, KHORA_STORAGE_NEO4J_PASSWORD, KHORA_STORAGE_PGVECTOR_URL, and KHORA_STORAGE_EMBEDDING_DIMENSION remain supported as a back-compat path and are migrated into the discriminated-union configs automatically.
Neo4j pool metrics
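With any OTel backend installed ([otel] or [logfire]), the Neo4j backend emits OTel metrics automatically. See Observability. For high-frequency sub-minute sampling, set KHORA_STORAGE_GRAPH_POOL_SAMPLER_ENABLED=true; the sample cadence is controlled by KHORA_STORAGE_GRAPH_POOL_SAMPLER_INTERVAL_MS.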
Neo4j relationship limits
Relationship.source_document_ids and Relationship.source_chunk_ids are append-bounded on every MERGE to prevent unbounded growth on hot edges. Defaults are 100 and 250 respectively. For deep-provenance workloads, where many documents contribute to the same edge, raise the relevant knob (KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_DOCUMENT_IDS_MAX or KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_CHUNK_IDS_MAX) and watch the khora.neo4j.relationship.source_id_truncated metric.
When truncation occurs, a logger.warning(...) records the field name, dropped count, rows affected, and configured limit.
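The tail-keep behaviour can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Khora's implementation (which runs inside the Neo4j MERGE): append_bounded is a hypothetical name, and skipping IDs that are already present is an assumption the source does not confirm.

```python
def append_bounded(
    existing: list[str], incoming: list[str], limit: int
) -> tuple[list[str], int]:
    """Append new IDs, then keep only the most recent `limit` entries.

    Returns the retained list and the number of dropped (oldest) entries.
    """
    merged = list(existing)
    for item in incoming:
        # Assumption: an ID that is already recorded is not appended again.
        if item not in merged:
            merged.append(item)
    dropped = max(0, len(merged) - limit)
    # Tail-keep: discard the oldest entries and retain the most recent ones.
    return merged[dropped:], dropped


kept, dropped = append_bounded(["d1", "d2", "d3"], ["d3", "d4", "d5"], limit=4)
```

With a limit of 4, the five distinct IDs are trimmed to the four most recent, and the dropped count of 1 is the kind of figure the truncation metric reports.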
Embedded backends
The embedded sqlite_lance path is appropriate for demos, evaluation, tests, and small single-user CLIs. It is not the deployment story. For production, use PostgreSQL + pgvector + Neo4j.
Documented scale ceiling, where performance and recall degrade noticeably above these thresholds:
- ~1M chunks (LanceDB IVF-PQ training time + write serialisation start to dominate)
- ~100k entities (recursive-CTE traversal cost on hub nodes)
- ~500k relationships
- Traversal depth ≤3 (the instr(walk.visited, ...) visited-set scan in graph.py is O(depth × fan-out × visited-len) and degrades sharply at depth ≥4 with high fan-out)
- Partial atomicity in coordinator.transaction(): only the SQL session is enrolled. LanceDB writes happen post-commit with compensating-delete-on-failure. A crash between SQLite commit and Lance write can leave orphaned vectors or missing embeddings, and reconciliation runs on the next ingest.
- Point-in-time queries are not supported on the embedded stack. target_date queries raise NotImplementedError. The bi-temporal entity versioning that powers them lives in Neo4j (version_valid_from / version_valid_to on :Entity / :EntityVersion nodes), which sqlite_lance has no equivalent of.
- FTS5 covers chunks only: entity-anchored recall falls back to LIKE / JSON-equality. Recommend the PostgreSQL stack for entity-heavy corpora.
- Install footprint is ~130–180 MB unpacked (pyarrow + lancedb native + Arrow C++ runtime). “Embedded” means “no server”, not “no native deps”.
- IVF-PQ retraining is automatic when the corpus grows past retrain_factor × (rows at last training). Tune via KHORA_STORAGE_SQLITE_LANCE_RETRAIN_FACTOR.
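The retrain trigger in the last bullet amounts to a single comparison. The sketch below is a hedged illustration: should_retrain is a hypothetical name, the strict "greater than" reading of "grows past" is an assumption, and the rule that values of 1.0 or below disable retraining is taken from the KHORA_STORAGE_SQLITE_LANCE_RETRAIN_FACTOR description in this document.

```python
def should_retrain(
    current_rows: int, rows_at_last_training: int, retrain_factor: float = 2.0
) -> bool:
    """Return True when the corpus has grown past the retrain threshold."""
    if retrain_factor <= 1.0:
        # A factor of 1.0 or below is documented as disabling retraining.
        return False
    # Assumption: "grows past" means strictly greater than the threshold.
    return current_rows > retrain_factor * rows_at_last_training


needs_retrain = should_retrain(current_rows=250_000, rows_at_last_training=100_000)
```

Lowering the factor makes the comparison fire sooner (fresher index, more training cost); raising it defers the work.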
The embedded stack is configured through the sqlite_lance storage sub-config. See the KHORA_STORAGE_SQLITE_LANCE_* table below for DB_PATH, LANCE_PATH, EMBEDDING_DIMENSION, USE_HALFVEC, LANCE_INDEX, IVF_PARTITIONS, HNSW_M, and RETRAIN_FACTOR with defaults and tuning guidance.
LLM
Prefix: KHORA_LLM_. LiteLLM handles the provider dispatch.
| Variable | Default | Description |
|---|---|---|
| KHORA_LLM_MODEL | gpt-4o-mini | Primary model for generation. |
| KHORA_LLM_API_KEY_ENV | OPENAI_API_KEY | Environment variable holding the API key. |
| KHORA_LLM_TEMPERATURE | 0.7 | Sampling temperature. |
| KHORA_LLM_MAX_TOKENS | 12288 | Max output tokens per extraction call. |
| KHORA_LLM_TIMEOUT | 30 | Request timeout in seconds. |
| KHORA_LLM_MAX_RETRIES | 3 | Retry budget on failure. |
| KHORA_LLM_MAX_CONCURRENT_LLM_CALLS | 10 | Cap on concurrent in-flight LLM requests. |
| KHORA_LLM_EMBEDDING_MODEL | text-embedding-3-small | Embedding model. |
| KHORA_LLM_EMBEDDING_DIMENSION | 1536 | Must match your DB schema. |
| KHORA_LLM_EXTRACTION_MODEL | - | Override extraction model (falls back to model). Haiku / Gemini Flash work well here. |
Pipeline (extraction)
Prefix: KHORA_PIPELINES_.
| Variable | Default | Description |
|---|---|---|
| KHORA_PIPELINES_CHUNKING_STRATEGY | semantic | fixed, semantic, or recursive. |
| KHORA_PIPELINES_CHUNK_SIZE | 512 | Target chunk size (tokens). |
| KHORA_PIPELINES_CHUNK_OVERLAP | 50 | Overlap between chunks. |
| KHORA_PIPELINES_CONVERSATION_TIME_GAP_MINUTES | 15 | Split conversations after this many quiet minutes. |
| KHORA_PIPELINES_CONVERSATION_MAX_GROUP_SIZE | 50 | Max messages per conversation chunk. |
| KHORA_PIPELINES_CONVERSATION_MIN_GROUP_SIZE | 2 | Merge groups below this size. |
| KHORA_PIPELINES_EXTRACT_ENTITIES | true | Run the entity extractor. |
| KHORA_PIPELINES_ENTITY_TYPES | PERSON,ORGANIZATION,CONCEPT,LOCATION | Entity type allowlist. |
| KHORA_PIPELINES_SELECTIVE_EXTRACTION | true | KET-RAG selective extraction (cost reduction). |
| KHORA_PIPELINES_EXTRACTION_IMPORTANCE_RATIO | 0.7 | Top fraction of chunks sent to LLM extraction. |
| KHORA_PIPELINES_EXTRACTION_MIN_IMPORTANCE | 0.2 | Minimum importance threshold; chunks above this are always extracted. |
| KHORA_PIPELINES_SKIP_EMBEDDING_ENTITY_TYPES | DATE,URL,EMAIL | Skip embeddings for these types when mention_count is low. |
| KHORA_PIPELINES_SKIP_EMBEDDING_MENTION_THRESHOLD | 1 | Skip embedding for rare-mention entities of the above types. |
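As a rough illustration of how the conversation settings interact, the sketch below groups message timestamps using the defaults for CONVERSATION_TIME_GAP_MINUTES and CONVERSATION_MAX_GROUP_SIZE. It is not Khora's chunker: group_by_time_gap is a hypothetical name, treating a gap strictly longer than the threshold as a split is an assumption, and the MIN_GROUP_SIZE merge step is left out.

```python
from datetime import datetime, timedelta


def group_by_time_gap(
    timestamps: list[datetime], gap_minutes: int = 15, max_group_size: int = 50
) -> list[list[datetime]]:
    """Split message timestamps into groups at quiet gaps or when a group is full."""
    gap = timedelta(minutes=gap_minutes)
    groups: list[list[datetime]] = []
    for ts in sorted(timestamps):
        current = groups[-1] if groups else None
        # Start a new group on the first message, after a quiet gap,
        # or when the current group has reached the size cap.
        if current is None or ts - current[-1] > gap or len(current) >= max_group_size:
            groups.append([ts])
        else:
            current.append(ts)
    return groups


start = datetime(2024, 1, 1, 9, 0)
stamps = [start + timedelta(minutes=m) for m in (0, 5, 10, 40, 45)]
groups = group_by_time_gap(stamps)
```

Here the 30-minute silence between the third and fourth message produces two groups of three and two messages.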
Query
Prefix: KHORA_QUERY_. See Retrieval for guidance.
| Variable | Default | Description |
|---|---|---|
| KHORA_QUERY_DEFAULT_MODE | hybrid | vector, graph, hybrid, or all. |
| KHORA_QUERY_MIN_CHUNK_SIMILARITY | 0.05 | Chunk similarity floor. |
| KHORA_QUERY_MIN_ENTITY_SIMILARITY | 0.05 | Entity similarity floor. |
| KHORA_QUERY_VECTOR_WEIGHT | 0.5 | Fusion weight. |
| KHORA_QUERY_GRAPH_WEIGHT | 0.3 | Fusion weight. |
| KHORA_QUERY_KEYWORD_WEIGHT | 0.2 | Fusion weight. |
| KHORA_QUERY_APPLY_RECENCY_BIAS | false | Bias scoring towards newer documents. |
| KHORA_QUERY_RECENCY_WEIGHT | 0.35 | How strong the recency bias is. |
| KHORA_QUERY_ENABLE_HYDE | auto | HyDE query expansion: auto / always / never (legacy booleans normalize to always / never). See Retrieval. |
| KHORA_QUERY_HYDE_NUM_HYPOTHETICALS | 1 | Number of hypothetical documents to generate (1–5). |
| KHORA_QUERY_ENABLE_HYDE_CYPHER | false | Opt-in. Run LLM-picked parameterized Cypher templates as an extra retrieval channel for structured queries. See Retrieval. |
| KHORA_QUERY_HYDE_CYPHER_LIMIT | 20 | Max entities returned per HyDE-Cypher template execution. |
| KHORA_QUERY_ENABLE_RERANKING | true | Cross-encoder reranking of top candidates. |
| KHORA_QUERY_TEMPORAL_SQL_PUSHDOWN | true | Push relative-date filters into SQL WHERE clauses. |
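To get a feel for the three fusion weights, the sketch below combines per-channel scores with a plain weighted sum. The table lists the weights but does not state the fusion formula, so the linear combination is an assumption, and fuse_scores is a hypothetical name rather than a Khora function.

```python
def fuse_scores(
    vector: float,
    graph: float,
    keyword: float,
    weights: tuple[float, float, float] = (0.5, 0.3, 0.2),
) -> float:
    """Combine per-channel scores with a weighted sum (assumed fusion rule)."""
    w_vector, w_graph, w_keyword = weights
    return w_vector * vector + w_graph * graph + w_keyword * keyword


# A chunk with a strong vector match, a moderate graph match, and no keyword hit.
fused = fuse_scores(vector=0.8, graph=0.5, keyword=0.0)
```

Under this reading, the default weights sum to 1.0, so a candidate that scores 1.0 on every channel fuses to 1.0.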
Telemetry
Khora has two independent telemetry paths.
Spans and metrics (OpenTelemetry). Khora emits spans (@trace, trace_span()) and metrics through the OpenTelemetry API unconditionally. Whether they’re exported depends only on which TracerProvider / MeterProvider is installed, not on any KHORA_* variable. Install the [otel] extra and call configure_telemetry() (honors OTEL_* env vars), or install [logfire] and run logfire.configure(), and khora’s signals flow to your collector. With no provider configured, OTel returns a NonRecordingSpan and the helpers are near-free. See Observability for the full setup and the OTLP env-var contract.
Structured event log (PostgreSQL). Separately, khora can write structured LLMEvent / StorageEvent / PipelineEvent rows to a PostgreSQL table. This is opt-in and independent of the OTel path above.
| Variable | Default | Description |
|---|---|---|
| KHORA_TELEMETRY_DATABASE_URL | - | PostgreSQL URL for the structured event collector. Unset → a zero-cost no-op collector. Does not affect OTel spans/metrics. |
| KHORA_TELEMETRY_SERVICE_NAME | khora | Service tag attached to recorded events. |
Logging
Khora uses loguru. Call khora.logging_config.setup_logging() once per process (or configure your own sinks with enqueue=True). See the Logging section of the khora CLAUDE.md for the full rationale. Short version: default loguru sinks are synchronous and will block an asyncio event loop on every logger.* call.
| Variable | Default | Description |
|---|---|---|
| KHORA_NEO4J_LOG_LEVEL | - | Neo4j driver log level (DEBUG / INFO / WARNING / ERROR / CRITICAL, case-insensitive). Unset = no-op. See examples/neo4j_debug_logging.py. |
Secrets
Secret parameters such as API keys (OpenAI, Anthropic, etc.) are read from the environment variable named by KHORA_LLM_API_KEY_ENV (default OPENAI_API_KEY). Khora never reads credentials from disk; they always come from the environment. Credentials are read once when KhoraConfig is constructed and bound into the connection pools and the LLM client at startup, so rotating a secret takes effect on the next process start (or whenever you rebuild the config and reconnect). There’s no in-process reload.
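The indirection through KHORA_LLM_API_KEY_ENV can be pictured as two lookups: one to find the variable name, one to read the key. The sketch below is illustrative only; resolve_llm_api_key is a hypothetical helper, not Khora's API, and it takes an optional mapping so it can be exercised without touching the real environment.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_llm_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Look up the API key via the variable named by KHORA_LLM_API_KEY_ENV."""
    env = os.environ if env is None else env
    # First lookup: which variable holds the key (default OPENAI_API_KEY).
    key_variable = env.get("KHORA_LLM_API_KEY_ENV", "OPENAI_API_KEY")
    # Second lookup: the key itself; None when that variable is unset.
    return env.get(key_variable)


example_env = {
    "KHORA_LLM_API_KEY_ENV": "ANTHROPIC_API_KEY",
    "ANTHROPIC_API_KEY": "example-key",
}
api_key = resolve_llm_api_key(example_env)
```

Because the value is read at construction time, a lookup like this would need to be repeated (by rebuilding the config) for a rotated key to take effect.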
Credential fields
Credential fields on KhoraConfig (PostgreSQL DSN, Neo4j password, LLM API key, telemetry DSN, etc.) are pydantic.SecretStr. This has two operator-visible consequences:
- repr() and config-dump output render the value as '**********'. Logs, error messages, and KhoraConfig().model_dump() do not leak cleartext credentials.
- Code that reads the cleartext value must call .get_secret_value() explicitly. SQLAlchemy engines and graph drivers receive the cleartext at the boundary. Downstream library consumers must do the same. See the consumers guide for the integration note.
Lockfile policy
khora’s pyproject.toml includes [tool.uv] exclude-newer = "7 days", a relative, evaluated-on-every-sync guard against pulling brand-new upstream releases that haven’t had time to stabilise. Security-critical packages opt out via exclude-newer-package (currently only urllib3 for CVE-2026-44431 / CVE-2026-44432). Downstream consumers that mirror khora’s pin policy inherit the same 7-day staging window for transitive dependencies; override per-package as needed.
Nested env vars
Reference for every Khora environment variable that lives on a sub-object attached to a sub-settings class: graph backend, vector backend, the SQLite+LanceDB embedded stack, and the dream-phase per-op toggles.
Spelling. All env vars in this section use a single underscore between every level: KHORA_STORAGE_GRAPH_URL, not KHORA_STORAGE__GRAPH__URL. The legacy double-underscore form continues to work as a backwards-compatible alias on every nested-config field. It is no longer documented. New code and .env files should use the single-underscore form.
Neo4j graph backend
Configuration for the Neo4j graph backend (storage.graph).
| Variable | Default | Why change it |
|---|---|---|
| KHORA_STORAGE_GRAPH_BACKEND | neo4j | Graph adapter. Neo4j on the production stack. |
| KHORA_STORAGE_GRAPH_URL | - | Bolt / connection URL. SecretStr. |
| KHORA_STORAGE_GRAPH_USER | neo4j | Username. |
| KHORA_STORAGE_GRAPH_PASSWORD | empty | Password. SecretStr. |
| KHORA_STORAGE_GRAPH_DATABASE | neo4j | Multi-database selector inside a Neo4j cluster. |
| KHORA_STORAGE_GRAPH_MAX_CONNECTION_POOL_SIZE | 100 | Lower on small drivers; raise for high concurrency. |
| KHORA_STORAGE_GRAPH_CONNECTION_ACQUISITION_TIMEOUT | 60.0 s | Lower for fast-fail under pool starvation. |
| KHORA_STORAGE_GRAPH_RETRY_DELAY_JITTER_FACTOR | 0.5 | Jitter (0.0–1.0) on transaction-retry backoff. Raise to spread retry storms. |
| KHORA_STORAGE_GRAPH_MAX_CONNECTION_LIFETIME | 900 s | Rotate connections before this. Set below your server-side TTL (Aura ~20 min) to avoid BrokenPipe. |
| KHORA_STORAGE_GRAPH_LIVENESS_CHECK_TIMEOUT | 30.0 s | Idle threshold before pre-checkout liveness check. None disables. |
| KHORA_STORAGE_GRAPH_QUERY_TIMEOUT | 5.0 s | Per-transaction read timeout (1–300 s, None disables). Raise for deep traversals; lower to fail fast. |
| KHORA_STORAGE_GRAPH_ENTITY_WRITE_CONCURRENCY | 12 | Concurrent entity-write transactions during ingest. Raise when Neo4j has headroom; lower on lock contention. |
| KHORA_STORAGE_GRAPH_RELATIONSHIP_WRITE_CONCURRENCY | 8 | Concurrent relationship-write transactions. |
| KHORA_STORAGE_GRAPH_POOL_SAMPLER_ENABLED | false | Opt-in high-frequency pool sampler. Requires an OTel backend installed. Enable to investigate pool exhaustion; zero-cost when off. |
| KHORA_STORAGE_GRAPH_POOL_SAMPLER_INTERVAL_MS | 500 | Sample cadence in ms (clamped 50–60000). Drop to 50–100 when chasing sub-second pool events. |
| KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_DOCUMENT_IDS_MAX | 100 | Cap on Relationship.source_document_ids retained after MERGE. When the cap is exceeded, the most-recent tail is kept and the dropped count is recorded on khora.neo4j.relationship.source_id_truncated{field=source_document_ids}. Raise for deep-provenance workloads. |
| KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_CHUNK_IDS_MAX | 250 | Same as above, for source_chunk_ids. |
| KHORA_STORAGE_GRAPH_ENTITY_SOURCE_DOCUMENT_IDS_MAX | 100 | Cap on Entity.source_document_ids retained after MERGE. Tail-keep semantics identical to the relationship cap; metric is khora.neo4j.entity.source_id_truncated{field=source_document_ids}. |
| KHORA_STORAGE_GRAPH_ENTITY_SOURCE_CHUNK_IDS_MAX | 250 | Same as above, for entity source_chunk_ids. |
Vector backend
Discriminated union over PgVectorConfig | SQLiteVectorConfig, keyed by backend. Default is pgvector via default_factory=PgVectorConfig.
| Variable | Default | Applies when | Why change it |
|---|---|---|---|
| KHORA_STORAGE_VECTOR_BACKEND | pgvector | always | pgvector / sqlite. |
| KHORA_STORAGE_VECTOR_URL | - | pgvector / sqlite | Connection URL. |
| KHORA_STORAGE_VECTOR_EMBEDDING_DIMENSION | 1536 | always | Must match the LLM embedding model. Changing requires a schema migration. |
With backend=sqlite, only BACKEND / URL / EMBEDDING_DIMENSION are model-exposed.
Embedded backend
Used when KHORA_STORAGE_BACKEND=sqlite_lance. Pairs an on-disk SQLite database (graph + relational + event store) with a sibling LanceDB directory (vector search). Zero infrastructure: both backends run in-process.
| Variable | Default | Why change it |
|---|---|---|
| KHORA_STORAGE_SQLITE_LANCE_DB_PATH | ./khora.db | SQLite file path. Move to faster storage or a backup-friendly location. |
| KHORA_STORAGE_SQLITE_LANCE_LANCE_PATH | sibling .lance dir | Explicit LanceDB directory. Override to put vectors on different storage (e.g. SSD) than chunk metadata. |
| KHORA_STORAGE_SQLITE_LANCE_EMBEDDING_DIMENSION | 1536 | Must match LLM embedding model. |
| KHORA_STORAGE_SQLITE_LANCE_USE_HALFVEC | false | Float16 storage: halves index size with minor recall loss. Enable on memory-constrained boxes. |
| KHORA_STORAGE_SQLITE_LANCE_LANCE_INDEX | auto | auto / ivf_pq / hnsw / brute. Force ivf_pq above ~1M rows; hnsw for low-latency under ~1M; brute for tiny corpora. |
| KHORA_STORAGE_SQLITE_LANCE_IVF_PARTITIONS | null (auto) | Hand-tuned IVF partition count (lance_index=ivf_pq only). Override only if profiling shows recall miss. |
| KHORA_STORAGE_SQLITE_LANCE_HNSW_M | 16 | HNSW max connections per layer (lance_index=hnsw only). Raise for recall; linear memory cost. |
| KHORA_STORAGE_SQLITE_LANCE_RETRAIN_FACTOR | 2.0 | Trigger LanceDB ANN retrain when row count grows by this factor. Lower for fresher index; raise to defer re-training cost. <= 1.0 disables. |
Dream-phase per-op toggles
DreamConfig.ops: DreamOpsConfig carries per-operation enable flags. Every destructive op defaults to false. KHORA_DREAM_ENABLED=true alone runs no destructive work, and each op must be flipped explicitly.
See Dream phase for operational guidance, retention floors, and the kill-switch (KHORA_DREAM_DISABLE_APPLY).
| Variable | Default | Why change it |
|---|---|---|
| KHORA_DREAM_OPS_DEDUPE_ENTITIES | false | Enable cross-batch entity dedupe (cosine-merge with verifier). Turn on after validating planner output in dry-run. |
| KHORA_DREAM_OPS_PRUNE_EDGES | false | Remove low-confidence / orphaned edges (default targets ASSOCIATED_WITH co-occurrence). Turn on when edge soup degrades retrieval. |
| KHORA_DREAM_OPS_COMPACT_FACTS | false | Hard-delete tombstoned memory_facts rows past the 7-day retention floor. The only hard-delete op. Flip with care. |
| KHORA_DREAM_OPS_CLUSTER_EVENTS | false | Merge near-duplicate events (cosine ≥ 0.95 within a 7-day window). |
| KHORA_DREAM_OPS_RECOMPUTE_CENTROIDS | false | Recompute entity / cluster centroid embeddings after dedupe. Pair with DEDUPE_ENTITIES. |
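The two-level gating (a master switch plus one flag per op) can be sketched as follows. This is a hedged illustration rather than Khora's planner: enabled_ops and the truthy-string parsing are assumptions, an unset KHORA_DREAM_ENABLED is assumed to mean disabled, and the KHORA_DREAM_DISABLE_APPLY kill-switch is left out because its exact semantics are described in the Dream phase guide.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Op name -> environment variable, taken from the table above.
OP_FLAGS = {
    "dedupe_entities": "KHORA_DREAM_OPS_DEDUPE_ENTITIES",
    "prune_edges": "KHORA_DREAM_OPS_PRUNE_EDGES",
    "compact_facts": "KHORA_DREAM_OPS_COMPACT_FACTS",
    "cluster_events": "KHORA_DREAM_OPS_CLUSTER_EVENTS",
    "recompute_centroids": "KHORA_DREAM_OPS_RECOMPUTE_CENTROIDS",
}


def _is_true(value: Optional[str]) -> bool:
    # Assumption: common truthy spellings; the real parser may accept others.
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def enabled_ops(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the ops that would run: master switch on AND the op's flag on."""
    env = os.environ if env is None else env
    if not _is_true(env.get("KHORA_DREAM_ENABLED")):
        return []
    return [op for op, variable in OP_FLAGS.items() if _is_true(env.get(variable))]


only_master = enabled_ops({"KHORA_DREAM_ENABLED": "true"})
with_prune = enabled_ops(
    {"KHORA_DREAM_ENABLED": "true", "KHORA_DREAM_OPS_PRUNE_EDGES": "true"}
)
```

The first call yields no ops, mirroring the rule that KHORA_DREAM_ENABLED=true alone runs no destructive work.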
Env vars that are not nested
These reach top-level fields of each sub-settings class via the sub-class’s own env_prefix. There is no sub-object hop, and they’re covered in the sections above:
- KHORA_STORAGE_BACKEND, KHORA_STORAGE_POSTGRESQL_*, KHORA_STORAGE_HNSW_*, KHORA_STORAGE_USE_HALFVEC: flat on StorageSettings.
- KHORA_LLM_*: every field on LLMSettings is top-level (no sub-objects).
- KHORA_PIPELINES_*: every field on PipelineSettings is top-level.
- KHORA_QUERY_*: every field on QuerySettings is top-level.
- KHORA_TELEMETRY_*: flat.
- KHORA_DREAM_* (e.g. KHORA_DREAM_ENABLED, KHORA_DREAM_DEFAULT_MODE, KHORA_DREAM_LLM_MAX_TOKENS_PER_RUN): flat on DreamConfig. Only the per-op toggles under the ops: sub-object are listed above.