Full Configuration

Khora is configured through environment variables prefixed KHORA_ or a KhoraConfig instance constructed programmatically. Both paths are backed by the same pydantic-settings model in src/khora/config/schema.py.

Two ways to configure

Environment variables

All settings use the KHORA_ prefix with single-underscore separators for nested fields. Examples:

KHORA_DATABASE_URL=postgresql://khora:khora@localhost:5434/khora
KHORA_NEO4J_URL=bolt://neo4j:pleaseletmein@localhost:7688
KHORA_LLM_MODEL=gpt-4o
KHORA_QUERY_ENABLE_HYDE=auto
KHORA_QUERY_DEFAULT_MODE=hybrid

Legacy double-underscore nesting (KHORA_STORAGE__GRAPH__URL) is still accepted as a backwards-compatible alias on every nested-config field, but it is no longer documented. New code and .env files should use the single-underscore form shown throughout this document. Nested-object env vars (graph backend, vector backend, dream-phase per-op toggles) are documented in the Nested env vars section below.
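
When migrating an older .env file, the legacy names can be rewritten mechanically. The following is a minimal sketch, not part of Khora: normalize_khora_env_name and normalize_env are hypothetical helpers, and the sketch assumes the only difference between the two spellings is the doubled underscore.

```python
import re

PREFIX = "KHORA_"


def normalize_khora_env_name(name: str) -> str:
    """Collapse legacy double-underscore separators into single underscores.

    Hypothetical helper for migrating .env files; not a Khora API.
    """
    if not name.startswith(PREFIX):
        return name  # leave unrelated variables untouched
    return re.sub(r"_{2,}", "_", name)


def normalize_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with KHORA_* keys in the single-underscore form.

    If both spellings of a key are present, the one seen last wins.
    """
    return {normalize_khora_env_name(key): value for key, value in env.items()}
```

For example, normalize_khora_env_name("KHORA_STORAGE__GRAPH__URL") yields "KHORA_STORAGE_GRAPH_URL", while variables without the KHORA_ prefix pass through unchanged.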

Programmatic

import asyncio

from khora import Khora, KhoraConfig
from khora.config.schema import LLMSettings

config = KhoraConfig(
    database_url="postgresql://khora:khora@localhost:5434/khora",
    neo4j_url="bolt://neo4j:pleaseletmein@localhost:7688",
    llm=LLMSettings(model="gpt-4o", embedding_model="text-embedding-3-small"),
)


async def main() -> None:
    # `async with` is only valid inside a coroutine, so wrap it in main().
    async with Khora(config) as kb:
        ...


asyncio.run(main())

Programmatic values take priority over environment variables.

Install extras

Extra	Purpose	Pulls in
(default)	Core: PostgreSQL + pgvector + Neo4j driver + litellm	-
`sqlite`	SQLite embedded relational + vector	`aiosqlite>=0.21.0`
`lancedb`	LanceDB embedded vector store	`lancedb>=0.30.0`, `pyarrow>=24.0.0`
`sqlite-lance`	Unified SQLite + LanceDB embedded backend, the recommended embedded stack for VectorCypher	`lancedb>=0.30.0`, `aiosqlite>=0.21.0`, `pyarrow>=24.0.0`
`binary-readers`	docx / xlsx readers (used by downstream ingestors)	`openpyxl>=3.1.0`, `python-docx>=1.2.0`
`parquet`	Parquet readers	`pyarrow>=24.0.0`
`accel`	Accelerated CPU ops (string-matching fuzz, used by dream-phase centroid recompute)	`rapidfuzz>=3.0.0`
`nlp`	spaCy-based sentence splitting	`spacy>=3.8.0`
`otel`	OpenTelemetry SDK + OTLP/HTTP exporter (vendor-neutral)	`opentelemetry-sdk>=1.34.1`, `opentelemetry-exporter-otlp-proto-http>=1.34.1`
`otel-grpc`	`khora[otel]` + OTLP/gRPC transport	adds `opentelemetry-exporter-otlp-proto-grpc>=1.34.1`
`logfire`	Logfire - managed OTel backend with auto-bootstrap	`logfire>=4.6.0`
`rust`	Rust acceleration (`khora-accel`); pin tracks khora’s own version in lockstep	`khora-accel` (exact lockstep pin)

Combine extras as needed: pip install 'khora[rust,otel]'. See Observability for the full OpenTelemetry env-var contract, precedence rules, and vendor recipes. Khora always exposes the OTel API; the [otel] and [logfire] extras determine where spans and metrics go.

Core settings

Variable	Type	Default	Description
`KHORA_DATABASE_URL`	str	-	PostgreSQL URL (shortcut for `storage.postgresql_url`).
`KHORA_NEO4J_URL`	str	-	Neo4j URL (shortcut for `storage.graph.url`).
`KHORA_LLM_EXTRACTION_MODEL`	str	-	Override extraction model (shortcut for `llm.extraction_model`).
`KHORA_DEBUG`	bool	`false`	Enable debug-level logging.
`KHORA_ENVIRONMENT`	str	`development`	`development`, `staging`, or `production`.
`KHORA_APP_NAME`	str	`khora`	Used in logs and telemetry.

Storage

Prefix: KHORA_STORAGE_. See Storage backends for the full backend matrix.

Variable	Default	Description
`KHORA_STORAGE_BACKEND`	`postgres`	`postgres` (PostgreSQL + pgvector + Neo4j) or `sqlite_lance` (SQLite + LanceDB embedded).
`KHORA_STORAGE_POSTGRESQL_URL`	-	PostgreSQL connection URL.
`KHORA_STORAGE_POSTGRESQL_POOL_SIZE`	`50`	asyncpg pool size.
`KHORA_STORAGE_POSTGRESQL_MAX_OVERFLOW`	`30`	Max overflow connections.
`KHORA_STORAGE_POSTGRESQL_POOL_PRE_PING`	`false`	Validate connections before checkout (adds latency, prevents stale-connection errors).
`KHORA_STORAGE_HNSW_M`	`24`	HNSW index `M` (max connections per layer).
`KHORA_STORAGE_HNSW_EF_CONSTRUCTION`	`128`	Build-time HNSW search width.
`KHORA_STORAGE_HNSW_EF_SEARCH`	`100`	Query-time HNSW search width.
`KHORA_STORAGE_USE_HALFVEC`	`true`	Use `halfvec` (float16) for HNSW indexes. Requires pgvector >= 0.7.0; falls back gracefully.
`KHORA_STORAGE_POSTGRESQL_UPSERT_COMMIT_INTERVAL`	`0`	Commit granularity for batch upserts. `0` commits the whole batch in one transaction (fastest for single-writer ingest); `1` commits per sub-batch so the namespace lock releases for concurrent same-namespace writers.
`KHORA_STORAGE_HNSW_PARTIAL_ENABLED`	`false`	Opt-in operator-driven promotion of hot namespaces to per-namespace partial HNSW indexes. The promotion helpers no-op unless this is on; never auto-creates on writes.
`KHORA_STORAGE_HNSW_PARTIAL_MIN_ROWS`	`50000`	Chunk-count a namespace must cross before it’s eligible for a per-namespace partial index.
`KHORA_STORAGE_HNSW_PARTIAL_MAX_INDEXES`	`64`	Ceiling on partial indexes per table (caps catalog bloat / planner overhead). Promotion beyond this is refused rather than raising an error.

Graph and vector backends nest under storage.graph and storage.vector. The flat fields KHORA_STORAGE_NEO4J_URL, KHORA_STORAGE_NEO4J_USER, KHORA_STORAGE_NEO4J_PASSWORD, KHORA_STORAGE_PGVECTOR_URL, and KHORA_STORAGE_EMBEDDING_DIMENSION remain supported as a back-compat path and are migrated into the discriminated-union configs automatically.

Neo4j pool metrics

With any OTel backend installed ([otel] or [logfire]), the Neo4j backend emits OTel metrics automatically. See Observability. For high-frequency sub-minute sampling enable:

KHORA_STORAGE_GRAPH_POOL_SAMPLER_ENABLED=true
KHORA_STORAGE_GRAPH_POOL_SAMPLER_INTERVAL_MS=500    # clamped to [50, 60000]
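
The clamping noted in the comment above can be pictured as follows. This is an illustrative sketch only: clamp_interval_ms is a hypothetical name, and the sketch assumes out-of-range values are pulled to the nearest bound rather than rejected.

```python
def clamp_interval_ms(value: int, low: int = 50, high: int = 60_000) -> int:
    """Pull a sampler interval into the documented [50, 60000] ms range.

    Hypothetical helper; the bounds come from the comment on
    KHORA_STORAGE_GRAPH_POOL_SAMPLER_INTERVAL_MS.
    """
    return max(low, min(high, value))
```

Under that assumption, a configured value of 10 behaves like 50, and 120000 behaves like 60000.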

Neo4j relationship limits

Relationship.source_document_ids and Relationship.source_chunk_ids are append-bounded on every MERGE to prevent unbounded growth on hot edges. Defaults are 100 and 250 respectively. For deep-provenance workloads, where many documents contribute to the same edge, raise the relevant knob and watch the khora.neo4j.relationship.source_id_truncated metric:

KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_DOCUMENT_IDS_MAX=500
KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_CHUNK_IDS_MAX=1000

See the Neo4j graph backend table under Nested env vars for the full list of Neo4j settings, including the matching entity-side caps. When the (existing + incoming) union exceeds the cap, the most-recent tail is kept, dropped entries are counted on the metric, and a logger.warning(...) records the field name, dropped count, rows affected, and configured limit.
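
The tail-keep behavior can be sketched in plain Python. This is not Khora's MERGE logic, which runs inside Neo4j; merge_bounded is a hypothetical helper, and the sketch assumes IDs are ordered oldest-first and that an ID seen again keeps its original position.

```python
def merge_bounded(
    existing: list[str], incoming: list[str], cap: int
) -> tuple[list[str], int]:
    """Union two ID lists in arrival order and keep only the newest `cap` IDs.

    Returns (kept_ids, dropped_count). Hypothetical sketch of the
    "most-recent tail is kept" rule; ordering semantics are an assumption.
    """
    merged = list(dict.fromkeys(existing + incoming))  # ordered de-duplication
    dropped = max(0, len(merged) - cap)
    return merged[dropped:], dropped


kept, dropped = merge_bounded(["d1", "d2", "d3"], ["d3", "d4", "d5"], cap=4)
```

Here the union holds five distinct IDs against a cap of four, so the oldest entry is dropped and dropped is the number that would be added to the truncation metric.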

Embedded backends

The embedded sqlite_lance path is appropriate for demos, evaluation, tests, and small single-user CLIs. It is not intended for production deployment; for production, use PostgreSQL + pgvector + Neo4j. Performance and recall degrade noticeably above the following documented scale ceilings:

~1M chunks (LanceDB IVF-PQ training time + write serialisation start to dominate)
~100k entities (recursive-CTE traversal cost on hub nodes)
~500k relationships
Traversal depth ≤3 (the instr(walk.visited, ...) visited-set scan in graph.py is O(depth × fan-out × visited-len) and degrades sharply at depth ≥4 with high fan-out)

Known gaps and warts:

Partial atomicity in coordinator.transaction(): only the SQL session is enrolled. LanceDB writes happen post-commit with compensating-delete-on-failure. A crash between SQLite commit and Lance write can leave orphaned vectors or missing embeddings, and reconciliation runs on the next ingest.
Point-in-time queries degrade (they don’t raise) on the embedded stack. A target_date query no longer raises NotImplementedError: entity-version narrowing is skipped (current-state entities are returned) and a Degradation is recorded on RecallResult.engine_info["degradations"]. Occurred-time bounds (start_time / end_time) still narrow chunks normally. The bi-temporal entity versioning that powers true point-in-time lives in Neo4j (version_valid_from / version_valid_to on :Entity / :EntityVersion nodes), which sqlite_lance has no equivalent of.
FTS5 covers chunks only: entity-anchored recall falls back to LIKE / JSON-equality. Recommend the PostgreSQL stack for entity-heavy corpora.
Install footprint is ~130–180 MB unpacked (pyarrow + lancedb native + Arrow C++ runtime). “Embedded” means “no server”, not “no native deps”.
IVF-PQ retraining is automatic when the corpus grows past retrain_factor × (rows at last training). Tune via KHORA_STORAGE_SQLITE_LANCE_RETRAIN_FACTOR.
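
The retraining trigger in the last item can be expressed as a small predicate. This is a sketch under stated assumptions: should_retrain is a hypothetical name, "grows past" is read as a strict comparison, and a table that has never been trained is assumed not to trigger.

```python
def should_retrain(
    current_rows: int, rows_at_last_training: int, retrain_factor: float = 2.0
) -> bool:
    """Decide whether the corpus has outgrown the last IVF-PQ training run.

    Hypothetical sketch of the retrain_factor rule, not LanceDB or Khora code.
    """
    if retrain_factor <= 1.0:
        return False  # the settings table states that values <= 1.0 disable retraining
    if rows_at_last_training <= 0:
        return False  # assumption: nothing to compare against before first training
    return current_rows > retrain_factor * rows_at_last_training
```

With the default factor of 2.0, an index trained at 100,000 rows would retrain once the table passes 200,000 rows.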

Vector index tuning lives on the sqlite_lance storage sub-config. See the KHORA_STORAGE_SQLITE_LANCE_* table below for DB_PATH, LANCE_PATH, EMBEDDING_DIMENSION, USE_HALFVEC, LANCE_INDEX, IVF_PARTITIONS, HNSW_M, and RETRAIN_FACTOR with defaults and tuning guidance.

LLM

Prefix: KHORA_LLM_. LiteLLM handles the provider dispatch.

Variable	Default	Description
`KHORA_LLM_MODEL`	`gpt-4o-mini`	Primary model for generation.
`KHORA_LLM_API_KEY_ENV`	`OPENAI_API_KEY`	Environment variable holding the API key. Left at the OpenAI default, it auto-derives from the model prefix (`gemini/`, `claude`, `anthropic/`, `vertex_ai/`, …) so the wrong provider’s key isn’t read. An explicit value is always honored.
`KHORA_LLM_TEMPERATURE`	`0.7`	Sampling temperature.
`KHORA_LLM_MAX_TOKENS`	`12288`	Max output tokens per extraction call.
`KHORA_LLM_TIMEOUT`	`30`	Request timeout in seconds.
`KHORA_LLM_MAX_RETRIES`	`3`	Retry budget on failure.
`KHORA_LLM_MAX_CONCURRENT_LLM_CALLS`	`10`	Cap on concurrent in-flight LLM requests.
`KHORA_LLM_EXTRACTION_WAVE_SIZE`	`20`	Extraction batches dispatched concurrently per wave; the circuit breaker is checked between waves. Values above `MAX_CONCURRENT_LLM_CALLS` have no effect (the per-call semaphore binds).
`KHORA_LLM_MAX_TOTAL_CONNECTIONS`	`200`	Total cap on simultaneous connections in the shared aiohttp session, across all hosts.
`KHORA_LLM_MAX_CONNECTIONS_PER_HOST`	`0`	Per-host connection cap for the shared session. `0` = unlimited.
`KHORA_LLM_KEEPALIVE_TIMEOUT_S`	`30.0`	Idle keepalive seconds for shared-session connections.
`KHORA_LLM_EMBEDDING_MODEL`	`text-embedding-3-small`	Embedding model.
`KHORA_LLM_EMBEDDING_DIMENSION`	`1536`	Must match your DB schema.
`KHORA_LLM_EXTRACTION_MODEL`	-	Override extraction model (falls back to `model`). Haiku / Gemini Flash work well here.
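
The interaction between the wave size and the per-call concurrency cap can be illustrated with a self-contained asyncio sketch. It is not Khora's dispatcher: run_in_waves and fake_llm_call are hypothetical, and the sleep stands in for a network round trip.

```python
import asyncio


async def run_in_waves(n_batches: int, wave_size: int, max_concurrent: int) -> int:
    """Run n_batches fake calls in waves and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def fake_llm_call() -> None:
        nonlocal in_flight, peak
        async with semaphore:  # per-call cap, analogous to MAX_CONCURRENT_LLM_CALLS
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the LLM round trip
            in_flight -= 1

    for start in range(0, n_batches, wave_size):
        wave = range(start, min(start + wave_size, n_batches))
        await asyncio.gather(*(fake_llm_call() for _ in wave))
        # A circuit-breaker check would sit here, between waves.
    return peak
```

Running asyncio.run(run_in_waves(40, wave_size=20, max_concurrent=10)) never has more than 10 calls in flight, which is why raising the wave size past the concurrency cap changes nothing.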

For multi-model routing, LiteLLM’s router is configurable via config_file (path to a LiteLLM config YAML), model_list, and router_settings on KhoraConfig.llm.

Pipeline (extraction)

Prefix: KHORA_PIPELINES_.

Variable	Default	Description
`KHORA_PIPELINES_CHUNKING_STRATEGY`	`semantic`	`fixed`, `semantic`, or `recursive`.
`KHORA_PIPELINES_CHUNK_SIZE`	`512`	Target chunk size (tokens).
`KHORA_PIPELINES_CHUNK_OVERLAP`	`50`	Overlap between chunks.
`KHORA_PIPELINES_CONVERSATION_TIME_GAP_MINUTES`	`15`	Split conversations after this many quiet minutes.
`KHORA_PIPELINES_CONVERSATION_MAX_GROUP_SIZE`	`50`	Max messages per conversation chunk.
`KHORA_PIPELINES_CONVERSATION_MIN_GROUP_SIZE`	`2`	Merge groups below this size.
`KHORA_PIPELINES_EXTRACT_ENTITIES`	`true`	Run the entity extractor.
`KHORA_PIPELINES_ENTITY_TYPES`	`PERSON,ORGANIZATION,CONCEPT,LOCATION`	Entity type allowlist.
`KHORA_PIPELINES_SELECTIVE_EXTRACTION`	`true`	KET-RAG selective extraction (cost reduction).
`KHORA_PIPELINES_EXTRACTION_IMPORTANCE_RATIO`	`0.7`	Top fraction of chunks sent to LLM extraction (generic pipeline; on VectorCypher use `skeleton_core_ratio`).
`KHORA_PIPELINES_EXTRACTION_MIN_IMPORTANCE`	`0.2`	Minimum importance threshold; chunks above this are always extracted.
`KHORA_PIPELINES_EXTRACTION_SECOND_PASS`	`false`	Opt-in second, relationship-only extraction pass over under-connected sections on the batch path. Recovers ~30-40% more edges at extra LLM cost.
`KHORA_PIPELINES_CONVERSATION_SEMANTIC_THRESHOLD`	-	Optional cosine-similarity threshold for semantic splitting of conversations. Unset by default.
`KHORA_PIPELINES_SKIP_EMBEDDING_ENTITY_TYPES`	`DATE,URL,EMAIL`	Skip embeddings for these types when `mention_count` is low.
`KHORA_PIPELINES_SKIP_EMBEDDING_MENTION_THRESHOLD`	`1`	Skip embedding for rare-mention entities of the above types.
`KHORA_PIPELINES_PENDING_PROCESSOR_MAX_CONCURRENT`	`20`	Max documents processed concurrently by the pending processor (see `submit_batch`).
`KHORA_PIPELINES_PENDING_PROCESSOR_GRACE_PERIOD_MINUTES`	`5`	Minimum age a `PENDING` document must reach before crash-recovery picks it up, so it doesn’t race in-flight writes.
`KHORA_PIPELINES_PENDING_PROCESSOR_ORPHAN_STALE_AFTER_SECONDS`	`900`	Minimum age a `PROCESSING` document must reach before it’s reclaimed as a crashed-worker orphan.
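
One way to read the two selective-extraction settings together is sketched below. This is an interpretation, not Khora's implementation: select_for_extraction is a hypothetical helper, the top fraction is assumed to round up, and the two rules are assumed to combine as a union.

```python
import math


def select_for_extraction(
    importances: list[float], ratio: float = 0.7, min_importance: float = 0.2
) -> list[int]:
    """Return sorted indices of chunks that would go to LLM extraction.

    Assumption: a chunk is selected if it is in the top `ratio` fraction by
    importance OR its importance is above `min_importance`.
    """
    n = len(importances)
    top_k = math.ceil(ratio * n)  # assumption: round the fraction up
    ranked = sorted(range(n), key=lambda i: importances[i], reverse=True)
    selected = set(ranked[:top_k])
    selected.update(i for i, score in enumerate(importances) if score > min_importance)
    return sorted(selected)
```

For instance, select_for_extraction([0.9, 0.1, 0.5, 0.05], ratio=0.25, min_importance=0.4) picks chunk 0 through the top fraction and chunk 2 through the threshold.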

Query

Prefix: KHORA_QUERY_. See Retrieval for guidance.

Variable	Default	Description
`KHORA_QUERY_DEFAULT_MODE`	`hybrid`	`vector`, `graph`, `hybrid`, `keyword`, or `all`.
`KHORA_QUERY_MIN_CHUNK_SIMILARITY`	`0.0`	Chunk-channel cosine floor (`0.0` = no floor; raise to opt in).
`KHORA_QUERY_MIN_ENTITY_SIMILARITY`	`0.05`	Entity similarity floor.
`KHORA_QUERY_VECTOR_WEIGHT`	`0.6`	Fusion weight for the vector channel.
`KHORA_QUERY_GRAPH_WEIGHT`	`0.4`	Fusion weight for the graph channel.
`KHORA_QUERY_KEYWORD_WEIGHT`	`0.3`	Fusion weight for the keyword channel. Fills the BM25 slot; inert until the keyword channel is enabled.
`KHORA_QUERY_ENABLE_BM25_CHANNEL`	`false`	Opt-in. Add an independent BM25 keyword channel to fusion alongside vector + graph. Off by default: the standard `recall()` path fuses vector + graph only.
`KHORA_QUERY_APPLY_RECENCY_BIAS`	`false`	Bias scoring towards newer documents.
`KHORA_QUERY_RECENCY_WEIGHT`	`0.35`	How strong the recency bias is.
`KHORA_QUERY_ENABLE_HYDE`	`auto`	HyDE query expansion: `auto` / `always` / `never` (legacy booleans normalize to `always` / `never`). See Retrieval.
`KHORA_QUERY_HYDE_NUM_HYPOTHETICALS`	`1`	Number of hypothetical documents to generate (1–5).
`KHORA_QUERY_ENABLE_HYDE_CYPHER`	`false`	Currently inert. The `khora.query.hyde_cypher` module (LLM-picked parameterized Cypher templates) exists but is not wired into retrieval. No code reads this flag, so setting it has no effect on `recall()`.
`KHORA_QUERY_HYDE_CYPHER_LIMIT`	`20`	Currently inert. See `KHORA_QUERY_ENABLE_HYDE_CYPHER`.
`KHORA_QUERY_ENABLE_RERANKING`	`true`	Cross-encoder reranking of top candidates, on by default. With the defaults the first `recall()` loads `BAAI/bge-reranker-v2-m3` (~2.3 GB, GPU-preferred). Set `false` to disable.
`KHORA_QUERY_RERANKING_MODEL`	`BAAI/bge-reranker-v2-m3`	Any model loadable by `CrossEncoder(name)`. Use `cross-encoder/ms-marco-MiniLM-L-6-v2` for a light CPU default. See Reranking.
`KHORA_QUERY_RERANKING_TOP_N`	`50`	Candidates fed to the reranker.
`KHORA_QUERY_RERANKING_BLEND_WEIGHT`	`0.7`	Reranker-score weight when blending with the original fused score. `0.7` is 70% reranker, 30% original.
`KHORA_QUERY_ENABLE_LLM_RERANKING`	`false`	Opt-in. LLM listwise reranking, applied after the cross-encoder stage on temporal queries.
`KHORA_QUERY_LLM_RERANKING_MODEL`	`gpt-4o-mini`	Model for the LLM listwise pass.
`KHORA_QUERY_LLM_RERANKING_TOP_N`	`10`	Top candidates sent to the LLM reranker (3–30).
`KHORA_QUERY_LLM_RERANKING_CONFIDENCE_THRESHOLD`	`0.1`	Fire LLM reranking only when the cross-encoder’s rank-1-vs-rank-2 score gap is below this.
`KHORA_QUERY_TEMPORAL_SQL_PUSHDOWN`	`true`	Push relative-date filters into SQL WHERE clauses.
`KHORA_QUERY_ENABLE_RESULT_CACHE`	`false`	Opt-in epoch-invalidated recall result cache. An identical repeat recall skips channel execution. Useful for repeat-query workloads (agent loops, evals).
`KHORA_QUERY_RESULT_CACHE_MAX_SIZE`	`1000`	Max cached recall results before LRU eviction (`0` disables the cache).
`KHORA_QUERY_RESULT_CACHE_TTL_SECONDS`	`300`	Time-to-live per cached recall result.

Two reranking variables don’t reach recall(). The default VectorCypher engine reconciles the query.reranking_* family onto its own config, so the variables above apply to it. KHORA_QUERY_RERANKING_METHOD and KHORA_QUERY_RERANKING_FINAL_K are the exception: they exist on QuerySettings but have no VectorCypherConfig equivalent, so they affect only the separate HybridQueryEngine. Setting them changes nothing on the default recall path. For per-engine overrides and model choice, see Reranking.
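
The blend weight and the LLM-reranking confidence threshold from the table above reduce to simple arithmetic. The sketch below uses hypothetical helper names and assumes both scores are already on a comparable scale; it models only the score-gap condition, not the temporal-query condition or the enable flag.

```python
def blend_score(
    reranker_score: float, fused_score: float, blend_weight: float = 0.7
) -> float:
    """Blend the cross-encoder score with the original fused score.

    With the default 0.7, the result is 70% reranker and 30% original.
    """
    return blend_weight * reranker_score + (1.0 - blend_weight) * fused_score


def should_llm_rerank(reranker_scores: list[float], threshold: float = 0.1) -> bool:
    """Return True when the rank-1-vs-rank-2 score gap is below the threshold."""
    if len(reranker_scores) < 2:
        return False  # assumption: no gap to measure with fewer than two candidates
    top = sorted(reranker_scores, reverse=True)
    return (top[0] - top[1]) < threshold
```

A near-tie such as [0.82, 0.78, 0.30] has a gap well below 0.1 and would qualify for the LLM pass, whereas [0.9, 0.5] would not.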

Abstention

When recall finds nothing on-topic, Khora can flag the result rather than return weak matches (see Retrieval). The scoring is tunable:

Variable	Default	Description
`KHORA_QUERY_ABSTENTION_MODE`	`cosine_floor`	`cosine_floor` (abstain when the top raw cosine is below `ABSTENTION_MIN_TOP_SCORE`) or `weighted` (combine several signals into one score, compared against `ABSTENTION_COMBINED_THRESHOLD`).
`KHORA_QUERY_ABSTENTION_MIN_TOP_SCORE`	`0.3`	Raw top-chunk cosine below which the result is considered off-topic. Model- and corpus-specific.
`KHORA_QUERY_ABSTENTION_COMBINED_THRESHOLD`	`0.5`	In `weighted` mode, the combined-score threshold at or above which recall abstains.

weighted mode also exposes per-signal weights (ABSTENTION_WEIGHT_ENTITIES_EMPTY, ABSTENTION_WEIGHT_CHUNKS_BELOW_MIN, ABSTENTION_WEIGHT_TOP_SCORE_LOW) and confidence targets (ABSTENTION_CONFIDENCE_TARGET_COSINE, ABSTENTION_CONFIDENCE_TARGET_GAP); the three weights must sum to ≤ 1.0.
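
The cosine_floor decision and the weight-sum constraint can be sketched as follows. Both functions are hypothetical illustrations; in particular, whether Khora rejects an invalid weight set with an exception is an assumption, and the formula that combines the signals in weighted mode is not shown because it is not specified here.

```python
def abstain_cosine_floor(top_cosine: float, min_top_score: float = 0.3) -> bool:
    """cosine_floor mode: abstain when the top raw cosine is below the floor."""
    return top_cosine < min_top_score


def validate_abstention_weights(
    entities_empty: float, chunks_below_min: float, top_score_low: float
) -> None:
    """Check that the three weighted-mode weights sum to at most 1.0.

    Hypothetical check; raising ValueError is an assumption of this sketch.
    """
    total = entities_empty + chunks_below_min + top_score_low
    if total > 1.0 + 1e-9:  # small tolerance for float rounding
        raise ValueError(f"abstention weights sum to {total:.3f}, expected <= 1.0")
```

With the default floor of 0.3, a top cosine of 0.21 abstains and 0.45 does not.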

Experimental query knobs

These are real but opt-in and not yet validated (default off, A/B benchmarking pending upstream). Documented for completeness. Leave them at their defaults unless you’re running your own evaluation.

Variable	Default	Description
`KHORA_QUERY_FUSION_MODE`	`rrf`	`rrf` (rank-only weighted RRF, the shipped behavior) or `calibrated` (magnitude-aware blend of per-channel scores, so a lone strong-cosine hit isn’t buried). Experimental.
`KHORA_QUERY_LEXICAL_CHANNEL`	`bm25`	Which retriever fills the lexical slot: `bm25` or `keyword_ppr` (experimental keyword-chunk PageRank; needs a re-ingest to populate `keyword_chunks`, and has its own `KEYWORD_PPR_DAMPING` / `KEYWORD_PPR_MAX_EDGES` tunables).
`KHORA_QUERY_ENABLE_PPR_RETRIEVAL`	`false`	Replace the graph BFS channel with query-time Personalized PageRank (HippoRAG 2), falling back to vector-only on an empty graph. Experimental, with a family of `KHORA_QUERY_PPR_*` tunables.
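
For readers unfamiliar with the rrf fusion mode, a generic weighted reciprocal rank fusion looks like the sketch below. It is the textbook technique, not Khora's code: weighted_rrf is a hypothetical helper, and the smoothing constant k=60 is the conventional value from the RRF literature, assumed here because Khora's constant is not stated.

```python
def weighted_rrf(
    rankings: dict[str, list[str]], weights: dict[str, float], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse per-channel rankings using rank positions only.

    Each channel contributes weight / (k + rank) for every ID it returns;
    k=60 is an assumed, conventional constant.
    """
    scores: dict[str, float] = {}
    for channel, ranked_ids in rankings.items():
        weight = weights.get(channel, 0.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = weighted_rrf(
    {"vector": ["a", "b", "c"], "graph": ["b", "d"]},
    {"vector": 0.6, "graph": 0.4},
)
```

Because only ranks matter, chunk "b" comes out on top here: it appears in both channels, even though it is not first in the vector channel. This is the property the calibrated mode is described as relaxing.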

Telemetry

Khora has two independent telemetry paths.

Spans and metrics (OpenTelemetry). Khora emits spans (@trace, trace_span()) and metrics through the OpenTelemetry API unconditionally. Whether they’re exported depends only on which TracerProvider / MeterProvider is installed, not on any KHORA_* variable. Install the [otel] extra and call configure_telemetry() (honors OTEL_* env vars), or install [logfire] and run logfire.configure(), and khora’s signals flow to your collector. With no provider configured, OTel returns a NonRecordingSpan and the helpers are near-free. See Observability for the full setup and the OTLP env-var contract.

Structured event log (PostgreSQL). Separately, khora can write structured LLMEvent / StorageEvent / PipelineEvent rows to a PostgreSQL table. This is opt-in and independent of the OTel path above.

Variable	Default	Description
`KHORA_TELEMETRY_DATABASE_URL`	-	PostgreSQL URL for the structured event collector. Unset → a zero-cost no-op collector. Does not affect OTel spans/metrics.
`KHORA_TELEMETRY_SERVICE_NAME`	`khora`	Service tag attached to recorded events.

Logging

Khora uses loguru. Call khora.logging_config.setup_logging() once per process (or configure your own sinks with enqueue=True). See the Logging section of the khora CLAUDE.md for the full rationale. Short version: default loguru sinks are synchronous and will block an asyncio event loop on every logger.* call.

Variable	Default	Description
`KHORA_NEO4J_LOG_LEVEL`	-	Neo4j driver log level (`DEBUG` / `INFO` / `WARNING` / `ERROR` / `CRITICAL`, case-insensitive). Unset = no-op. See `examples/neo4j_debug_logging.py`.

Secrets

Secrets such as API keys (OpenAI, Anthropic, etc.) are read from the environment variable named by KHORA_LLM_API_KEY_ENV (default OPENAI_API_KEY). Khora never reads credentials from disk; they always come from the environment. Credentials are read once when KhoraConfig is constructed and bound into the connection pools and the LLM client at startup, so a rotated secret takes effect on the next process start (or whenever you rebuild the config and reconnect). There is no in-process reload.

Credential fields

Credential fields on KhoraConfig (PostgreSQL DSN, Neo4j password, LLM API key, telemetry DSN, etc.) are pydantic.SecretStr. This has two operator-visible consequences:

repr() and config-dump output render the value as '**********'. Logs, error messages, and KhoraConfig().model_dump() do not leak cleartext credentials.
Code that reads the cleartext value must call .get_secret_value() explicitly. SQLAlchemy engines and graph drivers receive the cleartext at the boundary. Downstream library consumers must do the same.

from khora.config import KhoraConfig

cfg = KhoraConfig()
print(cfg.storage.postgresql_url)             # SecretStr('**********')
dsn = cfg.storage.postgresql_url.get_secret_value()   # cleartext, for engine init

Lockfile policy

khora’s pyproject.toml includes [tool.uv] exclude-newer = "7 days", a relative, evaluated-on-every-sync guard against pulling brand-new upstream releases that haven’t had time to stabilise. Security-critical packages opt out via exclude-newer-package (currently only urllib3 for CVE-2026-44431 / CVE-2026-44432). Downstream consumers that mirror khora’s pin policy inherit the same 7-day staging window for transitive dependencies; override per-package as needed.

Nested env vars

Reference for every Khora environment variable that lives on a sub-object attached to a sub-settings class: graph backend, vector backend, the SQLite+LanceDB embedded stack, and the dream-phase per-op toggles.

Spelling. All env vars in this section use a single underscore between every level: KHORA_STORAGE_GRAPH_URL, not KHORA_STORAGE__GRAPH__URL. The legacy double-underscore form continues to work as a backwards-compatible alias on every nested-config field, but it is no longer documented. New code and .env files should use the single-underscore form.

Neo4j graph backend

Configuration for the Neo4j graph backend (storage.graph).

Variable	Default	Why change it
`KHORA_STORAGE_GRAPH_BACKEND`	`neo4j`	Graph adapter. Neo4j on the production stack.
`KHORA_STORAGE_GRAPH_URL`	—	Bolt / connection URL. `SecretStr`.
`KHORA_STORAGE_GRAPH_USER`	`neo4j`	Username.
`KHORA_STORAGE_GRAPH_PASSWORD`	empty	Password. `SecretStr`.
`KHORA_STORAGE_GRAPH_DATABASE`	`neo4j`	Multi-database selector inside a Neo4j cluster.
`KHORA_STORAGE_GRAPH_MAX_CONNECTION_POOL_SIZE`	`100`	Lower on small drivers; raise for high concurrency.
`KHORA_STORAGE_GRAPH_CONNECTION_ACQUISITION_TIMEOUT`	`60.0` s	Lower for fast-fail under pool starvation.
`KHORA_STORAGE_GRAPH_RETRY_DELAY_JITTER_FACTOR`	`0.5`	Jitter (0.0–1.0) on transaction-retry backoff. Raise to spread retry storms.
`KHORA_STORAGE_GRAPH_MAX_CONNECTION_LIFETIME`	`900` s	Rotate connections before this. Set below your server-side TTL (Aura ~20 min) to avoid `BrokenPipe`.
`KHORA_STORAGE_GRAPH_LIVENESS_CHECK_TIMEOUT`	`30.0` s	Idle threshold before pre-checkout liveness check. `None` disables.
`KHORA_STORAGE_GRAPH_QUERY_TIMEOUT`	`5.0` s	Per-transaction read timeout (1–300 s, `None` disables). Raise for deep traversals; lower to fail fast.
`KHORA_STORAGE_GRAPH_ENTITY_WRITE_CONCURRENCY`	`12`	Concurrent entity-write transactions during ingest. Raise when Neo4j has headroom; lower on lock contention.
`KHORA_STORAGE_GRAPH_RELATIONSHIP_WRITE_CONCURRENCY`	`8`	Concurrent relationship-write transactions.
`KHORA_STORAGE_GRAPH_POOL_SAMPLER_ENABLED`	`false`	Opt-in high-frequency pool sampler. Requires an OTel backend installed. Enable to investigate pool exhaustion; zero-cost when off.
`KHORA_STORAGE_GRAPH_POOL_SAMPLER_INTERVAL_MS`	`500`	Sample cadence in ms (clamped 50–60000). Drop to 50–100 when chasing sub-second pool events.
`KHORA_STORAGE_GRAPH_POOL_KEEPALIVE_ENABLED`	`false`	Opt-in keepalive that pings idle pooled connections so an intermediary (load balancer, firewall) doesn’t idle-drop them before the driver notices. Zero-cost when off.
`KHORA_STORAGE_GRAPH_POOL_KEEPALIVE_INTERVAL_MS`	`15000`	Ping cadence when keepalive is on (clamped 50–60000). Below the driver’s 30s liveness window so idle connections are exercised before they go stale.
`KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_DOCUMENT_IDS_MAX`	`100`	Cap on `Relationship.source_document_ids` retained after `MERGE`. When the cap is exceeded, the most-recent tail is kept and the dropped count is recorded on `khora.neo4j.relationship.source_id_truncated{field=source_document_ids}`. Raise for deep-provenance workloads.
`KHORA_STORAGE_GRAPH_RELATIONSHIP_SOURCE_CHUNK_IDS_MAX`	`250`	Same as above, for `source_chunk_ids`.
`KHORA_STORAGE_GRAPH_ENTITY_SOURCE_DOCUMENT_IDS_MAX`	`100`	Cap on `Entity.source_document_ids` retained after `MERGE`. Tail-keep semantics identical to the relationship cap; metric is `khora.neo4j.entity.source_id_truncated{field=source_document_ids}`.
`KHORA_STORAGE_GRAPH_ENTITY_SOURCE_CHUNK_IDS_MAX`	`250`	Same as above, for entity `source_chunk_ids`.

Vector backend

Discriminated union over PgVectorConfig | SQLiteVectorConfig, keyed by backend. Default is pgvector via default_factory=PgVectorConfig.

Variable	Default	Applies when	Why change it
`KHORA_STORAGE_VECTOR_BACKEND`	`pgvector`	always	`pgvector` / `sqlite`.
`KHORA_STORAGE_VECTOR_URL`	—	pgvector / sqlite	Connection URL.
`KHORA_STORAGE_VECTOR_EMBEDDING_DIMENSION`	`1536`	always	Must match the LLM embedding model. Changing requires a schema migration.

For backend=sqlite, only BACKEND / URL / EMBEDDING_DIMENSION are model-exposed.

Embedded backend

Used when KHORA_STORAGE_BACKEND=sqlite_lance. Pairs an on-disk SQLite database (graph + relational + event store) with a sibling LanceDB directory (vector search). Zero infrastructure: both backends run in-process.

Variable	Default	Why change it
`KHORA_STORAGE_SQLITE_LANCE_DB_PATH`	`./khora.db`	SQLite file path. Move to faster storage or a backup-friendly location.
`KHORA_STORAGE_SQLITE_LANCE_LANCE_PATH`	sibling `.lance` dir	Explicit LanceDB directory. Override to put vectors on different storage (e.g. SSD) than chunk metadata.
`KHORA_STORAGE_SQLITE_LANCE_EMBEDDING_DIMENSION`	`1536`	Must match LLM embedding model.
`KHORA_STORAGE_SQLITE_LANCE_USE_HALFVEC`	`false`	Float16 storage: halves index size with minor recall loss. Enable on memory-constrained boxes.
`KHORA_STORAGE_SQLITE_LANCE_LANCE_INDEX`	`auto`	`auto` / `ivf_pq` / `hnsw` / `brute`. Force `ivf_pq` above ~1M rows; `hnsw` for low-latency under ~1M; `brute` for tiny corpora.
`KHORA_STORAGE_SQLITE_LANCE_IVF_PARTITIONS`	`null` (auto)	Hand-tuned IVF partition count (`lance_index=ivf_pq` only). Override only if profiling shows recall miss.
`KHORA_STORAGE_SQLITE_LANCE_HNSW_M`	`16`	HNSW max connections per layer (`lance_index=hnsw` only). Raise for recall; linear memory cost.
`KHORA_STORAGE_SQLITE_LANCE_RETRAIN_FACTOR`	`2.0`	Trigger LanceDB ANN retrain when row count grows by this factor. Lower for fresher index; raise to defer re-training cost. `<= 1.0` disables.

Dream-phase per-op toggles

DreamConfig.ops: DreamOpsConfig carries per-operation enable flags. Every destructive op defaults to false. KHORA_DREAM_ENABLED=true alone runs no destructive work, and each op must be flipped explicitly. See Dream phase for operational guidance, retention floors, and the kill-switch (KHORA_DREAM_DISABLE_APPLY).

Variable	Default	Why change it
`KHORA_DREAM_OPS_DEDUPE_ENTITIES`	`false`	Enable cross-batch entity dedupe (cosine-merge with verifier). Turn on after validating planner output in dry-run.
`KHORA_DREAM_OPS_PRUNE_EDGES`	`false`	Remove low-confidence / orphaned edges (default targets `ASSOCIATED_WITH` co-occurrence). Turn on when edge soup degrades retrieval.
`KHORA_DREAM_OPS_COMPACT_FACTS`	`false`	Hard-delete tombstoned `memory_facts` rows past the 7-day retention floor. The only hard-delete op. Flip with care.
`KHORA_DREAM_OPS_CLUSTER_EVENTS`	`false`	Merge near-duplicate events (cosine ≥ 0.95 within a 7-day window).
`KHORA_DREAM_OPS_RECOMPUTE_CENTROIDS`	`false`	Recompute entity / cluster centroid embeddings after dedupe. Pair with `DEDUPE_ENTITIES`.
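
The opt-in model above can be checked against an environment before a run. The following is a hypothetical pre-flight helper, not a Khora API: the accepted truthy spellings are an assumption, per-op flags are assumed to be inert while KHORA_DREAM_ENABLED is off, and the KHORA_DREAM_DISABLE_APPLY kill-switch is not modeled.

```python
OPS = (
    "DEDUPE_ENTITIES",
    "PRUNE_EDGES",
    "COMPACT_FACTS",
    "CLUSTER_EVENTS",
    "RECOMPUTE_CENTROIDS",
)
TRUTHY = {"1", "true", "yes", "on"}  # assumption: accepted boolean spellings


def _flag(env: dict[str, str], name: str) -> bool:
    """Read a boolean env var, treating a missing value as false."""
    return env.get(name, "false").strip().lower() in TRUTHY


def enabled_dream_ops(env: dict[str, str]) -> list[str]:
    """List the per-op toggles that are switched on in `env`.

    Returns an empty list when KHORA_DREAM_ENABLED is off (assumed to gate
    every op) or when no op has been flipped explicitly.
    """
    if not _flag(env, "KHORA_DREAM_ENABLED"):
        return []
    return [op for op in OPS if _flag(env, f"KHORA_DREAM_OPS_{op}")]
```

Calling enabled_dream_ops({"KHORA_DREAM_ENABLED": "true"}) returns an empty list, matching the rule that enabling the dream phase alone runs no destructive work.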

Env vars that are not nested

These reach top-level fields of each sub-settings class via the sub-class’s own env_prefix. There is no sub-object hop, and they’re covered in the sections above:

KHORA_STORAGE_BACKEND, KHORA_STORAGE_POSTGRESQL_*, KHORA_STORAGE_HNSW_*, KHORA_STORAGE_USE_HALFVEC: flat on StorageSettings.
KHORA_LLM_*: every field on LLMSettings is top-level (no sub-objects).
KHORA_PIPELINES_*: every field on PipelineSettings is top-level.
KHORA_QUERY_*: every field on QuerySettings is top-level.
KHORA_TELEMETRY_*: flat.
KHORA_DREAM_* (e.g. KHORA_DREAM_ENABLED, KHORA_DREAM_DEFAULT_MODE, KHORA_DREAM_LLM_MAX_TOKENS_PER_RUN): flat on DreamConfig. Only the per-op toggles under the ops: sub-object are listed above.
