khora.integrations.hermes plugs Khora into Hermes as a long-term memory plane. Hermes owns the agent loop, model call, tool router, and context-compression policy. Khora owns storage: vector recall, the entity graph, temporal retrieval, and abstention signals. The integration is a single primitive: KhoraMemoryProvider.
Experimental. The upstream hermes-agent SDK is pre-1.0, and its MemoryProvider ABC has changed shape across minor releases. The adapter pins hermes-agent>=0.13,<0.14 and expects one maintenance PR per upstream minor. It will be promoted to stable once hermes-agent ships a full minor release without reshaping MemoryProvider.

Install

There is no khora[hermes] extra. hermes-agent pins requests==2.33.0, which conflicts with Khora’s requests>=2.33.1 CVE floor, so the extra was removed. Install hermes-agent yourself:
pip install hermes-agent
The adapter is still registered under the khora.integrations entry-point group, so khora.integrations.discover() resolves it whenever hermes-agent is importable. If your project enforces the CVE floor, vendor or fork hermes-agent to relax its pin.

Wiring it in

from khora import Khora
from khora.integrations.hermes import KhoraMemoryProvider

kb = Khora()
await kb.connect()
provider = KhoraMemoryProvider(kb=kb)        # kb is REQUIRED — no Khora.shared() fallback
# hand `provider` to your Hermes context.register_memory_provider(...)
Alternatively, copy the runnable plugin at examples/integrations/hermes/plugin/ into $HERMES_HOME/plugins/khora/. Its register(ctx) defaults to KhoraMemoryProvider(kb=Khora.shared()) (override via the KHORA_HERMES_KB_FACTORY env var).

Namespace mapping

Each (agent_identity, session_id) pair maps to a deterministic Khora namespace (UUID5), so two providers for the same agent + session share memory across processes without a shared registry. agent_identity is the tenancy key. Different agents stay isolated even on the same session_id. The same session_id is stamped on every stored document, so kb.forget_session(namespace, session_id) cleanly drops a whole conversation.
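
The deterministic mapping can be illustrated with the standard uuid module. This is a sketch under stated assumptions: the root UUID, the key encoding, and the function name namespace_for are all hypothetical, since this page does not give the adapter's actual derivation.

```python
import uuid

# Hypothetical root namespace; the adapter's real root UUID is not documented here.
_ROOT = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid/khora-hermes")


def namespace_for(agent_identity: str, session_id: str) -> uuid.UUID:
    """Derive a stable UUID5 from an (agent_identity, session_id) pair.

    The agent identity is length-prefixed so that ("a", "b:c") and
    ("a:b", "c") cannot produce the same key (assumed encoding).
    """
    key = f"{len(agent_identity)}:{agent_identity}:{session_id}"
    return uuid.uuid5(_ROOT, key)


if __name__ == "__main__":
    # Two processes computing this independently get the same namespace.
    print(namespace_for("support-bot", "session-42"))
```

Because UUID5 is a pure function of its inputs, no shared registry is needed for two processes to agree, and changing agent_identity alone yields a different namespace.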

Tools

Hermes registers two LLM-callable tools via provider.get_tool_schemas():
| Tool | For | Returns |
| memory_search | "What did Alice say about Phoenix?", semantic recall | A <memory-context> block of top-K chunks + entity hits |
| memory_recall | "What did we discuss last week?", adds before / after ISO-8601 bounds | Same, filtered by the window |
Both accept query (required), top_k (default 10, cap 50), and min_similarity (default 0.1). An empty result returns "No prior memories found.", an explicit abstention so the model doesn’t confabulate.
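
The argument defaults and the abstention string can be sketched as a small normalizer. Everything below is illustrative: normalize_args, SearchArgs, and render are hypothetical names, clamping an oversized top_k to the cap (rather than rejecting it) is an assumption, and the exact layout inside the <memory-context> block is also assumed.

```python
from dataclasses import dataclass

ABSTENTION = "No prior memories found."
DEFAULT_TOP_K = 10
MAX_TOP_K = 50
DEFAULT_MIN_SIMILARITY = 0.1


@dataclass(frozen=True)
class SearchArgs:
    query: str
    top_k: int
    min_similarity: float


def normalize_args(raw: dict) -> SearchArgs:
    """Apply the documented defaults and cap to raw tool-call arguments."""
    query = str(raw.get("query", "")).strip()
    if not query:
        raise ValueError("query is required")
    # Assumption: out-of-range values are clamped, not rejected.
    top_k = max(1, min(int(raw.get("top_k", DEFAULT_TOP_K)), MAX_TOP_K))
    min_similarity = float(raw.get("min_similarity", DEFAULT_MIN_SIMILARITY))
    return SearchArgs(query, top_k, min_similarity)


def render(chunks: list[str]) -> str:
    """Return the abstention string for empty results, else a wrapped block."""
    if not chunks:
        return ABSTENTION
    body = "\n".join(chunks)
    return f"<memory-context>\n{body}\n</memory-context>"
```

Returning a fixed sentence for the empty case gives the model an unambiguous signal that nothing was found, rather than an empty block it might try to fill in.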

Threading model

This is the only adapter that bridges a sync caller (Hermes drives the provider from one thread per session) onto Khora’s async write path:
  • One ThreadPoolExecutor(max_workers=1) per provider. Strict FIFO, so ingestion order matches turn order. Async work routes through the shared run_sync bridge.
  • A TTL-bounded prefetch cache keyed on (namespace, session, query-hash) absorbs the “prefetch every turn” pattern. Concurrent readers wait on the same in-flight future instead of firing duplicate recalls.
  • Shed-oldest backpressure: at queue_max_size the oldest pending write is cancelled (counter khora.hermes.queue.shed_total).
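
The prefetch cache described in the list above can be sketched as a TTL map of futures. This is a minimal illustration, not the adapter's code: PrefetchCache, key, and get_or_start are hypothetical names, SHA-256 as the query hash is an assumption, and so is starting the TTL when the recall is issued rather than when it completes.

```python
import hashlib
import threading
import time
from concurrent.futures import Future


class PrefetchCache:
    """TTL-bounded cache whose values are futures shared by concurrent readers."""

    def __init__(self, ttl_s: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_s
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}  # key -> (expires_at, Future)

    @staticmethod
    def key(namespace: str, session: str, query: str):
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return (namespace, session, digest)

    def get_or_start(self, key, recall) -> Future:
        """Return the future for key; call recall() only if no fresh entry exists."""
        with self._lock:
            now = self._clock()
            hit = self._entries.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # fresh or still in flight: share it
            fut = Future()
            self._entries[key] = (now + self._ttl, fut)
        # Run the recall outside the lock so other keys are not blocked.
        try:
            fut.set_result(recall())
        except Exception as exc:
            fut.set_exception(exc)
            with self._lock:
                # Drop failed entries so the next turn retries.
                if self._entries.get(key, (None, None))[1] is fut:
                    del self._entries[key]
        return fut
```

A reader that arrives while the first recall is still running receives the same future and blocks on its result(), so only one recall is issued per key per TTL window.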
Chat memory is best-effort. A hard crash mid-drain loses whatever is still queued; a clean SIGTERM drains up to drain_timeout_s. prefetch() may return the abstention payload when writes haven’t drained (better an empty context than a stalled turn). Use the tool-call path for guaranteed retrieval. Don’t os.fork() after constructing a provider (fork safety is a follow-up).
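
The single-worker, shed-oldest behaviour can be illustrated with a small wrapper around ThreadPoolExecutor. This is a sketch under assumptions, not the adapter's implementation: ShedOldestWriter and its methods are hypothetical names, the shed_total attribute stands in for the khora.hermes.queue.shed_total counter, and the async run_sync bridge is left out.

```python
import threading
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor


class ShedOldestWriter:
    """FIFO write queue on one worker thread that drops the oldest pending job when full."""

    def __init__(self, queue_max_size: int = 256):
        self._max = queue_max_size
        self._pending = deque()  # (Future, callable) pairs not yet started
        self._lock = threading.Lock()
        self._executor = ThreadPoolExecutor(max_workers=1)
        self.shed_total = 0  # stand-in for the telemetry counter

    def submit(self, fn) -> Future:
        fut = Future()
        with self._lock:
            if len(self._pending) >= self._max:
                old_fut, _ = self._pending.popleft()
                old_fut.cancel()  # the oldest pending write is shed
                self.shed_total += 1
            self._pending.append((fut, fn))
        self._executor.submit(self._run_next)
        return fut

    def _run_next(self) -> None:
        with self._lock:
            if not self._pending:
                return  # the job this wake-up was meant for was shed
            fut, fn = self._pending.popleft()
        if not fut.set_running_or_notify_cancel():
            return  # cancelled by the caller before it started
        try:
            fut.set_result(fn())
        except Exception as exc:
            fut.set_exception(exc)

    def close(self) -> None:
        """Drain whatever is still queued, then stop the worker."""
        self._executor.shutdown(wait=True)
```

With one worker the jobs run strictly in submission order, and a caller holding a shed job's future sees it as cancelled, which matches the best-effort contract: newer turns win over older ones when the queue is full.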

Key knobs

KhoraMemoryProvider constructor kwargs (defaults in parentheses): kb (required), prefetch_timeout_s (0.8), prefetch_cache_ttl_s (30.0), queue_max_size (256), drain_timeout_s (5.0), failure_threshold_pct (1.0). Telemetry spans and counters are emitted under khora.integrations.hermes.* and khora.hermes.*; they carry no namespace_id labels, and free text is hashed.
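
For reference, the documented defaults can be collected into a plain dict and overridden per deployment. This is only a sketch: the dict name and the override value are hypothetical, and the constructor call is shown as a comment because it needs a connected Khora instance.

```python
# Documented defaults for the optional constructor kwargs.
HERMES_PROVIDER_DEFAULTS = {
    "prefetch_timeout_s": 0.8,
    "prefetch_cache_ttl_s": 30.0,   # TTL of the prefetch cache
    "queue_max_size": 256,          # shed-oldest kicks in at this depth
    "drain_timeout_s": 5.0,         # drain budget on a clean SIGTERM
    "failure_threshold_pct": 1.0,
}

# Hypothetical override: allow a longer drain on shutdown.
provider_kwargs = {**HERMES_PROVIDER_DEFAULTS, "drain_timeout_s": 10.0}

# Usage, assuming `kb` is a connected Khora instance:
# provider = KhoraMemoryProvider(kb=kb, **provider_kwargs)
```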

API reference

The stable public surface: construction, remember/recall/forget, and result types.

Integrations overview

The full adapter lineup and the shared registry.