Hermes

khora.integrations.hermes plugs Khora into Hermes as a long-term memory plane. Hermes owns the agent loop, model call, tool router, and context-compression policy. Khora owns storage: vector recall, the entity graph, temporal retrieval, and abstention signals. The integration is a single primitive: KhoraMemoryProvider.

Experimental. The upstream hermes-agent SDK is pre-1.0 and its MemoryProvider ABC has changed shape across minor releases. The adapter pins hermes-agent>=0.13,<0.14 and expects one maintenance PR per upstream minor. It will be promoted to stable once hermes-agent ships a full minor release without reshaping MemoryProvider.

Install

There is no khora[hermes] extra. hermes-agent pins requests==2.33.0, which conflicts with Khora’s requests>=2.33.1 CVE floor, so the extra was removed. Install hermes-agent yourself:

pip install hermes-agent

The adapter is still registered under the khora.integrations entry-point group, so khora.integrations.discover() resolves it whenever hermes-agent is importable. If your project enforces the CVE floor, vendor or fork hermes-agent to relax its pin.

Wiring it in

from khora import Khora
from khora.integrations.hermes import KhoraMemoryProvider

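# Run this inside an async context (an async function or an asyncio REPL);
# `await` is not valid at module top level.
kb = Khora()
await kb.connect()
provider = KhoraMemoryProvider(kb=kb)  # kb is REQUIRED: there is no Khora.shared() fallback
# Hand `provider` to your Hermes context.register_memory_provider(...)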

Alternatively, copy the runnable plugin at examples/integrations/hermes/plugin/ into $HERMES_HOME/plugins/khora/. Its register(ctx) defaults to KhoraMemoryProvider(kb=Khora.shared()) (override via the KHORA_HERMES_KB_FACTORY env var).
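The value format of KHORA_HERMES_KB_FACTORY is not spelled out here. The sketch below assumes a "package.module:callable" string, a common convention for this kind of override, and shows how a plugin could fall back to a default factory when the variable is unset. load_factory and resolve_kb_factory are hypothetical helper names, not part of the shipped plugin.

```python
import importlib
import os
from typing import Any, Callable


def load_factory(spec: str) -> Callable[[], Any]:
    """Resolve a 'package.module:callable' string to the callable it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:callable', got {spec!r}")
    factory = getattr(importlib.import_module(module_name), attr)
    if not callable(factory):
        raise TypeError(f"{spec!r} does not name a callable")
    return factory


def resolve_kb_factory(default: Callable[[], Any]) -> Callable[[], Any]:
    """Return the factory named by KHORA_HERMES_KB_FACTORY, else `default`."""
    # Assumption: the env var holds a 'module:callable' spec.
    spec = os.environ.get("KHORA_HERMES_KB_FACTORY", "").strip()
    return load_factory(spec) if spec else default
```

In a real plugin the default would be Khora.shared, matching the behaviour described above. Check the example plugin's source for the format it actually accepts.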

Namespace mapping

Each (agent_identity, user_id) pair maps to a deterministic Khora namespace (UUID5 of hermes:{agent_identity}:{user_id}), so providers for the same agent and user share memory across processes without a shared registry. The stable identity (agent_identity, plus user_id when Hermes supplies one) is the tenancy key: different agents, and different users of the same agent, stay isolated. session_id is deliberately not folded into the namespace, because that would give every conversation its own namespace and defeat cross-session memory. It maps instead to Khora’s first-class session_id column, stamped on every stored document, so kb.forget_session(namespace, derive_session_uuid(session_id)) cleanly drops a whole conversation.
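The determinism and isolation properties follow directly from UUID5. The sketch below illustrates them with the standard uuid module. The parent namespace (uuid.NAMESPACE_URL) and the function name derive_namespace_sketch are assumptions made for illustration, so the UUIDs it produces will not match the adapter's unless the adapter happens to use the same parent namespace.

```python
import uuid

# Assumption: this parent namespace is illustrative only; the adapter's real
# parent namespace is not documented in this section.
PARENT_NAMESPACE = uuid.NAMESPACE_URL


def derive_namespace_sketch(agent_identity: str, user_id: str) -> uuid.UUID:
    """UUID5 over 'hermes:{agent_identity}:{user_id}': same inputs, same UUID."""
    return uuid.uuid5(PARENT_NAMESPACE, f"hermes:{agent_identity}:{user_id}")


a = derive_namespace_sketch("support-bot", "alice")
b = derive_namespace_sketch("support-bot", "alice")  # e.g. computed in another process
c = derive_namespace_sketch("support-bot", "bob")
d = derive_namespace_sketch("sales-bot", "alice")

assert a == b               # deterministic, so no shared registry is needed
assert len({a, c, d}) == 3  # different users and different agents stay isolated
```

Because session_id never enters the hashed string, every conversation between the same agent and user resolves to the same namespace.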

Tools

Hermes registers two LLM-callable tools via provider.get_tool_schemas():

Tool	For	Returns
`memory_search`	“What did Alice say about Phoenix?”, semantic recall	A `<memory-context>` block of top-K chunks + entity hits
`memory_recall`	“What did we discuss last week?”, adds `before` / `after` ISO-8601 bounds	Same, filtered by the window

Both accept query (required), top_k (default 10, cap 50), and min_similarity (default 0.1). An empty result returns "No prior memories found.", an explicit abstention so the model doesn’t confabulate.
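As a rough illustration of these argument rules, the sketch below applies the documented defaults and cap to a raw tool-call payload and renders an empty result as the abstention string. It is not the adapter's code: normalize_args and render are hypothetical names, clamping an oversized top_k (rather than rejecting it) is an assumption, and so is the exact layout of the <memory-context> block.

```python
from typing import Any

ABSTENTION = "No prior memories found."
DEFAULT_TOP_K, MAX_TOP_K = 10, 50
DEFAULT_MIN_SIMILARITY = 0.1


def normalize_args(args: dict[str, Any]) -> dict[str, Any]:
    """Apply the documented defaults and cap to a raw tool-call payload."""
    query = str(args.get("query", "")).strip()
    if not query:
        raise ValueError("query is required")
    # Assumption: an oversized top_k is clamped to the cap rather than rejected.
    top_k = max(1, min(int(args.get("top_k", DEFAULT_TOP_K)), MAX_TOP_K))
    min_similarity = float(args.get("min_similarity", DEFAULT_MIN_SIMILARITY))
    return {"query": query, "top_k": top_k, "min_similarity": min_similarity}


def render(chunks: list[str]) -> str:
    """Wrap hits in a <memory-context> block, or abstain when there are none."""
    if not chunks:
        return ABSTENTION
    body = "\n".join(chunks)
    # Assumption: the real block layout may differ from this one.
    return f"<memory-context>\n{body}\n</memory-context>"
```

Returning a fixed sentence for the empty case gives the model something explicit to read, which is the point of the abstention signal.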

Threading model

This is the only adapter that bridges a sync caller (Hermes drives the provider from one thread per session) onto Khora’s async write path:

One ThreadPoolExecutor(max_workers=1) per provider. Strict FIFO, so ingestion order matches turn order. Async work routes through the shared run_sync bridge.
A TTL-bounded prefetch cache keyed on (namespace, session, query-hash) absorbs the “prefetch every turn” pattern. Concurrent readers wait on the same in-flight future instead of firing duplicate recalls.
Shed-oldest backpressure: at queue_max_size the oldest pending write is cancelled (counter khora.hermes.queue.shed_total).
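The write-side behaviour in the list above (one worker, FIFO order, shed-oldest when full) can be sketched with the standard library alone. This is an illustration under assumptions, not the adapter's implementation: ShedOldestWriter is a hypothetical class, the shed_total attribute merely stands in for the khora.hermes.queue.shed_total counter, and counting the currently running job against queue_max_size is a simplification.

```python
import threading
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class ShedOldestWriter:
    """Single-worker FIFO writer that cancels the oldest pending job when full."""

    def __init__(self, queue_max_size: int = 256) -> None:
        # One worker thread means jobs run strictly in submission order.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._pending: deque[Future] = deque()
        self._lock = threading.Lock()
        self._max = queue_max_size
        self.shed_total = 0  # stand-in for the khora.hermes.queue.shed_total counter

    def submit(self, job: Callable[[], None]) -> Future:
        with self._lock:
            # Drop bookkeeping for jobs that already finished or were cancelled.
            self._pending = deque(f for f in self._pending if not f.done())
            if len(self._pending) >= self._max:
                for fut in self._pending:
                    # cancel() returns False for the job that is already running,
                    # so the oldest job still waiting is the one that gets shed.
                    if fut.cancel():
                        self._pending.remove(fut)
                        self.shed_total += 1
                        break
            new_future = self._executor.submit(job)
            self._pending.append(new_future)
            return new_future

    def close(self) -> None:
        """Run whatever is still queued, then stop the worker."""
        self._executor.shutdown(wait=True)
```

Cancelling the oldest waiting job keeps the newest turns, which are usually the ones the next recall needs most.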

Chat memory is best-effort. A hard crash mid-drain loses whatever is still queued; a clean SIGTERM drains up to drain_timeout_s. prefetch() may return the abstention payload when writes haven’t drained (better an empty context than a stalled turn). Use the tool-call path for guaranteed retrieval. Don’t os.fork() after constructing a provider (fork safety is a follow-up).
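The read side can be sketched the same way: a TTL cache keyed on (namespace, session, query-hash) whose readers share one in-flight future, and a prefetch that falls back to the abstention payload on timeout. PrefetchCache and the recall callable are hypothetical, SHA-256 is an assumed choice of query hash, and the sketch does not model the interaction with undrained writes.

```python
import hashlib
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

ABSTENTION = "No prior memories found."


class PrefetchCache:
    """TTL cache whose concurrent readers share one in-flight recall per key."""

    def __init__(self, recall: Callable[[str], str], ttl_s: float = 30.0,
                 timeout_s: float = 0.8) -> None:
        self._recall = recall
        self._ttl_s, self._timeout_s = ttl_s, timeout_s
        self._entries: dict[tuple[str, str, str], tuple[float, Future]] = {}
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=1)

    def prefetch(self, namespace: str, session: str, query: str) -> str:
        # Assumption: SHA-256 of the query text serves as the query hash.
        key = (namespace, session, hashlib.sha256(query.encode()).hexdigest())
        now = time.monotonic()
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or now - entry[0] > self._ttl_s:
                # Miss or expired: start one recall; later readers reuse its future.
                entry = (now, self._pool.submit(self._recall, query))
                self._entries[key] = entry
        try:
            return entry[1].result(timeout=self._timeout_s)
        except FutureTimeout:
            # Better an empty context than a stalled turn.
            return ABSTENTION
```

A timed-out recall keeps running in the background, so a later prefetch for the same key within the TTL can still pick up its result.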

Key knobs

KhoraMemoryProvider constructor kwargs, with defaults in parentheses: kb (required), prefetch_timeout_s (0.8), prefetch_cache_ttl_s (30.0), queue_max_size (256), drain_timeout_s (5.0), failure_threshold_pct (1.0). Telemetry spans and counters are emitted under khora.integrations.hermes.* and khora.hermes.*; they carry no namespace_id labels, and free text is hashed.

API reference

The stable public surface: construction, remember/recall/forget, and result types.

Integrations overview

The full adapter lineup and the shared registry.
