Ingestion path
When you call remember:
Gateway receives the request
The gateway authenticates your API key, resolves the namespace (by namespace_id or external_reference_id), and writes an audit log entry.
Three-phase pipeline runs
The pipeline chunks the content, embeds each chunk into a vector, and runs an LLM extractor over the chunks to pull out entities and relationships.
Storage is updated atomically
The document, chunks, entities, and relationships are persisted together. Vectors land in the vector index; entities and relationships land in the knowledge graph; full-text indexes land in the keyword store.
remember doesn’t return until everything is durable. For high-volume ingestion, prefer the SDK’s batch helpers or a background worker.
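The three phases can be sketched as plain functions. This is a toy illustration, not the platform's implementation: the chunker, embedder, and entity extractor below are stand-ins (fixed-size splitting, a fake two-dimensional vector, and a capitalized-word regex in place of the real LLM extractor).

```python
import re

def chunk(text: str, size: int = 200) -> list[str]:
    """Phase 1: split content into fixed-size chunks (real chunkers are smarter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    """Phase 2: stand-in embedder; the platform uses a real embedding model."""
    return [len(chunk) / 100.0, chunk.count(" ") / 10.0]

def extract(chunk: str) -> list[str]:
    """Phase 3: stand-in extractor; the platform runs an LLM over each chunk."""
    return re.findall(r"\b[A-Z][a-z]+\b", chunk)

def ingest(doc: str) -> dict:
    """Run all three phases, then persist the results together."""
    chunks = chunk(doc)
    return {
        "chunks": chunks,
        "vectors": [embed(c) for c in chunks],
        "entities": sorted({e for c in chunks for e in extract(c)}),
    }

record = ingest("Ada met Grace in London.")
```

The point of the shape is that one document fans out into parallel artifacts (chunks, vectors, entities) that are then written as a unit, which is what makes the atomic-storage step above possible.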
Retrieval path
When you call recall:
Query understanding
The query is optionally rewritten via HyDE (Hypothetical Document Embeddings) and analyzed for temporal hints. Pass start_time/end_time explicitly to skip the natural-language detection.
Hybrid search runs across three channels
Vector similarity, graph traversal, and keyword matching execute in parallel. Each returns a ranked list of candidates.
Fusion + reranking
Reciprocal Rank Fusion merges the three lists into one. A cross-encoder reranks the top candidates for relevance.
mode="vector" skips graph and keyword. raw=true (in the SDK) skips query understanding and reranking.
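Reciprocal Rank Fusion is a standard merging technique: each candidate scores 1 / (k + rank) in every list it appears in, and the sums are sorted. A minimal version, using the conventional k = 60 (the platform's exact constant and any per-channel weighting are not specified here):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: score(d) = sum over lists of 1 / (k + rank_of_d)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# One ranked candidate list per channel, as in the retrieval path above.
vector_hits  = ["m1", "m2", "m3"]
graph_hits   = ["m2", "m4"]
keyword_hits = ["m2", "m1"]
fused = rrf([vector_hits, graph_hits, keyword_hits])
```

Here m2 wins because it appears near the top of all three lists, even though it is not first in the vector channel; that resilience to any single channel's noise is why fusion precedes the cross-encoder rerank.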
Synthesis path (ask)
ask is recall plus an LLM synthesis step:
Recall internally
The same hybrid retrieval runs, gated by the config parameters (min_recall_limit, max_recall_limit, total_tokens_limit).
LLM synthesizes an answer
The retrieved chunks are passed to an LLM with the user query. The model produces a written answer and a list of citations pointing back to the source memories.
ask costs more (LLM tokens for synthesis) and is slower than recall. Use it when you want the finished answer; use recall when you want raw context for your own prompt.
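One plausible reading of the three gating parameters can be sketched as a selection loop over the ranked chunks. This is an assumption about their interaction (the exact semantics and tokenizer are not specified here), and whitespace word count stands in for real token counting:

```python
def gate(chunks: list[str], min_recall_limit: int, max_recall_limit: int,
         total_tokens_limit: int) -> list[str]:
    """Keep top-ranked chunks until the count cap or token budget is hit,
    but never return fewer than min_recall_limit chunks."""
    selected: list[str] = []
    spent = 0
    for c in chunks:
        tokens = len(c.split())  # stand-in for a real tokenizer
        if len(selected) >= max_recall_limit:
            break
        if spent + tokens > total_tokens_limit and len(selected) >= min_recall_limit:
            break
        selected.append(c)
        spent += tokens
    return selected

ctx = gate(["a b c", "d e", "f g h i"],
           min_recall_limit=1, max_recall_limit=3, total_tokens_limit=5)
```

Under this reading, the third chunk is dropped because adding its four tokens would exceed the five-token budget once the minimum count is already satisfied.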
Forget path
When you call forget:
Document and derived state are removed atomically
The document, all its chunks, and any entity mentions or relationships sourced from this document are deleted in one transaction. Entities that were also mentioned by other memories survive — only the link to this document is severed.
Indexes are updated
The vector, graph, and keyword indexes drop the removed chunks. Future recall and ask calls will not return them.
forget is destructive and irreversible — there is no soft-delete. Take an export first if you might need the content later.
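The cascade and the entity-survival rule can be pictured with a toy in-memory store. The real store is transactional; this sketch only shows which records go and which stay:

```python
def forget(store: dict, doc_id: str) -> None:
    """Delete a document, its chunks, and its mentions in one pass.
    Entities still mentioned by other documents survive; orphans are pruned."""
    store["docs"].pop(doc_id, None)
    store["chunks"] = {cid: c for cid, c in store["chunks"].items()
                       if c["doc"] != doc_id}
    for entity, mentions in list(store["entities"].items()):
        mentions.discard(doc_id)   # sever only the link to this document
        if not mentions:           # no other memory mentions it
            del store["entities"][entity]

store = {
    "docs": {"d1": "text1", "d2": "text2"},
    "chunks": {"c1": {"doc": "d1"}, "c2": {"doc": "d2"}},
    "entities": {"Ada": {"d1", "d2"}, "Babbage": {"d1"}},
}
forget(store, "d1")
```

After the call, Ada survives (d2 still mentions her) while Babbage, mentioned only by the deleted document, is pruned along with d1's chunk.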
Continuous ingestion via integrations
Memories don’t have to be sent one at a time. Connect a data source via the Integrations flow and content flows in continuously:
Start a connect session
Your application calls POST /integrations/connections/start with a provider key (e.g., google_drive). Deyta Platform returns a session token and OAuth redirect URL.
User authorizes via OAuth
Use the @nangohq/frontend SDK to walk the user through provider authorization. The provider returns a token to your callback.
Complete the session
Call POST /integrations/connections/complete with the OAuth callback values. The connection is now live.
Tenancy isolation
Every step in every path is scoped by namespace. Vectors, graph nodes, and keyword indexes for one namespace are stored separately from another. There is no cross-namespace retrieval — even within the same organization, a query against ns_a cannot return chunks from ns_b.
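The guarantee amounts to partition-per-namespace: a query only ever touches its own partition. A toy sketch (the class and its substring-match recall are illustrative, not the platform's API; ns_a and ns_b are the namespaces from the text):

```python
class NamespacedStore:
    """Every read and write is keyed by namespace; no cross-namespace path exists."""
    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}

    def remember(self, namespace: str, memory: str) -> None:
        self._data.setdefault(namespace, []).append(memory)

    def recall(self, namespace: str, query: str) -> list[str]:
        # Only this namespace's partition is ever searched.
        return [m for m in self._data.get(namespace, []) if query in m]

store = NamespacedStore()
store.remember("ns_a", "alpha launch notes")
store.remember("ns_b", "beta launch notes")
```

A recall against ns_a for "launch" returns only the alpha note; there is no code path by which the beta note could leak across, which is the shape of the isolation claim above.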
What’s next
Memories
What a memory is and what’s inside one after ingestion.
API Reference
The exact request and response shapes for every endpoint described here.