The four operations
| Operation | What it does |
|---|---|
| `remember` | Send content. Deyta Platform chunks it, embeds it, extracts entities, and stores everything. |
| `recall` | Send a query. Get back ranked context — chunks, entities, and a pre-formatted string ready for an LLM prompt. |
| `forget` | Send a document ID. Get back confirmation of whether the document was deleted. |
| `ask` | Send a query. Get back a synthesized answer with cited sources, generated from memories. |
Use `remember` whenever something happens, `recall` when you need raw context, `forget` when a memory is no longer valid and must be removed, and `ask` when you want a finished answer.
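The four-operation lifecycle can be sketched with a toy in-memory store. This is purely illustrative — the real Deyta Platform client, method signatures, and return shapes are not shown in this doc, so everything below is an assumption standing in for the actual API:

```python
# Toy sketch of the remember / recall / forget / ask lifecycle.
# ASSUMPTION: class and method names are illustrative, not the real SDK.

class ToyMemoryStore:
    def __init__(self):
        self._docs = {}      # doc_id -> text
        self._next_id = 0

    def remember(self, text: str) -> str:
        """Store content; return a document ID."""
        doc_id = str(self._next_id)
        self._next_id += 1
        self._docs[doc_id] = text
        return doc_id

    def recall(self, query: str) -> list[str]:
        """Naive keyword match stands in for hybrid search."""
        return [t for t in self._docs.values() if query.lower() in t.lower()]

    def forget(self, doc_id: str) -> bool:
        """Delete a document; report whether it existed."""
        return self._docs.pop(doc_id, None) is not None

    def ask(self, query: str) -> str:
        """A synthesized answer would be generated from recalled memories."""
        hits = self.recall(query)
        return f"Based on {len(hits)} memories: ..." if hits else "No relevant memories."

store = ToyMemoryStore()
doc = store.remember("Alice joined Acme in 2021.")
print(store.recall("acme"))   # -> ['Alice joined Acme in 2021.']
print(store.forget(doc))      # -> True
```

The point of the shape, not the implementation: `remember` writes, `recall` reads raw context, `forget` removes, and `ask` layers synthesis on top of `recall`.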
The five abstractions
Namespace — the tenancy boundary. Every operation runs against one namespace. Use one per tenant, project, or end user, depending on your needs. Memories that need to be queried together must live in the same namespace.

Memory — the unit of content. A memory is a piece of text plus metadata. After ingestion it’s chunked, indexed for both vector and keyword search, and woven into the namespace’s knowledge graph as entities and relationships.

Integration — a connector to a specific source of data. The catalog includes Google products (Drive, Gmail, Calendar), GitHub, Attio, Granola, and more. An admin enables which integrations the organization can use; an end user then authorizes a specific account via OAuth, which becomes a connection. Object stores (S3) and data warehouses (Snowflake) are configured separately, but they are also integrations.

Connection — a live link from a specific data source to a specific namespace. Configure it once and content flows in continuously through the same `remember` pipeline. One integration supports many connections: Google Drive is the integration; your configured flow of data from Google Drive to a namespace is a connection.
Ontology — the extraction schema. Defines which entity and relationship types should be pulled out of content during ingestion.
What’s in a memory
When you call `remember`, Deyta Platform runs your content through a multi-step pipeline before it’s queryable. Each step has a clear job:
Stage
Compute a content checksum and check whether this exact content was ingested before. If it was, the rest of the pipeline is skipped — duplicate ingestion is cheap and idempotent at the content level.
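The dedup check can be sketched as a checksum lookup. The hash function and ID scheme here are assumptions (the doc only says "a content checksum"):

```python
import hashlib

_seen: dict[str, str] = {}  # checksum -> doc_id

def ingest(content: str) -> tuple[str, bool]:
    """Return (doc_id, was_new). Identical content short-circuits the pipeline."""
    # ASSUMPTION: SHA-256 as the checksum; the actual algorithm is unspecified.
    checksum = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if checksum in _seen:
        # Duplicate: skip chunk/embed/extract/store entirely.
        return _seen[checksum], False
    doc_id = f"doc-{len(_seen)}"
    _seen[checksum] = doc_id
    # ... the rest of the pipeline would run here ...
    return doc_id, True
```

Because the check keys on content rather than on request identity, re-sending the same document any number of times resolves to the same ID at negligible cost.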
Chunk
Split the content into coherent passages. The default is semantic chunking, which cuts on natural sentence and paragraph boundaries. If you need custom chunking, you can chunk the document yourself before sending it to Deyta.
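If you do chunk documents yourself, a minimal sentence-boundary chunker looks like this — a greedy sketch, not Deyta's semantic chunker, which also weighs meaning rather than just length:

```python
import re

def chunk(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)   # close the current chunk at a boundary
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Cutting on sentence boundaries keeps each chunk a coherent passage, so its embedding represents one idea rather than two halves of different ones.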
Embed
Each chunk is encoded into a vector that captures its semantic meaning. Vectors for similar concepts point in similar directions, even when the words differ. Entity embeddings are computed as well, for the entities produced in the next step.
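"Point in similar directions" is usually measured with cosine similarity — a standard formulation, assumed here since the doc doesn't name the metric:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Vector search ranks chunks by this score against the query embedding, which is why paraphrases with no shared words can still match.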
Extract
An LLM reads the most informative chunks and pulls out entities (people, organizations, concepts, locations, plus any custom types from your ontology) and the relationships between them. Selective extraction keeps cost bounded — chunks that score low on importance skip the LLM call.
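The cost-bounding gate can be sketched as a cheap importance score deciding which chunks earn an LLM call. The scoring heuristic below (capitalized-token density as a proxy for entity-rich text) is an invented stand-in — the actual scoring method is not described in this doc:

```python
def importance(chunk: str) -> float:
    """Toy importance score: fraction of tokens that look like proper nouns."""
    tokens = chunk.split()
    if not tokens:
        return 0.0
    caps = sum(1 for t in tokens if t[0].isupper())
    return caps / len(tokens)

def chunks_to_extract(chunks: list[str], threshold: float = 0.2) -> list[str]:
    """Only chunks scoring above the threshold pay for an LLM extraction call."""
    return [c for c in chunks if importance(c) >= threshold]
```

Whatever the real scoring function is, the structure is the same: a cheap filter in front of an expensive model keeps extraction cost proportional to informative content, not raw volume.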
Store
The document, chunks, and metadata are persisted alongside a vector index over chunk embeddings and a keyword index for full-text search. Entities and relationships land in the namespace’s knowledge graph; entity embeddings let you find graph nodes by similarity, not just exact name match.
Expand
Entities mentioned across multiple memories with different surface forms — “Microsoft”, “Microsoft Corporation”, “MSFT” — are unified into a single graph node, with all sources merged. New relationships can be inferred from existing edges (if Alice and Bob both work for Acme, they’re colleagues).
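The unification step amounts to mapping surface forms to one canonical node and merging their sources. A static alias table is used below for illustration — the real system resolves aliases via entity embeddings and similarity, per the Store step, not a hand-written lookup:

```python
# ASSUMPTION: a fixed alias table stands in for embedding-based resolution.
ALIASES = {"microsoft corporation": "microsoft", "msft": "microsoft"}

def canonical(name: str) -> str:
    """Map a surface form to its canonical entity key."""
    key = name.strip().lower()
    return ALIASES.get(key, key)

def merge_mentions(mentions: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Group (surface_form, source_doc) pairs under one node per canonical entity."""
    nodes: dict[str, set[str]] = {}
    for name, source in mentions:
        nodes.setdefault(canonical(name), set()).add(source)
    return nodes
```

After merging, a query that touches any surface form reaches every source document, which is what makes the graph more useful than per-document entity lists.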
Each `remember` call yields a single document, broken into chunks indexed for vector and keyword search, plus entities and relationships that join — and may merge with — the namespace’s knowledge graph. Subsequent `remember` calls don’t just append memories; they grow and refine the graph that all of them are queried against.
How retrieval works
`recall` runs a hybrid search across three channels and fuses the results:
| Channel | What it finds |
|---|---|
| Vector | Chunks whose embeddings are similar to the query embedding |
| Graph | Entities and relationships connected to terms in the query |
| Keyword | Chunks whose text matches the literal query terms |
The fused results come back ranked, along with a `context_text` you can drop straight into a prompt.
You can override the default behavior per query with the `mode` parameter (`vector`, `graph`, `hybrid`, `all`).
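The fusion step can be sketched with Reciprocal Rank Fusion, a common way to merge ranked lists from heterogeneous channels. Note this is an assumed technique — the doc doesn't specify how Deyta fuses the three channels:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each ID by summed 1/(k + rank) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

vector_hits  = ["c2", "c1", "c3"]   # ranked chunk IDs from vector search
keyword_hits = ["c1", "c4"]         # ranked chunk IDs from keyword search
graph_hits   = ["c1", "c2"]         # ranked chunk IDs from graph traversal
print(rrf([vector_hits, keyword_hits, graph_hits]))  # "c1" ranks first: all three channels found it
```

The appeal of rank-based fusion is that vector similarities, keyword scores, and graph hops live on incomparable scales; ranks are the one thing the three channels share.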
Where Deyta Platform fits
Deyta Platform is the memory layer of your application. It sits between your raw content (chat logs, docs, support tickets, data warehouse rows) and your LLM.

What’s next
Memories
A closer look at what a memory is and what comes out of ingestion.
Data flow
End-to-end paths for ingestion, retrieval, and synthesized answers.