The four operations
| Operation | What it does |
|---|---|
| `remember` | Send content. Deyta Platform chunks it, embeds it, extracts entities, and stores everything. |
| `recall` | Send a query. Get back ranked context — chunks, entities, and a pre-formatted string ready for an LLM prompt. |
| `forget` | Send a document ID. Get back confirmation of whether the document was deleted. |
| `ask` | Send a query. Get back a synthesized answer with cited sources, generated from memories. |
Use `remember` whenever something happens, `recall` when you need raw context, `forget` when a memory is no longer valid and must be removed, and `ask` when you want a finished answer.
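The four-operation lifecycle can be sketched with a toy in-memory store. This is purely illustrative — the real Deyta Platform client, method signatures, and return shapes are not shown in this doc, so everything below is an assumption standing in for the actual API:

```python
# Toy sketch of the remember / recall / forget / ask lifecycle.
# ASSUMPTION: class and method names are illustrative, not the real SDK.

class ToyMemoryStore:
    def __init__(self):
        self._docs = {}      # doc_id -> text
        self._next_id = 0

    def remember(self, text: str) -> str:
        """Store content; return a document ID."""
        doc_id = str(self._next_id)
        self._next_id += 1
        self._docs[doc_id] = text
        return doc_id

    def recall(self, query: str) -> list[str]:
        """Naive keyword match stands in for hybrid search."""
        return [t for t in self._docs.values() if query.lower() in t.lower()]

    def forget(self, doc_id: str) -> bool:
        """Delete a document; report whether it existed."""
        return self._docs.pop(doc_id, None) is not None

    def ask(self, query: str) -> str:
        """A synthesized answer would be generated from recalled memories."""
        hits = self.recall(query)
        return f"Based on {len(hits)} memories: ..." if hits else "No relevant memories."

store = ToyMemoryStore()
doc = store.remember("Alice joined Acme in 2021.")
print(store.recall("acme"))   # -> ['Alice joined Acme in 2021.']
print(store.forget(doc))      # -> True
```

The point of the shape, not the implementation: `remember` writes, `recall` reads raw context, `forget` removes, and `ask` layers synthesis on top of `recall`.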
The five abstractions
Namespace — the tenancy boundary. Every operation runs against one namespace. Use one per tenant, project, or end user, depending on your needs. Memories that need to be queried together must live in the same namespace.

Memory — the unit of content. A memory is a piece of text plus metadata. After ingestion it’s chunked, indexed for both vector and keyword search, and woven into the namespace’s knowledge graph as entities and relationships.

Integration — a connector to a specific source of data. The catalog includes Google products (Drive, Gmail, Calendar), GitHub, Attio, Granola, and more. An admin enables which integrations the organization can use; an end user then authorizes a specific account via OAuth, which becomes a connection. Object stores (S3) and data warehouses (Snowflake) are configured separately, but they are also integrations.

Connection — a live link from a specific data source to a specific namespace. Configure it once and content flows in continuously through the same `remember` pipeline. One integration supports many connections: Google Drive is the integration; your configured flow of data from Google Drive to a namespace is a connection.
Ontology — the extraction schema. Defines which entity and relationship types should be pulled out of content during ingestion.
What’s in a memory
When you call `remember`, Deyta Platform runs your content through a multi-step pipeline before it’s queryable. Each step has a clear job:
Stage
Compute a content checksum and check whether this exact content was ingested before. If it was, the rest of the pipeline is skipped — duplicate ingestion is cheap and idempotent at the content level.
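The dedup check can be sketched as a checksum lookup. The hash function and ID scheme here are assumptions (the doc only says "a content checksum"):

```python
import hashlib

_seen: dict[str, str] = {}  # checksum -> doc_id

def ingest(content: str) -> tuple[str, bool]:
    """Return (doc_id, was_new). Identical content short-circuits the pipeline."""
    # ASSUMPTION: SHA-256 as the checksum; the actual algorithm is unspecified.
    checksum = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if checksum in _seen:
        # Duplicate: skip chunk/embed/extract/store entirely.
        return _seen[checksum], False
    doc_id = f"doc-{len(_seen)}"
    _seen[checksum] = doc_id
    # ... the rest of the pipeline would run here ...
    return doc_id, True
```

Because the check keys on content rather than on request identity, re-sending the same document any number of times resolves to the same ID at negligible cost.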
Chunk
Split the content into coherent passages. The default is semantic chunking, which cuts on natural sentence and paragraph boundaries. If you need custom chunking, you can chunk the document yourself before sending it to Deyta.
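If you do chunk documents yourself, a minimal sentence-boundary chunker looks like this — a greedy sketch, not Deyta's semantic chunker, which also weighs meaning rather than just length:

```python
import re

def chunk(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)   # close the current chunk at a boundary
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Cutting on sentence boundaries keeps each chunk a coherent passage, so its embedding represents one idea rather than two halves of different ones.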
Embed
Each chunk is encoded into a vector that captures its semantic meaning. Vectors for similar concepts point in similar directions, even when the words differ. Entity embeddings are computed as well, for the entities produced in the next step.
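"Point in similar directions" is usually measured with cosine similarity — a standard formulation, assumed here since the doc doesn't name the metric:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Vector search ranks chunks by this score against the query embedding, which is why paraphrases with no shared words can still match.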
Extract
An LLM reads the most informative chunks and pulls out entities (people, organizations, concepts, locations, plus any custom types from your ontology) and the relationships between them. Selective extraction keeps cost bounded — chunks that score low on importance skip the LLM call.
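The cost-bounding gate can be sketched as a cheap importance score deciding which chunks earn an LLM call. The scoring heuristic below (capitalized-token density as a proxy for entity-rich text) is an invented stand-in — the actual scoring method is not described in this doc:

```python
def importance(chunk: str) -> float:
    """Toy importance score: fraction of tokens that look like proper nouns."""
    tokens = chunk.split()
    if not tokens:
        return 0.0
    caps = sum(1 for t in tokens if t[0].isupper())
    return caps / len(tokens)

def chunks_to_extract(chunks: list[str], threshold: float = 0.2) -> list[str]:
    """Only chunks scoring above the threshold pay for an LLM extraction call."""
    return [c for c in chunks if importance(c) >= threshold]
```

Whatever the real scoring function is, the structure is the same: a cheap filter in front of an expensive model keeps extraction cost proportional to informative content, not raw volume.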
Store
The document, chunks, and metadata are persisted alongside a vector index over chunk embeddings and a keyword index for full-text search. Entities and relationships land in the namespace’s knowledge graph; entity embeddings let you find graph nodes by similarity, not just exact name match.
Expand
Entities mentioned across multiple memories with different surface forms — “Microsoft”, “Microsoft Corporation”, “MSFT” — are unified into a single graph node, with all sources merged. New relationships can be inferred from existing edges (if Alice and Bob both work for Acme, they’re colleagues).
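The unification step amounts to mapping surface forms to one canonical node and merging their sources. A static alias table is used below for illustration — the real system resolves aliases via entity embeddings and similarity, per the Store step, not a hand-written lookup:

```python
# ASSUMPTION: a fixed alias table stands in for embedding-based resolution.
ALIASES = {"microsoft corporation": "microsoft", "msft": "microsoft"}

def canonical(name: str) -> str:
    """Map a surface form to its canonical entity key."""
    key = name.strip().lower()
    return ALIASES.get(key, key)

def merge_mentions(mentions: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Group (surface_form, source_doc) pairs under one node per canonical entity."""
    nodes: dict[str, set[str]] = {}
    for name, source in mentions:
        nodes.setdefault(canonical(name), set()).add(source)
    return nodes
```

After merging, a query that touches any surface form reaches every source document, which is what makes the graph more useful than per-document entity lists.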
Each `remember` call yields a single document, broken into chunks indexed for vector and keyword search, plus entities and relationships that join — and may merge with — the namespace’s knowledge graph. Subsequent `remember` calls don’t just append memories; they grow and refine the graph that all of them are queried against.
How retrieval works
`recall` runs a hybrid search across three channels and fuses the results:
| Channel | What it finds |
|---|---|
| Vector | Chunks whose embeddings are similar to the query embedding |
| Graph | Entities and relationships connected to terms in the query |
| Keyword | Chunks whose text matches the literal query terms |
The fused results come back ranked, along with a `context_text` you can drop straight into a prompt.
You can override the default behavior per query with the `mode` parameter (`vector`, `graph`, `hybrid`, `all`).
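The fusion step can be sketched with Reciprocal Rank Fusion, a common way to merge ranked lists from heterogeneous channels. Note this is an assumed technique — the doc doesn't specify how Deyta fuses the three channels:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each ID by summed 1/(k + rank) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

vector_hits  = ["c2", "c1", "c3"]   # ranked chunk IDs from vector search
keyword_hits = ["c1", "c4"]         # ranked chunk IDs from keyword search
graph_hits   = ["c1", "c2"]         # ranked chunk IDs from graph traversal
print(rrf([vector_hits, keyword_hits, graph_hits]))  # "c1" ranks first: all three channels found it
```

The appeal of rank-based fusion is that vector similarities, keyword scores, and graph hops live on incomparable scales; ranks are the one thing the three channels share.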
Where Deyta Platform fits
Deyta Platform is the memory layer of your application. It sits between your raw content (chat logs, docs, support tickets, data warehouse rows) and your LLM.

What’s next
Memories
A closer look at what a memory is and what comes out of ingestion.
Data flow
End-to-end paths for ingestion, retrieval, and synthesized answers.