Deyta Platform gives you two ways to read from a namespace:
  • recall: ranked chunks and entities, raw context for your own prompt.
  • ask: a synthesized answer with cited sources, generated from your memories.
Use recall when your application is responsible for the final prompt or rendering. Use ask when you want a finished answer.

Recall

const result = await deyta.memory.recall({
  namespace_id: ns.id,
  query: "what do we know about the project?",
  limit: 10,
});

// Drop straight into a prompt
const prompt = `Context:\n${result.context_text}\n\nQuestion: …`;

// Or inspect raw matches
for (const match of result.results) {
  console.log(match.score, match.chunk.content);
}
The response includes:
  • A pre-formatted context_text ready to paste into an LLM prompt
  • A list of ranked chunks with similarity scores
  • Related entities and relationships from the knowledge graph
  • The original query and namespace_id
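To make the shape concrete, here is a sketch of the response described above. The field names beyond `context_text`, `results`, `score`, `chunk.content`, `query`, and `namespace_id` are assumptions for illustration, not the documented schema.

```typescript
// Hypothetical shape of a recall response, inferred from the fields
// described above. `entities` is an assumed name for the graph results.
interface RecallResult {
  context_text: string;                                      // pre-formatted prompt context
  results: { score: number; chunk: { content: string } }[];  // ranked chunks
  entities: { name: string; type: string }[];                // related graph entities
  query: string;
  namespace_id: string;
}

// A sample response, used to show how the pieces compose into a prompt.
const result: RecallResult = {
  context_text: "The project ships in Q3.",
  results: [{ score: 0.91, chunk: { content: "The project ships in Q3." } }],
  entities: [{ name: "Project Atlas", type: "project" }],
  query: "what do we know about the project?",
  namespace_id: "ns_123",
};

const prompt = `Context:\n${result.context_text}\n\nQuestion: When do we ship?`;
```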

Search modes

mode controls which retrieval channels run.
await deyta.memory.recall({
  namespace_id: ns.id,
  query: "deployment incidents in Q1",
  mode: "hybrid",
});
  • vector: vector similarity only. Fastest; good for short, semantic queries.
  • graph: graph traversal only. Use when the query references named entities and you want connected information.
  • hybrid (default): vector + graph + keyword, fused via Reciprocal Rank Fusion. Best general-purpose; use this unless you have a reason not to.
  • all: every available channel. Slower and returns more context; use when you want maximum coverage.
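For intuition, Reciprocal Rank Fusion combines the ranked lists from each channel without comparing their raw scores: a document earns 1 / (k + rank) from every list it appears in, and the sums are re-sorted. This is a generic sketch of the technique (k = 60 is the conventional default), not Deyta Platform's internal implementation.

```typescript
// Reciprocal Rank Fusion: each document scores 1 / (k + rank) in every
// ranked list it appears in; per-list scores are summed, then re-sorted.
function rrf(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" wins: it is ranked in the top two by both channels.
const fused = rrf([
  ["a", "b", "c"], // e.g. vector channel
  ["b", "c", "a"], // e.g. keyword channel
]);
```

Because only ranks matter, channels with incomparable scoring scales (cosine similarity vs. BM25) can be fused without normalization.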

Tuning recall

  • limit: maximum number of chunks to return. Default 10.
  • mode: which retrieval channels run. See above.
For finer control over fusion weights, recency bias, HyDE expansion, and reranking, configure the org-level defaults in the console rather than per-query.

Temporal queries

Pass explicit time bounds to scope retrieval by event time. This bypasses the natural-language temporal detection that would otherwise parse phrases like “last week” out of the query text.
const result = await deyta.memory.recall({
  namespace_id: ns.id,
  query: "deployment incidents",
  from: new Date("2026-01-01T00:00:00Z"),
  until: new Date("2026-03-31T23:59:59Z"),
});
The SDK accepts either a Date or an ISO-8601 string and serializes appropriately.
You can pass either or both bounds:
  • Both set: closed window. Only memories with event time in [from, until].
  • from only: half-open forward. Everything from that timestamp onward.
  • until only: half-open backward. Everything up to that timestamp.
  • Neither: no temporal filter; natural-language detection on the query text still applies.
Passing an explicit time bound disables natural-language temporal detection, so phrases like "yesterday" in the query string are ignored. For any given query, rely on explicit bounds or natural-language phrasing, not both.
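The Date-or-string flexibility mentioned above could be implemented with a small normalizer. This is a sketch of the idea under the assumption that the wire format is a canonical ISO-8601 UTC string; it is not Deyta Platform's actual serialization code.

```typescript
// Accept either a Date or an ISO-8601 string and emit a canonical
// ISO-8601 UTC string (assumed wire representation).
function toIsoTimestamp(value: Date | string): string {
  const date = value instanceof Date ? value : new Date(value);
  if (Number.isNaN(date.getTime())) {
    throw new TypeError(`Invalid timestamp: ${String(value)}`);
  }
  return date.toISOString();
}

// Both inputs normalize to the same wire value.
const a = toIsoTimestamp(new Date("2026-01-01T00:00:00Z"));
const b = toIsoTimestamp("2026-01-01T00:00:00Z");
```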

Ask

ask runs the same hybrid retrieval as recall, then passes the results to an LLM that produces a written answer with cited sources.
const answer = await deyta.memory.ask({
  namespace_id: ns.id,
  query: "What are the key project milestones?",
});

console.log(answer.answer);
console.log(answer.sources);
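A typical chat surface renders the answer followed by its citations. The item shape inside `sources` is not documented here, so the `id` and `snippet` fields below are assumptions for illustration only.

```typescript
// Hypothetical shape of an ask response; only `answer` and `sources`
// are documented above, and the per-source fields are assumed.
interface AskAnswer {
  answer: string;
  sources: { id: string; snippet: string }[];
}

const answer: AskAnswer = {
  answer: "The key milestones are the beta in May and GA in September.",
  sources: [
    { id: "mem_1", snippet: "Beta launch scheduled for May." },
    { id: "mem_2", snippet: "GA targeted for September." },
  ],
};

// Render the answer followed by a numbered citation list.
const rendered = [
  answer.answer,
  ...answer.sources.map((s, i) => `[${i + 1}] ${s.snippet} (${s.id})`),
].join("\n");
```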

Tuning synthesis

await deyta.memory.ask({
  namespace_id: ns.id,
  query: "summarize the latest deployment incidents",
  config: {
    min_recall_limit: 3,
    max_recall_limit: 20,
    total_tokens_limit: 4000,
    enabled_tools: ["memory_recall", "entity_search"],
  },
  from: new Date("2026-04-01T00:00:00Z"),
  until: new Date("2026-04-30T23:59:59Z"),
});
  • min_recall_limit: minimum number of memories to recall before synthesizing.
  • max_recall_limit: maximum number of memories to recall.
  • total_tokens_limit: maximum total tokens to use for synthesis.
  • enabled_tools: which retrieval tools the synthesizer is allowed to use. Common values: memory_recall, entity_search, entity_explore, web_search.
from / until work the same way as on recall — they cap the time window for any internal recall the synthesizer issues.

Recall vs ask: when to use which

Use recall when

  • You control the final prompt.
  • You want the cheapest, fastest read.
  • You’re combining Deyta Platform context with other context.
  • You need raw access to chunks for inspection or display.

Use ask when

  • You want a finished answer.
  • You’re building a chat or Q&A surface where the user expects narrative output.
  • You’re OK with the extra LLM cost and latency.

What’s next

Managing memories

Forget, audit, and inspect what’s in a namespace.

API Reference

Full request and response schemas for recall and ask.