Core APIs

The examples/10_core_apis/ tier picks up where the Basics leave off. Each example isolates one API surface and shows the knobs that matter, so when you reach for remember_batch, recall(...) filters, or the entity-graph reads, you already know which argument does what.

Most examples here run on the zero-infra embedded backend (pip install "khora[sqlite-lance]" + OPENAI_API_KEY), exactly like the basics. Two are exceptions: Reading entities & relationships back and Exploring the graph default to PostgreSQL + Neo4j, so bring the stack up with make dev from the Khora repo first. Both examples read the entity graph back, and on the embedded backend entity vectors aren’t written to LanceDB yet, so two of their steps would come back empty. Both still accept --config examples/khora.embedded.yaml if you want to run degraded.

Batch ingestion with `remember_batch`

A kb.remember() loop is fine for a handful of records but pays three avoidable taxes at scale: the embedder cache re-warms per call, entity dedup runs per-document with no cross-doc scope, and concurrency is whatever your caller wired up. remember_batch fixes all three: one shared embedder cache, cross-document dedup via EntityIndex, and a max_concurrent ceiling on in-flight LLM calls (≈5–10 on a laptop, higher on the standard stack).

result = await kb.remember_batch(
    docs,                       # list of {"content", "title", "source"} dicts
    namespace=ns_id,
    max_concurrent=5,           # docs in flight against the LLM at once
    on_progress=on_progress,
    entity_types=["CONCEPT", "EVENT"],
    relationship_types=["RELATES_TO"],
)
print(f"{result.processed} processed, {result.skipped} skipped, {result.failed} failed")
print(f"chunks={result.chunks}, entities={result.entities}, relationships={result.relationships}")

Takeaway: the returned BatchResult carries processed / skipped / failed counts plus rolled-up chunks / entities / relationships totals.
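To make the max_concurrent ceiling concrete, here is a minimal sketch of the general pattern using asyncio.Semaphore. It is not Khora's implementation: process_all and fake_llm_call are hypothetical names, and the sleep stands in for one extraction call.

```python
import asyncio


async def process_all(items, worker, max_concurrent=5):
    """Run worker(item) for every item with at most max_concurrent in flight."""
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with gate:  # waits here once max_concurrent workers are running
            return await worker(item)

    # gather() preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    in_flight = 0
    peak = 0

    async def fake_llm_call(doc):  # hypothetical stand-in for one LLM call
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return doc.upper()

    docs = [f"doc-{n}" for n in range(20)]
    results = await process_all(docs, fake_llm_call, max_concurrent=5)
    return results, peak


results, peak = asyncio.run(demo())
print(len(results), "processed; peak in flight:", peak)
```

All 20 documents finish, but the peak never exceeds the ceiling. Raising max_concurrent trades memory and rate-limit pressure for throughput.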

Recall with filters

recall() exposes three arguments that change the result set itself, not just its ordering. Knowing them removes most application-side post-filtering:

limit: cap the response at the engine level

Cheaper than asking for 100 chunks and trimming to 5 in Python.

result = await kb.recall("ingest pipeline work", namespace=ns_id, limit=2)

min_similarity: a real semantic-quality cutoff

Drops chunks whose raw cosine similarity is below the threshold. It operates before score normalization, so it’s a genuine quality gate, unlike thresholding the post-normalization chunk.score.

result = await kb.recall(
    "memory leak in ingestion service",
    namespace=ns_id,
    limit=10,
    min_similarity=0.4,
)
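The difference between a raw-cosine cutoff and a cutoff on a normalized score can be seen in a toy calculation. The sketch below assumes min-max normalization purely for illustration, since the normalization Khora applies is not specified here, and the vectors and chunk names are invented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def min_max(scores):
    """Rescale scores to [0, 1]; assumed normalization, for illustration only."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


query = [1.0, 0.0, 0.0]
chunks = {  # hypothetical embeddings, all poor matches for the query
    "weak-a": [0.3, 1.0, 0.0],
    "weak-b": [0.2, 0.0, 1.0],
    "weak-c": [0.1, 1.0, 1.0],
}

raw = {name: cosine(query, vec) for name, vec in chunks.items()}
normalized = dict(zip(raw, min_max(list(raw.values()))))

kept_raw = [name for name, score in raw.items() if score >= 0.4]
kept_norm = [name for name, score in normalized.items() if score >= 0.4]

print("raw cutoff keeps:", kept_raw)
print("normalized cutoff keeps:", kept_norm)
```

Every raw cosine here is below 0.3, so the raw cutoff rejects all three chunks. After rescaling, the best of a bad set scores 1.0 and passes the same 0.4 threshold, which is why a post-normalization threshold is not a quality gate.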

mode: choose the retrieval channels

SearchMode.{VECTOR, GRAPH, HYBRID, ALL, KEYWORD}. HYBRID (the default) fuses vector + graph; the BM25 keyword channel joins only when enable_bm25_channel is on. KEYWORD is BM25-only lexical match (always runs BM25, regardless of that flag).

from khora.query.engine import SearchMode

result = await kb.recall(query, namespace=ns_id, limit=3, mode=SearchMode.KEYWORD)

Why this matters: VectorCypher fuses the vector and graph channels by default, so mode lets you narrow to a single channel. Add the keyword channel with enable_bm25_channel, or use SearchMode.KEYWORD for pure lexical match.

Ontology config

entity_types and relationship_types are required kwargs on every remember(), but they’re guidance, not a hard schema. The extractor treats your list as a strong hint (it can still emit types outside it). Passing empty lists doesn’t turn extraction off: it removes the guidance entirely, so Khora falls back to unbounded extraction and infers its own taxonomy from the content, with loosely-cased, model-chosen labels (you’ll see Person and EVENT side by side). So pass them deliberately. This example runs the same paragraph through two ontologies and compares the resulting entity-type histograms:

# Generic — blunt, catch-all labels you must disambiguate downstream
entity_types=["PERSON", "ORGANIZATION", "CONCEPT", "LOCATION", "EVENT", "PRODUCT", "TECHNOLOGY"]
relationship_types=["RELATES_TO", "PART_OF", "MENTIONS"]

# Domain-specific — typed the way a recruiter actually thinks
entity_types=["CANDIDATE", "EMPLOYER", "ROLE", "SKILL", "HIRING_MANAGER", "INTERVIEW_LOOP"]
relationship_types=["APPLIED_TO", "WORKED_AT", "HAS_SKILL", "SCHEDULED_FOR", "MANAGED_BY"]

# Inspect what was extracted, either way:
from collections import Counter

entities = await kb.list_entities(namespace=ns_id, limit=50)
histogram = Counter(e.entity_type for e in entities)

Takeaway: with the same model and the same text, the domain-specific run yields an entity-type histogram you can query against. Picking the ontology is the single highest-leverage extraction decision you make. This is the groundwork for the iterative-ontology workflow planned in Tier 40 (open extraction → inspect → refine → re-extract). When you need the system prompt, dedup, and inference rules versioned alongside the types, package them as an ExpertiseConfig. See 06 below and the Expertise & ontologies guide.
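When inspecting an unguided run, loosely-cased labels split what is really one type across several histogram buckets. A small sketch of folding them together before counting follows. The labels are made up for illustration; in practice they would come from e.entity_type on the list_entities results, and type_histogram is a hypothetical helper.

```python
from collections import Counter


def type_histogram(labels):
    """Count entity-type labels, folding case and spacing variants together."""
    return Counter(label.strip().upper().replace(" ", "_") for label in labels)


# Hypothetical labels as an unguided extraction run might emit them.
open_run = ["Person", "PERSON", "person", "EVENT", "Event", "Hiring Manager"]

hist = type_histogram(open_run)
for label, count in hist.most_common():
    print(f"{label:<16}{count}")
```

Folding only cleans up the view. It does not replace passing a deliberate ontology, because the model can still choose different concepts, not just different casing.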

Reading entities & relationships back

VectorCypher builds a graph at write time (one LLM call per remember produces entities + relationships keyed back to the source chunks). This example reads that graph back three ways:

recall = await kb.recall("Marie Curie discoveries and family", namespace=ns_id, limit=5)

recall.chunks          # [1] textual evidence
recall.entities        # [2] extracted nodes, inline on the projection
recall.relationships   # [3] edges between them, scored

# The dedicated reads — these work on BOTH backends:
all_entities = await kb.list_entities(namespace=ns_id, limit=20)
neighbours = await kb.find_related_entities(marie.id, namespace=ns_id, max_depth=1, limit=10)

Takeaway: recall().entities / .relationships give you the graph inline with retrieval. list_entities and find_related_entities are the explicit “show me the graph” reads. On embedded sqlite_lance the inline lists come back empty. The example detects this and falls through to the list/traverse APIs, which is why it defaults to PostgreSQL + Neo4j.
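The fall-through described above could look roughly like the following. This is one possible shape, not the example's actual code: read_graph is a hypothetical helper that uses only the recall and list_entities calls shown earlier, and StubKB is a stand-in that mimics the embedded backend so the sketch runs without any infrastructure.

```python
import asyncio
from types import SimpleNamespace


async def read_graph(kb, query, namespace, limit=20):
    """Prefer entities inline on recall(); fall back to list_entities() if empty."""
    result = await kb.recall(query, namespace=namespace, limit=5)
    if result.entities:
        return "inline", list(result.entities)
    # Inline list is empty (as on the embedded backend): use the dedicated read.
    return "listed", await kb.list_entities(namespace=namespace, limit=limit)


class StubKB:
    """Hypothetical stand-in: recall() returns chunks but no inline entities."""

    async def recall(self, query, namespace, limit):
        return SimpleNamespace(chunks=["chunk text"], entities=[], relationships=[])

    async def list_entities(self, namespace, limit):
        return ["Marie Curie", "Pierre Curie"][:limit]


source, entities = asyncio.run(
    read_graph(StubKB(), "Marie Curie discoveries and family", "demo-ns")
)
print(source, entities)
```

Returning the source alongside the entities lets the caller tell a degraded run from a full one without inspecting the backend config.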

Exploring the graph

The widest tour in the tier runs seven reads against the same Curie-family graph, so you can pick the right call for the task:

1. list_entities(namespace): enumerate everything
2. list_entities(entity_type="PERSON"): filter to one type
3. search_entities(query, namespace): semantic lookup by name/description
4. get_entity(entity_id, namespace): fetch one node by id
5. find_related_entities(entity_id, max_depth=…): walk the edges outward
6. recall(query, mode=SearchMode.GRAPH): graph-channel retrieval
7. An ASCII tree built from (1) + (5)

persons = await kb.list_entities(namespace=ns_id, entity_type="PERSON", limit=50)
hits    = await kb.search_entities("physicist who discovered radium", namespace=ns_id, limit=3)
node    = await kb.get_entity(marie.id, namespace=ns_id)
depth2  = await kb.find_related_entities(marie.id, namespace=ns_id, max_depth=2, limit=40)

Takeaway: list_entities / get_entity / find_related_entities read from the graph store and work on every backend. search_entities and recall(mode=GRAPH) need entity vectors, so they degrade to empty on embedded, hence the PostgreSQL + Neo4j default. max_depth is how you control how far a traversal fans out.
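To see how max_depth bounds fan-out, here is a depth-limited breadth-first walk over a toy in-memory graph, followed by a simple indented tree. It illustrates the traversal idea only: GRAPH is a hypothetical hand-written adjacency map, related is a hypothetical helper, and nothing here reflects how the graph store executes find_related_entities.

```python
from collections import deque

GRAPH = {  # hypothetical toy adjacency, a small subset of a Curie-family graph
    "Marie Curie": ["Pierre Curie", "Radium"],
    "Pierre Curie": ["Irène Joliot-Curie"],
    "Irène Joliot-Curie": ["Frédéric Joliot-Curie"],
    "Radium": [],
    "Frédéric Joliot-Curie": [],
}


def related(start, max_depth):
    """Breadth-first walk returning {node: hops} for nodes within max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand past the depth limit
        for neighbour in GRAPH.get(node, []):
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    del depths[start]  # report neighbours only, not the start node
    return depths


depth1 = related("Marie Curie", 1)
depth2 = related("Marie Curie", 2)
depth3 = related("Marie Curie", 3)

print("Marie Curie")
for name, hops in depth2.items():
    print("  " * hops + "└─ " + name)
```

Each extra hop can multiply the result size on a dense graph, so pairing a larger max_depth with a limit, as the calls above do, keeps the read bounded.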

Ontology as an `ExpertiseConfig`

Example 03 used bare entity_types / relationship_types lists. When you outgrow that, an ExpertiseConfig packages the whole recruiting domain into one reusable, versioned object: a system prompt, typed entities with identifiers (for cross-source dedup), typed relationships, a correlation rule (merge the same candidate seen in two notes), and an inference rule that derives edges the text never stated:

from khora import ExpertiseConfig, EntityTypeConfig, RelationshipTypeConfig
from khora.extraction.skills import CorrelationRule, InferenceRule, InferenceCondition

ontology = ExpertiseConfig(
    name="recruiting",
    version="1.0.0",
    system_prompt=(
        "Extract recruiting info from hiring notes. People who applied are "
        "CANDIDATE, companies are EMPLOYER, open positions are ROLE, technical "
        "skills are SKILL, and the person running the process is HIRING_MANAGER."
    ),
    entity_types=[
        EntityTypeConfig(name="CANDIDATE", identifiers=["email", "name"], aliases=["APPLICANT"]),
        EntityTypeConfig(name="EMPLOYER", identifiers=["name"], aliases=["COMPANY"]),
        EntityTypeConfig(name="ROLE", identifiers=["name"]),
        EntityTypeConfig(name="SKILL", identifiers=["name"]),
        EntityTypeConfig(name="HIRING_MANAGER", identifiers=["name"]),
    ],
    relationship_types=[
        RelationshipTypeConfig(name="APPLIED_TO", source_types=["CANDIDATE"], target_types=["ROLE"]),
        RelationshipTypeConfig(name="ROLE_AT",    source_types=["ROLE"],      target_types=["EMPLOYER"]),
        RelationshipTypeConfig(name="HAS_SKILL",  source_types=["CANDIDATE"], target_types=["SKILL"]),
        RelationshipTypeConfig(name="MANAGED_BY", source_types=["ROLE"],      target_types=["HIRING_MANAGER"]),
        RelationshipTypeConfig(name="TARGETS",    source_types=["CANDIDATE"], target_types=["EMPLOYER"]),
    ],
    # Correlation: the same candidate seen in two notes collapses to one node.
    correlation_rules=[
        CorrelationRule(name="dedupe_candidates", entity_types=["CANDIDATE"],
                        match_fields=["email", "name"], confidence=0.85),
    ],
    # Inference: APPLIED_TO a ROLE that is ROLE_AT an EMPLOYER -> CANDIDATE TARGETS EMPLOYER.
    inference_rules=[
        InferenceRule(
            name="candidate_targets_employer",
            when=[InferenceCondition(relationship="APPLIED_TO", source_type="CANDIDATE", target_type="ROLE"),
                  InferenceCondition(relationship="ROLE_AT",    source_type="ROLE",      target_type="EMPLOYER")],
            then_relationship="TARGETS", then_source="first.source", then_target="second.target",
            confidence=0.6),
    ],
)

await kb.remember(text, namespace=ns_id, expertise=ontology,
                  entity_types=ontology.get_entity_type_names(),        # the kwargs are still required
                  relationship_types=ontology.get_relationship_type_names())

The example feeds a short hiring note through remember(), driven by the ontology object. Entities come back typed the way the ontology declares (this is one run, and LLM output varies). The example also turns event extraction off (VectorCypherConfig(store_events=False)), so no EVENT / PARTICIPATED_IN appear. The ASSOCIATED_WITH edges are co-occurrence links that Khora adds, and they aren’t configurable on VectorCypher. See Ingestion for both knobs:

entity-type histogram (driven by the ExpertiseConfig):
  CANDIDATE        1    Priya Patel
  EMPLOYER         1    Acme Robotics
  ROLE             1    backend engineering
  SKILL            1    Python
  HIRING_MANAGER   1    Sam Chen

relationships: APPLIED_TO · ROLE_AT · HAS_SKILL · MANAGED_BY  (+ ASSOCIATED_WITH co-occurrence)

Takeaway: bare lists are a per-call hint. An ExpertiseConfig is the ontology as a first-class artifact: define the prompt, identifiers (dedup), correlation rules, and inference rules once, version them, and reuse them across every remember(). The APPLIED_TO + ROLE_AT ⇒ TARGETS inference rule is the kind of rule that expansion uses to derive candidate→employer edges that no chunk states. Full guide: Expertise & ontologies.
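The logic of the APPLIED_TO + ROLE_AT ⇒ TARGETS rule is a two-hop join over edges. The sketch below works on plain (source, relationship, target) tuples to show what gets derived. It is an illustration of the rule's semantics under that simplified representation, not Khora's inference engine, and infer_targets is a hypothetical helper.

```python
def infer_targets(edges):
    """Derive (candidate, TARGETS, employer) from APPLIED_TO + ROLE_AT pairs."""
    # Index the second hop: role -> employers it belongs to.
    role_employers = {}
    for source, relationship, target in edges:
        if relationship == "ROLE_AT":
            role_employers.setdefault(source, set()).add(target)

    stated = set(edges)
    derived = []
    for source, relationship, target in edges:
        if relationship != "APPLIED_TO":
            continue
        # first.source is the candidate; second.target is the employer.
        for employer in sorted(role_employers.get(target, ())):
            edge = (source, "TARGETS", employer)
            if edge not in stated and edge not in derived:
                derived.append(edge)
    return derived


edges = [
    ("Priya Patel", "APPLIED_TO", "backend engineering"),
    ("backend engineering", "ROLE_AT", "Acme Robotics"),
    ("Priya Patel", "HAS_SKILL", "Python"),
]

inferred = infer_targets(edges)
print(inferred)
```

The derived edge is absent from the input list, which matches the idea of an inferred relationship carrying a lower confidence (0.6 in the config above) than one the text states directly.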

Next steps

Workloads

End-to-end scenarios that compose these APIs into real applications.

VectorCypher

The engine behind these examples: hybrid vector + graph retrieval, with an opt-in BM25 keyword channel.
