OriginChain docs

5. Agent memory (row + vector + FTS + metadata filter)

what this does

Save one piece of agent memory three ways - the structured row in agent.memories, an embedding for "what did this agent learn that's similar to the current query", and a keyword index over the text. The vector put carries agent_id and session_id as metadata, so similarity search can be filtered to a single agent and session at retrieval time.

when to use it
  • ChatGPT-style long-term memory: before each turn, retrieve the top-k most similar past memories for the current agent and session.
  • Autonomous agents that need to recall prior tool calls, plans, or facts.
  • Any system where the same text store is queried by both vector similarity (for recall) and keywords (for exact-phrase grep).
the schema

The composite index covers the most common read shape - "newest memories for this agent in this session".

# agent/memories.toml
namespace   = "agent"
table       = "memories"
primary_key = ["id"]

[[columns]]
name = "id"
ty   = "str"
required = true

[[columns]]
name = "agent_id"
ty   = "str"
required = true

[[columns]]
name = "session_id"
ty   = "str"
required = true

[[columns]]
name = "text"
ty   = "str"
required = true

[[columns]]
name = "ts_ms"
ty   = "u64"
required = true

# Composite index supports "all memories for this agent in this session,
# newest first" without a full scan.
[[indexes]]
name    = "by_agent_session_time"
columns = ["agent_id", "session_id", "ts_ms"]
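
To see why the column order matters, here is a minimal Python sketch of how an index of this shape can answer the read without a full scan. It models the index as a sorted list of tuples, which is an illustration only and not a description of OriginChain's storage engine. newest_memories and the extra sample ids are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Toy model of by_agent_session_time: entries sorted by
# (agent_id, session_id, ts_ms), with the primary key carried along.
# Sample data is made up for illustration.
INDEX = sorted([
    ("agt-sales-east", "thr-77f2", 1749500120000, "mem-77f2-001"),
    ("agt-sales-east", "thr-77f2", 1749500180000, "mem-77f2-002"),
    ("agt-sales-east", "thr-9a10", 1749500150000, "mem-9a10-001"),
    ("agt-support-1", "thr-77f2", 1749500130000, "mem-x-001"),
])

def newest_memories(index, agent_id, session_id, limit=10):
    """Return ids for one agent + session, newest first, via a range scan."""
    # Entries sharing the (agent_id, session_id) prefix are contiguous,
    # so two binary searches bound the range; other entries are never read.
    lo = bisect_left(index, (agent_id, session_id))
    hi = bisect_right(index, (agent_id, session_id, float("inf")))
    # Within the range, entries are in ascending ts_ms; reverse for newest first.
    return [entry[3] for entry in reversed(index[lo:hi])][:limit]

recent = newest_memories(INDEX, "agt-sales-east", "thr-77f2")
missing = newest_memories(INDEX, "agt-sales-east", "thr-none")
```

Because agent_id and session_id lead the index, the matching entries sit next to each other and ts_ms orders them inside that range. An index that led with ts_ms would interleave every agent's memories and lose that property.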
call 1 of 3 - the memory row
POST /v1/tenants/:t/rows/agent.memories
curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/rows/agent.memories" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id":         "mem-77f2-001",
    "agent_id":   "agt-sales-east",
    "session_id": "thr-77f2",
    "text":       "Customer prefers contact between 09:00 and 11:00 IST.",
    "ts_ms":      1749500120000
  }'
call 2 of 3 - the embedding with metadata

agent_id and session_id go in metadata, not just in the row. Without them on the vector, you can't restrict /vector/topk to a single agent at search time, and one agent's memories leak into another's recall.

POST /v1/tenants/:t/vector/agent.memories/put
curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/vector/agent.memories/put" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id":        "mem-77f2-001",
    "embedding": [0.0144, -0.0681, 0.0398, /* ... 768 floats ... */],
    "dim":       768,
    "metric":    "cosine",
    "metadata": {
      "agent_id":   "agt-sales-east",
      "session_id": "thr-77f2"
    }
  }'
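
The effect of the metadata filter can be sketched locally. The Python below does not call /vector/topk (its request shape is not shown on this page); it ranks a toy in-memory store by cosine similarity, with and without a filter. The topk helper, the `where` argument, the 3-dimensional vectors, and the second memory are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def topk(vectors, query, k, where):
    """Rank stored vectors by similarity, keeping only metadata matches."""
    candidates = [
        v for v in vectors
        if all(v["metadata"].get(key) == val for key, val in where.items())
    ]
    candidates.sort(key=lambda v: cosine(v["embedding"], query), reverse=True)
    return [v["id"] for v in candidates[:k]]

# Toy store: two agents, tiny 3-dimensional embeddings (real ones are 768 floats).
STORE = [
    {"id": "mem-77f2-001", "embedding": [0.9, 0.1, 0.0],
     "metadata": {"agent_id": "agt-sales-east", "session_id": "thr-77f2"}},
    {"id": "mem-other-001", "embedding": [1.0, 0.0, 0.0],
     "metadata": {"agent_id": "agt-support-1", "session_id": "thr-0001"}},
]

query = [1.0, 0.0, 0.0]
unfiltered = topk(STORE, query, 1, {})
filtered = topk(STORE, query, 1,
                {"agent_id": "agt-sales-east", "session_id": "thr-77f2"})
```

Without the filter, the best match belongs to the other agent. With it, only the sales agent's own memory is eligible, which is only possible because the metadata was stored on the vector.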
call 3 of 3 - the keyword index
POST /v1/tenants/:t/fts/agent.memories/index
curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/fts/agent.memories/index" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "field":  "text",
    "doc_id": "mem-77f2-001",
    "text":   "Customer prefers contact between 09:00 and 11:00 IST."
  }'
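
All three request bodies derive from the same memory record, so it can help to build them in one place and keep the ids in step. A minimal Python sketch follows; build_memory_writes is a hypothetical helper, the tenant name "acme" is made up, and the 3-float embedding stands in for a real 768-float one. Nothing is sent: pass each (path, body) pair to whatever HTTP client you use.

```python
def build_memory_writes(memory, embedding, tenant):
    """Return (path, body) pairs for the row, vector and FTS calls."""
    base = f"/v1/tenants/{tenant}"
    row = (f"{base}/rows/agent.memories", dict(memory))
    vector = (f"{base}/vector/agent.memories/put", {
        "id": memory["id"],
        "embedding": list(embedding),
        "dim": len(embedding),
        "metric": "cosine",
        # Copied from the row so topk can be filtered per agent and session.
        "metadata": {
            "agent_id": memory["agent_id"],
            "session_id": memory["session_id"],
        },
    })
    fts = (f"{base}/fts/agent.memories/index", {
        "field": "text",
        "doc_id": memory["id"],
        "text": memory["text"],
    })
    return [row, vector, fts]

memory = {
    "id": "mem-77f2-001",
    "agent_id": "agt-sales-east",
    "session_id": "thr-77f2",
    "text": "Customer prefers contact between 09:00 and 11:00 IST.",
    "ts_ms": 1749500120000,
}
# Truncated embedding for illustration only.
writes = build_memory_writes(memory, [0.0144, -0.0681, 0.0398], tenant="acme")
```

Deriving the vector id, the FTS doc_id, and the metadata from the row keeps the three stores pointing at the same memory.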
about atomicity

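The three calls are separate; there is no single "write everything" endpoint. Each call is atomic on its own, but the set of three is not, so a failure partway through leaves the memory partially written (for example, a row with no vector). The SDKs auto-attach an Idempotency-Key on every mutating call, so if the vector put fails after the row succeeded, retry just the vector put. Re-doing the row write is also safe: it would not duplicate the memory.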
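
The retry rule can be sketched in Python as follows. This is not SDK code: write_memory and the injected send callable are hypothetical, and it assumes that a caller who is not using an SDK supplies the idempotency key itself and reuses the same key when retrying a call. The demo uses a fake sender that fails the vector put once.

```python
import uuid

def write_memory(steps, send, max_attempts=3):
    """Run each step in order, retrying only the step that failed.

    steps: list of (name, path, body) tuples.
    send:  callable(path, body, idempotency_key) that raises on failure.
    """
    done = []
    for name, path, body in steps:
        key = str(uuid.uuid4())  # one key per logical write, reused on retry
        for attempt in range(1, max_attempts + 1):
            try:
                send(path, body, key)
                done.append(name)
                break
            except Exception as exc:  # broad on purpose for the sketch
                if attempt == max_attempts:
                    # Report which calls landed so the caller can resume.
                    raise RuntimeError(f"{name} failed; completed: {done}") from exc
    return done

# Demo: a fake sender whose first vector put fails.
calls = []
state = {"vector_failed": False}

def fake_send(path, body, key):
    calls.append((path, key))
    if "vector" in path and not state["vector_failed"]:
        state["vector_failed"] = True
        raise ConnectionError("simulated network error")

steps = [
    ("row", "/rows/agent.memories", {}),
    ("vector", "/vector/agent.memories/put", {}),
    ("fts", "/fts/agent.memories/index", {}),
]
completed = write_memory(steps, fake_send)
```

The row write is sent once, the vector put twice with the same key, and the FTS call once. Calls that already succeeded are never repeated.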

common mistakes
  • Not partitioning by agent_id. If the vector put doesn't carry the agent metadata, every topk ranks across every agent's memories. One sales agent ends up "recalling" what a support agent learned about a different customer.
  • Storing huge memory blocks as one row. Memories should be sentence- or paragraph-sized. A whole conversation in one row gives mush at retrieval time. Split before writing.
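  • Embedding without normalizing first, or mixing models. If you compare a fresh embedding (model A, length-normalized) to a stored one (model B, raw), the scores are meaningless: vectors from different models do not share a space, and inconsistent normalization skews any comparison that is not scale-invariant. Pin one model + one normalization step at write and read time.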
  • Never expiring old memories. Agent memory tables grow without bound. Build a sweeper that deletes by ts_ms below a threshold, and remember to DELETE the vector and FTS doc too - row deletion does not fan out.
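
The sweeper from the last point can be sketched in Python as below. The delete endpoints are not shown on this page, so the three delete callables are hypothetical stand-ins for whatever row, vector, and FTS delete calls you use. The demo runs against in-memory stores.

```python
def sweep_expired(rows, cutoff_ms, delete_row, delete_vector, delete_fts_doc):
    """Delete every memory older than cutoff_ms from all three stores."""
    expired = [r["id"] for r in rows if r["ts_ms"] < cutoff_ms]
    for mem_id in expired:
        # Row deletion does not fan out, so each store is cleaned explicitly.
        # The row goes last: if the sweep dies midway, the row is still
        # there for the next run to find and finish.
        delete_vector(mem_id)
        delete_fts_doc(mem_id)
        delete_row(mem_id)
    return expired

# Demo with in-memory stand-ins for the three stores (made-up ids).
rows = {
    "mem-old": {"id": "mem-old", "ts_ms": 1000},
    "mem-new": {"id": "mem-new", "ts_ms": 5000},
}
vectors = {"mem-old", "mem-new"}
fts_docs = {"mem-old", "mem-new"}

removed = sweep_expired(
    list(rows.values()), 3000,
    delete_row=rows.pop,
    delete_vector=vectors.discard,
    delete_fts_doc=fts_docs.discard,
)
```

Deleting the row last is a design choice, not a requirement stated here. It keeps the row available as the record of what still needs cleaning if a sweep is interrupted.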