5. Agent memory (row + vector + FTS + metadata filter)
Save one piece of agent memory three ways - the structured row in agent.memories, an embedding for "what did this agent learn that's similar to the current query", and a keyword index over the text. The vector put carries agent_id and session_id as metadata, so similarity search can be filtered to a single agent and session at retrieval time.
- ChatGPT-style long-term memory: before each turn, retrieve the top-k similar past memories for the current agent and session.
- Autonomous agents that need to recall prior tool calls, plans, or facts.
- Any system where the same text store is queried by both vector similarity (for recall) and keywords (for exact-phrase grep).
The composite index covers the most common read shape - "newest memories for this agent in this session".
# agent/memories.toml
namespace = "agent"
table = "memories"
primary_key = ["id"]
[[columns]]
name = "id"
ty = "str"
required = true
[[columns]]
name = "agent_id"
ty = "str"
required = true
[[columns]]
name = "session_id"
ty = "str"
required = true
[[columns]]
name = "text"
ty = "str"
required = true
[[columns]]
name = "ts_ms"
ty = "u64"
required = true
# Composite index supports "all memories for this agent in this session,
# newest first" without a full scan.
[[indexes]]
name = "by_agent_session_time"
columns = ["agent_id", "session_id", "ts_ms"]

curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/rows/agent.memories" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "mem-77f2-001",
"agent_id": "agt-sales-east",
"session_id": "thr-77f2",
"text": "Customer prefers contact between 09:00 and 11:00 IST.",
"ts_ms": 1749500120000
}'

db.rows.put("agent.memories", {
"id": "mem-77f2-001",
"agent_id": "agt-sales-east",
"session_id": "thr-77f2",
"text": "Customer prefers contact between 09:00 and 11:00 IST.",
"ts_ms": 1749500120000,
})

// The TypeScript SDK does not wrap row writes yet
// (shipping in the next release). Use `fetch` for now.
await fetch(`${BASE_URL}/v1/tenants/${TENANT}/rows/agent.memories`, {
method: "POST",
headers: {
"Authorization": `Bearer ${OC_TOKEN}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
id: "mem-77f2-001",
agent_id: "agt-sales-east",
session_id: "thr-77f2",
text: "Customer prefers contact between 09:00 and 11:00 IST.",
ts_ms: 1749500120000,
}),
});

// The Go SDK does not wrap row writes yet
// (shipping in the next release). Use net/http for now.
body, _ := json.Marshal(map[string]any{
"id": "mem-77f2-001",
"agent_id": "agt-sales-east",
"session_id": "thr-77f2",
"text": "Customer prefers contact between 09:00 and 11:00 IST.",
"ts_ms": uint64(1749500120000),
})
req, _ := http.NewRequestWithContext(ctx, "POST",
BASE_URL+"/v1/tenants/"+TENANT+"/rows/agent.memories",
bytes.NewReader(body))
req.Header.Set("Authorization", "Bearer "+OC_TOKEN)
req.Header.Set("Content-Type", "application/json")
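resp, err := http.DefaultClient.Do(req)
if err != nil {
	// resp is nil when Do fails, so return before deferring Close.
	// Assumes an enclosing function that returns error.
	return err
}
defer resp.Body.Close()

agent_id and session_id go in metadata, not just in the row. Without them on the vector, you can't restrict /vector/topk to a single agent at search time, and one agent's memories leak into another's recall.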
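To make the leak concrete, here is a small in-memory stand-in for a metadata-filtered top-k. It is only an illustration: the `topk` function, the `where` parameter, and the tuple layout are hypothetical and are not the request shape of /vector/topk, which this section does not show.

```python
import math
from typing import Dict, List, Optional, Tuple

# (id, embedding, metadata) - a hypothetical in-memory layout, not the OriginChain storage format.
StoredVector = Tuple[str, List[float], Dict[str, str]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def topk(store: List[StoredVector], query: List[float], k: int,
         where: Optional[Dict[str, str]] = None) -> List[str]:
    """Rank vectors by cosine similarity, keeping only those whose
    metadata matches every key/value pair in `where`."""
    where = where or {}
    scored = [
        (vec_id, cosine(query, emb))
        for vec_id, emb, meta in store
        if all(meta.get(key) == value for key, value in where.items())
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [vec_id for vec_id, _ in scored[:k]]


store: List[StoredVector] = [
    ("mem-sales", [1.0, 0.0], {"agent_id": "agt-sales-east", "session_id": "thr-77f2"}),
    ("mem-support", [0.9, 0.1], {"agent_id": "agt-support", "session_id": "thr-0001"}),
]
query = [0.9, 0.1]

unfiltered = topk(store, query, k=1)
filtered = topk(store, query, k=1, where={"agent_id": "agt-sales-east"})
```

Without the filter, the nearest neighbour is the support agent's memory; with it, only the sales agent's vectors are candidates. The filter can only work if the metadata was attached when the vector was written.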
curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/vector/agent.memories/put" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "mem-77f2-001",
"embedding": [0.0144, -0.0681, 0.0398, /* ... 768 floats ... */],
"dim": 768,
"metric": "cosine",
"metadata": {
"agent_id": "agt-sales-east",
"session_id": "thr-77f2"
}
}'

# Metadata enables "find similar memories for THIS agent in THIS session"
# at search time, without scanning every other agent's memories.
db.vector.put(
"agent.memories",
"mem-77f2-001",
embedding_768d,
metadata={
"agent_id": "agt-sales-east",
"session_id": "thr-77f2",
},
)

await db.vectorPut("agent.memories", {
id: "mem-77f2-001",
embedding: embedding768d,
dim: 768,
metric: "cosine",
metadata: {
agent_id: "agt-sales-east",
session_id: "thr-77f2",
},
});

err := db.VectorPut(ctx, "agent.memories", originchain.VectorPutRequest{
ID: "mem-77f2-001",
Embedding: embedding768d,
Dim: 768,
Metric: "cosine",
Metadata: map[string]any{
"agent_id": "agt-sales-east",
"session_id": "thr-77f2",
},
})

curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/fts/agent.memories/index" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"field": "text",
"doc_id": "mem-77f2-001",
"text": "Customer prefers contact between 09:00 and 11:00 IST."
}'

db.fts.index(
"agent.memories",
"text",
doc_id="mem-77f2-001",
text="Customer prefers contact between 09:00 and 11:00 IST.",
)

await db.ftsIndex("agent.memories", {
field: "text",
docId: "mem-77f2-001",
text: "Customer prefers contact between 09:00 and 11:00 IST.",
});

err := db.FTSIndex(ctx, "agent.memories", originchain.FTSIndexRequest{
Field: "text",
DocID: "mem-77f2-001",
Text: "Customer prefers contact between 09:00 and 11:00 IST.",
})
The three calls are separate; there is no single "write everything" endpoint, and each call is atomic only by itself. The SDKs auto-attach an Idempotency-Key on every mutating call, so if the vector put fails after the row write succeeded, retry only the vector put. Repeating the row write is unnecessary, though it would not duplicate the memory.
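One way to wire the three writes together is sketched below, using the Python SDK call shapes shown above. The helpers `run_steps` and `save_memory` are hypothetical names, not part of the SDK. Catching every exception and retrying up to three times are assumptions for illustration; real code would retry only transient errors and add backoff.

```python
from typing import Callable, Dict, List


def run_steps(steps: Dict[str, Callable[[], None]], max_attempts: int = 3) -> List[str]:
    """Run each named step in order, retrying only the step that failed.

    Returns the names of the completed steps. If a step still fails after
    max_attempts, its error is re-raised and later steps are not run.
    """
    done: List[str] = []
    for name, call in steps.items():
        for attempt in range(1, max_attempts + 1):
            try:
                call()
                done.append(name)
                break
            except Exception:  # assumption: treat every failure as retryable
                if attempt == max_attempts:
                    raise
    return done


def save_memory(db, memory: dict, embedding: List[float]) -> List[str]:
    """Write one memory as a row, a vector with agent metadata, and an FTS doc."""
    metadata = {
        "agent_id": memory["agent_id"],
        "session_id": memory["session_id"],
    }
    return run_steps({
        "row": lambda: db.rows.put("agent.memories", memory),
        "vector": lambda: db.vector.put(
            "agent.memories", memory["id"], embedding, metadata=metadata
        ),
        "fts": lambda: db.fts.index(
            "agent.memories", "text", doc_id=memory["id"], text=memory["text"]
        ),
    })
```

Because earlier steps are never re-run, a failed vector put is retried without touching the row that was already written.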
- Not partitioning by agent_id. If the vector put doesn't carry the agent metadata, every topk ranks across every agent's memories. One sales agent ends up "recalling" what a support agent learned about a different customer.
- Storing huge memory blocks as one row. Memories should be sentence- or paragraph-sized. A whole conversation in one row gives mush at retrieval time. Split before writing.
- Embedding without normalizing first. If you compare a fresh embedding (model A, length-normalized) to a stored one (model B, raw), cosine scores are meaningless. Pin one model + one normalization step at write and read time.
- Never expiring old memories. Agent memory tables grow without bound. Build a sweeper that deletes by ts_ms below a threshold, and remember to DELETE the vector and FTS doc too - row deletion does not fan out.
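Two of these pitfalls can be guarded against before anything is written. The sketch below splits long text into paragraph- or sentence-sized pieces and length-normalizes an embedding. `split_memory`, `l2_normalize`, and the 400-character default are illustrative assumptions, not OriginChain APIs or recommended limits.

```python
import math
import re
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def split_memory(text: str, max_chars: int = 400) -> List[str]:
    """Split text on blank lines; paragraphs longer than max_chars are
    split further by greedily packing whole sentences.

    A single sentence longer than max_chars is kept intact, not cut.
    """
    pieces: List[str] = []
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        if len(para) <= max_chars:
            pieces.append(para)
            continue
        current = ""
        for sentence in re.split(r"(?<=[.!?])\s+", para):
            if current and len(current) + 1 + len(sentence) > max_chars:
                pieces.append(current)
                current = sentence
            else:
                current = f"{current} {sentence}".strip()
        if current:
            pieces.append(current)
    return pieces
```

Each piece would then be written as its own memory with its own id, and the same `l2_normalize` step applied both when storing an embedding and when embedding the query.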