Vector search on OriginChain.
OriginChain runs a managed vector database on the same hash-keyed substrate that powers SQL, full-text, and graph. A vector index is a new key shape, not a new engine. Embeddings live under h(tenant · "vec" · table) ‖ id; the persisted HNSW graph lives under a sibling "vec_idx" domain. Same WAL, same checkpoints, same backups, same single-tenant instance. A row written once is visible to SQL, vector, and full-text the moment the WAL fsyncs.
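The key shape above can be sketched as follows. This is a minimal illustration, not the real encoding: the hash function (SHA-256 here), the NUL-byte joining, and the helper names are all assumptions — only the h(tenant · "vec" · table) ‖ id shape and the "vec" / "vec_idx" domain split are from the text.

```python
import hashlib

def vec_key(tenant: str, table: str, row_id: str) -> bytes:
    # h(tenant · "vec" · table) ‖ id: a domain-separated hash prefix,
    # then the raw row id as the suffix.
    prefix = hashlib.sha256(f"{tenant}\x00vec\x00{table}".encode()).digest()
    return prefix + row_id.encode()

def vec_idx_key(tenant: str, table: str, node_id: str) -> bytes:
    # Sibling "vec_idx" domain holding the persisted HNSW graph.
    prefix = hashlib.sha256(f"{tenant}\x00vec_idx\x00{table}".encode()).digest()
    return prefix + node_id.encode()

k_vec = vec_key("acme", "products", "sku-9281")
k_idx = vec_idx_key("acme", "products", "sku-9281")
assert k_vec[:32] != k_idx[:32]  # sibling domains never share a prefix
```

Because both key families share one keyspace, the embedding rows and the graph blobs ride the same WAL and the same checkpoints with no extra machinery.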
The algorithm is HNSW (Malkov & Yashunin, 2016) — the same graph index that backs most modern ANN engines. Index defaults match the paper: M=16, ef_construction=200. Each topk call picks one of two query modes — fast or high_recall — and the server tunes the search width accordingly. Brute-force search is the fallback for small tables (under 5k vectors) and the ground-truth oracle the recall tests score against.
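The brute-force oracle and the recall metric it anchors can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the server's implementation; the function names are invented here.

```python
import numpy as np

def brute_force_topk(xs: np.ndarray, q: np.ndarray, k: int) -> np.ndarray:
    # Exact cosine search over every vector: the ground truth
    # that recall tests score the HNSW results against.
    xs_n = xs / np.linalg.norm(xs, axis=1, keepdims=True)
    q_n = q / np.linalg.norm(q)
    return np.argsort(-(xs_n @ q_n))[:k]

def recall_at_k(approx_ids, exact_ids) -> float:
    # Fraction of the true top-k the approximate search returned.
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

rng = np.random.default_rng(0)
xs = rng.standard_normal((1000, 32)).astype(np.float32)
q = rng.standard_normal(32).astype(np.float32)
exact = brute_force_topk(xs, q, 10)
assert recall_at_k(exact, exact) == 1.0  # the oracle scores itself perfectly
```

The 0.69 and 0.96 recall figures later on this page are this metric: the ANN result set compared against the brute-force set for the same query.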
Vector fields — their dimensions, metric, and index parameters — are declared on the manifest; see schemas → vector fields. To insert vectors, see insert → vector. The rest of this page is the query reference.
Three metrics.

| Metric | Distance | Notes |
|---|---|---|
| Cosine | 1 - (a · b) / (‖a‖‖b‖) | Default. L2 norm pre-computed at write time; query is one dot + two scalar muls + one div. |
| Dot | -(a · b) | When embeddings are already unit-normalised. Cheapest. |
| L2 | ‖a - b‖² | When magnitude carries signal. |

Hits return score = -distance, so larger always means closer, regardless of metric.
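The three distances and the score convention can be written out directly. A minimal sketch — the `distance`/`score` helpers are illustrative, not client API:

```python
import numpy as np

def distance(a, b, metric="cosine"):
    a, b = np.asarray(a, np.float32), np.asarray(b, np.float32)
    if metric == "cosine":
        return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    if metric == "dot":
        return -(a @ b)
    if metric == "l2":
        return float(((a - b) ** 2).sum())  # squared L2, per the table
    raise ValueError(metric)

def score(a, b, metric="cosine"):
    # Hits return score = -distance: larger is always closer.
    return -distance(a, b, metric)

a = np.array([1.0, 0.0])
b = np.array([1.0, 0.0])
assert score(a, b, "cosine") == 0.0  # identical direction: best cosine score
assert score(a, b, "l2") == 0.0      # zero squared distance
```

Note the asymmetric ranges: cosine scores top out at 0, dot scores grow with magnitude, L2 scores top out at 0 — comparable within one metric, not across metrics.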
Examples.
```shell
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/vector/products/put" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "sku-9281",
    "embedding": [0.0124, -0.0883, 0.0451, /* ... 768 floats ... */],
    "metadata": { "category": "running-shoes", "price": 129.0 }
  }'
```

```python
oc.vector("products").put(
    id="sku-9281",
    embedding=embedding_768d,  # list[float] or np.ndarray(f32)
    metadata={"category": "running-shoes", "price": 129.0},
)
```

```shell
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/vector/products/topk" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": [0.011, -0.082, 0.046, /* ... 768 floats ... */],
    "k": 10,
    "metric": "cosine",
    "mode": "high_recall"
  }'
```

```python
hits = oc.vector("products").topk(
    query=q_768d,
    k=10,
    metric="cosine",
    mode="high_recall",  # "fast" | "high_recall" (default)
)
for h in hits:
    print(h.id, h.score, h.metadata)
```

```shell
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/vector/products/topk" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": [/* 768 floats */],
    "k": 10,
    "metric": "cosine",
    "filter": { "category": "running-shoes" }
  }'
```

```python
hits = oc.vector("products").topk(
    query=q_768d,
    k=10,
    filter={"category": "running-shoes"},  # exact-equality on metadata
)
```
Filtered topk runs the HNSW search at k × 4 candidates and post-filters on exact-equality metadata. Highly selective filters (under ~25% match rate) may therefore return fewer than k hits.
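The over-fetch-and-filter step can be sketched as below. The ×4 over-fetch factor and the exact-equality filter are from the text; the toy candidate list standing in for the HNSW search, and the helper name, are illustrative.

```python
def filtered_topk(index_search, k, filter_md, overfetch=4):
    # Search the index at k × 4, then keep only candidates whose
    # metadata exactly matches every filter key.
    candidates = index_search(k * overfetch)  # [(id, score, metadata), ...]
    hits = [h for h in candidates
            if all(h[2].get(f) == v for f, v in filter_md.items())]
    return hits[:k]  # a highly selective filter can leave fewer than k

# Toy stand-in for the ANN search: 40 rows, alternating categories.
rows = [(f"id-{i}", -i / 10, {"category": "shoes" if i % 2 else "hats"})
        for i in range(40)]
hits = filtered_topk(lambda n: rows[:n], 10, {"category": "shoes"})
assert len(hits) == 10
```

With a 50% match rate, 40 candidates comfortably yield 10 hits; drop the match rate below ~25% and the 40-candidate pool can run dry, which is exactly the shortfall the paragraph above warns about.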
Two modes. Pick on the call.
Each topk request takes a mode field.
Numbers below are from a deterministic 100k-vector benchmark on Thunder (D=128, M=16, ef_construction=200, k=10, 1000 queries), with f32 SIMD distance kernels and the graph cache hot.

| Mode | Recall | Latency |
|---|---|---|
| fast | 0.69 | 37 ms |
| high_recall | 0.96 | 109 ms |

When to pick which.
- fast — RAG with a re-ranker · hot dashboards · agent inner loops where latency dominates.
- high_recall — Default. First-pass retrieval correctness matters — product search, similar-customer lookup, citation retrieval.
high_recall is the default — omit mode and you get the 0.96 / 109 ms point on the curve. Use fast when a downstream re-ranker or a tight latency SLO makes 37 ms the headline number that matters.
Atomic embedding+graph writes: put_vec ships the embedding payload and the mutated graph blob in a single WriteOp::Put batch — on crash, either both replay or neither. The index can never lag the data.
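The invariant can be illustrated with a toy store whose batches apply all-or-nothing. Only the two-entries-one-batch shape and the WriteOp::Put name come from the text; the ToyWAL class and its keys are invented for this sketch.

```python
class ToyWAL:
    # All-or-nothing batches: a batch either commits in full or is
    # dropped on replay, so the embedding payload and the mutated
    # HNSW graph blob always land together.
    def __init__(self):
        self.store = {}

    def apply_batch(self, puts):
        staged = dict(self.store)
        for key, value in puts:   # both WriteOp::Put entries in one batch
            staged[key] = value
        self.store = staged       # single atomic swap: no partial state

wal = ToyWAL()
wal.apply_batch([
    (b"vec/products/sku-9281", b"<embedding payload>"),
    (b"vec_idx/products/node-17", b"<mutated HNSW graph blob>"),
])
# Either both keys are visible or neither is — the index cannot lag the data.
assert (b"vec/products/sku-9281" in wal.store) == \
       (b"vec_idx/products/node-17" in wal.store)
```

Contrast this with systems that index asynchronously: there, a crash between the row write and the index update leaves vectors the index cannot see until a rebuild.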