3. Top-k - cosine similarity
what this does
Send a query vector, get back the k nearest stored vectors ranked by cosine similarity. Cosine ignores vector length and compares only direction. That makes it the right default for text embeddings, where models like OpenAI text-embedding-3 carry meaning in a vector's direction rather than its magnitude.
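To make "ignores vector length" concrete, here is a minimal pure-Python sketch of the standard cosine formula. It is an illustration only, not OriginChain's implementation; the function name and the sample vectors are made up for the example.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 0.5]
scaled_a = [10.0 * x for x in a]  # same direction as a, 10x the length

same = cosine_similarity(a, b)
scaled = cosine_similarity(scaled_a, b)

# Scaling a vector changes its length but not its direction,
# so the similarity is unchanged (up to floating-point rounding).
print(math.isclose(same, scaled))  # True
```

Because the length cancels out in the division, it does not matter whether the embedding model emits unit-length vectors or not.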
when to use it
- Semantic search over text - product descriptions, chat history, docs.
- RAG retrieval before an LLM call.
- Any embedding model whose output isn't already unit-normalised - cosine handles normalisation for you.
the request
POST /v1/tenants/:t/vector/:table/topk

```bash
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": [0.0124, -0.0883, 0.0451, /* ... 768 floats ... */],
    "k": 10,
    "dim": 768,
    "metric": "cosine"
  }'
```

```python
hits = db.vector.topk(
    "shop.products",
    query=query_768d,  # list[float] of length 768
    k=10,
    metric="cosine",
)
```

```typescript
const hits = await db.vectorTopk("shop.products", {
  query: query768d, // number[] of length 768
  k: 10,
  dim: 768,
  metric: "cosine",
});
```

```go
hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
	Query:  query768d, // []float32 of length 768
	K:      10,
	Dim:    768,
	Metric: "cosine",
})
```

what you get back
```json
{
  "hits": [
    { "id": "sku-9281", "score": 0.9421 },
    { "id": "sku-1144", "score": 0.9187 },
    { "id": "sku-5520", "score": 0.8903 }
    /* ... up to k entries ... */
  ]
}
```

score is cosine similarity in [-1, 1]. Higher = closer. 1.0 is identical direction, 0 is orthogonal, -1 is opposite.
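A common next step is to drop weak matches before passing hits on, for example to an LLM prompt. The sketch below parses the sample response shown above with the standard library and keeps hits above a cut-off. The MIN_SCORE value is a hypothetical threshold, not a recommendation from OriginChain, and the descending order is assumed from the sample rather than stated as a guarantee.

```python
import json

# Sample response body from this page, with the elision comment removed
# so that it is valid JSON.
response_body = """
{
  "hits": [
    { "id": "sku-9281", "score": 0.9421 },
    { "id": "sku-1144", "score": 0.9187 },
    { "id": "sku-5520", "score": 0.8903 }
  ]
}
"""

MIN_SCORE = 0.9  # hypothetical cut-off; tune it for your model and data

hits = json.loads(response_body)["hits"]

# Hits are ranked by similarity, so scores should be non-increasing.
scores = [hit["score"] for hit in hits]
assert scores == sorted(scores, reverse=True)

# Keep only the hits that clear the threshold (higher = closer).
relevant_ids = [hit["id"] for hit in hits if hit["score"] >= MIN_SCORE]
print(relevant_ids)  # ['sku-9281', 'sku-1144']
```

A useful threshold depends heavily on the embedding model, so it is worth calibrating against a few known-good and known-bad queries.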
request fields
| Field | Required | Notes |
|---|---|---|
| query | yes | Array of floats. Length must equal the table's locked dim. |
| k | yes | How many hits to return. Typical: 5 - 50. |
| dim | yes | Must match the table's locked dim. |
| metric | no | Default "cosine". Must match the metric the table was put with. |
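The rules in the table can be checked on the client before the request is sent. The following is a rough sketch of such a check. build_topk_payload and table_dim are hypothetical names for this example and are not part of any OriginChain SDK, and the rule that k must be a positive integer is an assumption.

```python
def build_topk_payload(query, k, table_dim, metric="cosine"):
    """Validate the request fields and return a JSON-ready dict.

    table_dim is the dim the table was locked to; the caller supplies it.
    """
    if len(query) != table_dim:
        raise ValueError(
            f"query has {len(query)} floats, but the table dim is {table_dim}"
        )
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(k, bool) or not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")  # assumed rule
    return {
        "query": [float(x) for x in query],
        "k": k,
        "dim": table_dim,
        "metric": metric,  # must match the metric the table was put with
    }

# A tiny 3-dimensional example keeps the output readable.
payload = build_topk_payload([0.1, 0.2, 0.3], k=2, table_dim=3)
print(payload)
```

Failing fast on a length mismatch gives a clearer error than waiting for the server to reject the request.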
common mistakes
- Cosine similarity vs cosine distance. Some libraries return 1 - similarity (a distance, lower = closer). OriginChain returns similarity directly - higher = closer. Don't sort the wrong way.
- Metric mismatch. If the table was put with "l2" you cannot topk it with "cosine". The request returns 400 metric_mismatch.
- Query from a different model. Embeddings from text-embedding-3 are not comparable to embeddings from all-MiniLM. The same dim does not mean the same vector space.
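The similarity-versus-distance pitfall is easiest to see with numbers. This small sketch uses made-up ids and distances, as if they came from a library that reports cosine distance, and shows that the two conventions give the same ranking only when the sort direction matches the score type.

```python
def distance_to_similarity(distance: float) -> float:
    """Convert cosine distance (1 - similarity) back to cosine similarity."""
    return 1.0 - distance

# Hypothetical results from a library that reports cosine distance.
by_distance = [("doc-a", 0.06), ("doc-b", 0.31), ("doc-c", 0.12)]

# Distance: sort ascending (lower = closer).
closest_by_distance = sorted(by_distance, key=lambda pair: pair[1])

# Similarity: convert, then sort descending (higher = closer).
by_similarity = [(doc_id, distance_to_similarity(d)) for doc_id, d in by_distance]
closest_by_similarity = sorted(by_similarity, key=lambda pair: pair[1], reverse=True)

# Both orderings agree once the sort direction matches the score type.
ids_by_distance = [doc_id for doc_id, _ in closest_by_distance]
ids_by_similarity = [doc_id for doc_id, _ in closest_by_similarity]
print(ids_by_distance == ids_by_similarity)  # True
```

Sorting similarities in ascending order would put the worst match first, which is the mistake described in the first bullet.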