6. Multi-field weighted merge
what this does
Searches the same query across two fields - typically a short, high-signal field like title and a longer body field like description - and produces a single ranked list. Each field is an independent inverted index, so you make two queries and combine the scores in your app with whatever weights you want (e.g. 2.0 × title + 1.0 × description).
when to use it
- Catalogue / product search where a hit in the title is worth more than a hit in the body.
- Help-centre search across heading + body.
- Anywhere you'd reach for "field boosts" in a traditional search engine.
the request
Two POSTs to set up (one per field), two GETs to query, one client-side merge step. The doc_id on both fields is the same row PK so the merge can join on it.
POST /fts/:schema/:field (x2), GET /fts/:schema/:field (x2), then merge
# Step 1: index the same row into two separate fields
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "doc_id": "p001", "text": "Wireless Headphones - Pro" }'
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{ "doc_id": "p001", "text": "Over-ear wireless headphones with noise cancellation." }'
# Step 2: query each field, then merge client-side
curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
-H "Authorization: Bearer $OC_TOKEN" \
--data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"
curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
-H "Authorization: Bearer $OC_TOKEN" \
--data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"

# Index the row's title and description as two separate FTS docs
db.fts.index("shop.products", "title", doc_id="p001", text="Wireless Headphones - Pro")
db.fts.index("shop.products", "description", doc_id="p001", text="Over-ear wireless headphones with noise cancellation.")
# Query each field independently
title_hits = db.fts.search("shop.products", "title", q="wireless headphones", mode="bm25", k=20)
desc_hits = db.fts.search("shop.products", "description", q="wireless headphones", mode="bm25", k=20)
# Merge client-side with your own weights
def merge(title_hits, desc_hits, w_title=2.0, w_desc=1.0):
    scores: dict[str, float] = {}
    for h in title_hits.hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_title * h["score"]
    for h in desc_hits.hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_desc * h["score"]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for doc_id, score in merge(title_hits, desc_hits)[:10]:
    print(doc_id, score)

// Index the row's title and description as two separate FTS docs
await db.ftsIndex("shop.products", "title", { doc_id: "p001", text: "Wireless Headphones - Pro" });
await db.ftsIndex("shop.products", "description", { doc_id: "p001", text: "Over-ear wireless headphones with noise cancellation." });
// Query each field independently
const titleHits = await db.ftsSearch("shop.products", "title", { q: "wireless headphones", mode: "bm25", k: 20 });
const descHits = await db.ftsSearch("shop.products", "description", { q: "wireless headphones", mode: "bm25", k: 20 });
// Merge client-side with your own weights
function merge(
  title: typeof titleHits,
  desc: typeof descHits,
  wTitle = 2.0,
  wDesc = 1.0,
) {
  const scores = new Map<string, number>();
  for (const h of title.hits) scores.set(h.doc_id, (scores.get(h.doc_id) ?? 0) + wTitle * h.score);
  for (const h of desc.hits) scores.set(h.doc_id, (scores.get(h.doc_id) ?? 0) + wDesc * h.score);
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

for (const [docId, score] of merge(titleHits, descHits).slice(0, 10)) {
  console.log(docId, score);
}

// Index the row's title and description as two separate FTS docs
db.FTSIndex(ctx, "shop.products", "title", originchain.FTSIndexRequest{
	DocID: "p001", Text: "Wireless Headphones - Pro",
})
db.FTSIndex(ctx, "shop.products", "description", originchain.FTSIndexRequest{
	DocID: "p001", Text: "Over-ear wireless headphones with noise cancellation.",
})
// Query each field independently
titleHits, _ := db.FTSSearch(ctx, "shop.products", "title", originchain.FTSSearchRequest{
	Q: "wireless headphones", Mode: "bm25", K: 20,
})
descHits, _ := db.FTSSearch(ctx, "shop.products", "description", originchain.FTSSearchRequest{
	Q: "wireless headphones", Mode: "bm25", K: 20,
})
// Merge client-side with your own weights
const wTitle, wDesc = 2.0, 1.0
scores := map[string]float64{}
for _, h := range titleHits.Hits {
	scores[h.DocID] += wTitle * h.Score
}
for _, h := range descHits.Hits {
	scores[h.DocID] += wDesc * h.Score
}
what you get back
// merged, sorted client-side
[
  { "doc_id": "p001", "score": 28.92 },  // strong title + description hit
  { "doc_id": "p027", "score": 14.50 },  // description-only hit
  { "doc_id": "p014", "score": 9.10 }
]
how it works
- Each field has its own inverted index and its own document statistics. BM25 on title is computed only against other titles - so a match in a 4-word title scores very high.
- The two result sets share the doc_id namespace because you used the same PK on both POSTs.
- You merge by summing weighted scores per doc_id. Docs that match in both fields naturally float to the top; docs that match in only one are still present but with their single-field score.
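The summation described above can be traced by hand with a small self-contained sketch. The hit lists and scores here are invented for the arithmetic and are not real API output; `merge_scores` is a hypothetical helper name, and it assumes each hit is a dict with `doc_id` and `score` keys, as in the Python example earlier.

```python
from collections import defaultdict

# Illustrative per-field hits; the scores are made up, not real BM25 output.
title_hits = [{"doc_id": "p001", "score": 10.0}, {"doc_id": "p014", "score": 4.0}]
desc_hits = [{"doc_id": "p001", "score": 6.0}, {"doc_id": "p027", "score": 9.0}]


def merge_scores(title_hits, desc_hits, w_title=2.0, w_desc=1.0):
    """Sum weighted per-field scores by doc_id and sort descending."""
    scores = defaultdict(float)
    for h in title_hits:
        scores[h["doc_id"]] += w_title * h["score"]
    for h in desc_hits:
        scores[h["doc_id"]] += w_desc * h["score"]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


merged = merge_scores(title_hits, desc_hits)
for doc_id, score in merged:
    print(doc_id, score)
# p001 26.0  (2.0 * 10.0 from title + 1.0 * 6.0 from description)
# p027 9.0   (description only)
# p014 8.0   (title only: 2.0 * 4.0)
```

With these values the doc that matches both fields ranks first, and the title-only doc still appears, carrying only its weighted title score.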
common mistakes
- Trying to do the merge server-side. There is no multi-field query mode - merging is your app's responsibility. That's deliberate: it keeps the weights tunable without redeploying.
- Concatenating fields into one big text blob. You lose the field-level statistics that make title boost work in the first place. Keep them separate.
- Using too small a k. If a doc only matches one field, it must be inside that field's top-k to appear in the merge. Pull a larger k per field than you intend to surface.
- Using different doc_ids across fields. The merge joins on doc_id. Use the row's primary key on every field's POST or the merge will silently produce nonsense.
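The "too small a k" mistake can be guarded against by over-fetching per field and truncating only after the merge. The following is a minimal sketch under stated assumptions: `merge_top_n`, `overfetch`, and `fake_fetch` are hypothetical names, the canned hits are invented, and `fetch(field, k)` stands in for whatever per-field search call you use.

```python
from typing import Callable, Optional


def merge_top_n(
    fetch: Callable[[str, int], list],
    n: int,
    overfetch: int = 3,
    weights: Optional[dict] = None,
) -> list:
    """Pull n * overfetch hits per field, merge by doc_id, return the top n."""
    weights = weights or {"title": 2.0, "description": 1.0}
    k = n * overfetch  # ask each field for more than we plan to surface
    scores: dict = {}
    for field, w in weights.items():
        for h in fetch(field, k):
            scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w * h["score"]
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


# Stand-in for the real search call: canned hits, truncated to k.
CANNED = {
    "title": [{"doc_id": "a", "score": 5.0}, {"doc_id": "b", "score": 4.0}],
    "description": [
        {"doc_id": "c", "score": 9.0},
        {"doc_id": "d", "score": 8.0},
        {"doc_id": "b", "score": 7.0},
    ],
}


def fake_fetch(field: str, k: int) -> list:
    return CANNED[field][:k]


narrow = merge_top_n(fake_fetch, n=2, overfetch=1)  # k=2 per field
wide = merge_top_n(fake_fetch, n=2, overfetch=2)    # k=4 per field
print(narrow)
print(wide)
```

In this toy data, doc `b` sits third in the description results. With k=2 its description score never reaches the merge, so it drops out of the top two; with k=4 both contributions are summed and it ranks first. The right over-fetch factor depends on your data, so treat the default of 3 as a placeholder.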