
6. Multi-field weighted merge

what this does

Searches the same query across two fields (typically a short, high-signal field such as title and a longer body field such as description) and produces a single ranked list. Each field is an independent inverted index, so you issue one query per field and combine the scores in your app with whatever weights you choose (e.g. 2.0 × title + 1.0 × description).

when to use it

Catalogue / product search where a hit in the title is worth more than a hit in the body.
Help-centre search across heading + body.
Anywhere you'd reach for "field boosts" in a traditional search engine.

the request

Two POSTs to set up (one per field), two GETs to query, and one client-side merge step. The doc_id on both fields is the row's primary key, so the merge can join on it.

POST /fts/:schema/:field (x2), GET /fts/:schema/:field (x2), then merge

# Step 1: index the same row into two separate fields
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "doc_id": "p001", "text": "Wireless Headphones - Pro" }'

curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "doc_id": "p001", "text": "Over-ear wireless headphones with noise cancellation." }'

# Step 2: query each field, then merge client-side
curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"

curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"

# Index the row's title and description as two separate FTS docs
db.fts.index("shop.products", "title",       doc_id="p001", text="Wireless Headphones - Pro")
db.fts.index("shop.products", "description", doc_id="p001", text="Over-ear wireless headphones with noise cancellation.")

# Query each field independently
title_hits = db.fts.search("shop.products", "title",       q="wireless headphones", mode="bm25", k=20)
desc_hits  = db.fts.search("shop.products", "description", q="wireless headphones", mode="bm25", k=20)

# Merge client-side with your own weights
def merge(title_hits, desc_hits, w_title=2.0, w_desc=1.0):
    scores: dict[str, float] = {}
    for h in title_hits.hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_title * h["score"]
    for h in desc_hits.hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_desc  * h["score"]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for doc_id, score in merge(title_hits, desc_hits)[:10]:
    print(doc_id, score)

// Index the row's title and description as two separate FTS docs
await db.ftsIndex("shop.products", "title",       { doc_id: "p001", text: "Wireless Headphones - Pro" });
await db.ftsIndex("shop.products", "description", { doc_id: "p001", text: "Over-ear wireless headphones with noise cancellation." });

// Query each field independently
const titleHits = await db.ftsSearch("shop.products", "title",       { q: "wireless headphones", mode: "bm25", k: 20 });
const descHits  = await db.ftsSearch("shop.products", "description", { q: "wireless headphones", mode: "bm25", k: 20 });

// Merge client-side with your own weights
function merge(
  title: typeof titleHits,
  desc:  typeof descHits,
  wTitle = 2.0,
  wDesc  = 1.0,
) {
  const scores = new Map<string, number>();
  for (const h of title.hits) scores.set(h.doc_id, (scores.get(h.doc_id) ?? 0) + wTitle * h.score);
  for (const h of desc.hits)  scores.set(h.doc_id, (scores.get(h.doc_id) ?? 0) + wDesc  * h.score);
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

for (const [docId, score] of merge(titleHits, descHits).slice(0, 10)) {
  console.log(docId, score);
}

// Index the row's title and description as two separate FTS docs
db.FTSIndex(ctx, "shop.products", "title", originchain.FTSIndexRequest{
    DocID: "p001", Text: "Wireless Headphones - Pro",
})
db.FTSIndex(ctx, "shop.products", "description", originchain.FTSIndexRequest{
    DocID: "p001", Text: "Over-ear wireless headphones with noise cancellation.",
})

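// Query each field independently (error handling uses "log")
titleHits, err := db.FTSSearch(ctx, "shop.products", "title", originchain.FTSSearchRequest{
    Q: "wireless headphones", Mode: "bm25", K: 20,
})
if err != nil {
    log.Fatal(err)
}
descHits, err := db.FTSSearch(ctx, "shop.products", "description", originchain.FTSSearchRequest{
    Q: "wireless headphones", Mode: "bm25", K: 20,
})
if err != nil {
    log.Fatal(err)
}

// Merge client-side with your own weights
const wTitle, wDesc = 2.0, 1.0
scores := map[string]float64{}
for _, h := range titleHits.Hits { scores[h.DocID] += wTitle * h.Score }
for _, h := range descHits.Hits  { scores[h.DocID] += wDesc  * h.Score }

// Rank doc IDs by merged score, highest first (uses "sort" and "fmt")
ids := make([]string, 0, len(scores))
for id := range scores { ids = append(ids, id) }
sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
if len(ids) > 10 { ids = ids[:10] }
for _, id := range ids { fmt.Println(id, scores[id]) }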

what you get back

// merged, sorted client-side
[
  { "doc_id": "p001", "score": 28.92 },   // strong title + description hit
  { "doc_id": "p027", "score": 14.50 },   // description-only hit
  { "doc_id": "p014", "score":  9.10 }
]

how it works

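Each field has its own inverted index and its own document statistics. BM25 on title is computed only against other titles, so its term statistics and length normalization are not diluted by the much longer description text, and a match in a 4-word title can score very high.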
The two result sets share the doc_id namespace because you used the same PK on both POSTs.
You merge by summing weighted scores per doc_id. Docs that match in both fields naturally float to the top; docs that match in only one are still present but with their single-field score.
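
To make the arithmetic concrete, here is a minimal, self-contained sketch of the weighted sum that uses plain dictionaries in place of SDK responses. The per-field scores are made up for illustration (chosen so the totals line up with the sample output in "what you get back"), and `weighted_merge` is a hypothetical helper name, not part of the SDK.

```python
from collections import defaultdict


def weighted_merge(field_hits, weights):
    """Sum weight * score per doc_id across fields, highest total first."""
    totals = defaultdict(float)
    for field, hits in field_hits.items():
        for hit in hits:
            totals[hit["doc_id"]] += weights[field] * hit["score"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [{"doc_id": doc_id, "score": round(score, 2)} for doc_id, score in ranked]


# Illustrative per-field scores (made up, not real BM25 output).
field_hits = {
    "title": [
        {"doc_id": "p001", "score": 10.0},
        {"doc_id": "p014", "score": 4.55},
    ],
    "description": [
        {"doc_id": "p027", "score": 14.5},
        {"doc_id": "p001", "score": 8.92},
    ],
}
weights = {"title": 2.0, "description": 1.0}

merged = weighted_merge(field_hits, weights)
for row in merged:
    print(row)
```

Here p001 matches in both fields, so it collects 2.0 × 10.0 from title plus 1.0 × 8.92 from description and ranks first, while p027 and p014 keep only their single-field contribution.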

common mistakes

Trying to do the merge server-side. There is no multi-field query mode - merging is your app's responsibility. That's deliberate: it keeps the weights tunable without redeploying.
Concatenating fields into one big text blob. You lose the field-level statistics that make the title boost work in the first place. Keep them separate.
Using too small a k. If a doc only matches one field, it must be inside that field's top-k to appear in the merge. Pull a larger k per field than you intend to surface.
Using different doc_ids across fields. The merge joins on doc_id. Use the row's primary key on every field's POST, or the merge will silently treat one row as several unrelated documents and never add their scores together.
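
The effect of a too-small k can be seen in a small simulation. The sketch below uses made-up ranked lists for each field, truncates each to k as a stand-in for what a per-field query would return, and then merges. `top_k`, `merge`, and `surfaced` are hypothetical helpers, and all scores are illustrative.

```python
def top_k(hits, k):
    """Simulate a per-field query returning only the k best hits."""
    return sorted(hits, key=lambda h: h["score"], reverse=True)[:k]


def merge(title_hits, desc_hits, w_title=2.0, w_desc=1.0):
    """Weighted sum of per-field scores, joined on doc_id."""
    scores = {}
    for h in title_hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_title * h["score"]
    for h in desc_hits:
        scores[h["doc_id"]] = scores.get(h["doc_id"], 0.0) + w_desc * h["score"]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up full rankings. p050 matches only in the title and sits third there.
title_all = [
    {"doc_id": "p001", "score": 6.0},
    {"doc_id": "p002", "score": 5.0},
    {"doc_id": "p050", "score": 4.9},
]
desc_all = [
    {"doc_id": "p003", "score": 4.0},
    {"doc_id": "p004", "score": 3.5},
    {"doc_id": "p005", "score": 3.0},
]


def surfaced(k, n=3):
    """Doc IDs shown to the user when each field is queried with this k."""
    merged = merge(top_k(title_all, k), top_k(desc_all, k))
    return [doc_id for doc_id, _ in merged[:n]]


tight = surfaced(k=2)  # per-field k smaller than the number of results shown
wide = surfaced(k=3)   # per-field k large enough to include p050
print(tight, wide)
```

With k=2, p050 never reaches the merge, so the weaker description-only hit p003 takes the third slot; with k=3, p050's weighted title score puts it there instead.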