OriginChain docs

6. Multi-field weighted merge

what this does

Runs the same query against two fields, typically a short, high-signal field such as title and a longer body field such as description, and produces a single ranked list. Each field has its own inverted index, so you issue one query per field and combine the scores in your app with whatever weights you choose (e.g. 2.0 × title + 1.0 × description).

when to use it
  • Catalogue / product search where a hit in the title is worth more than a hit in the body.
  • Help-centre search across heading + body.
  • Anywhere you'd reach for "field boosts" in a traditional search engine.
the request

Two POSTs to set up (one per field), two GETs to query, and one client-side merge step. Use the same doc_id (the row's primary key) on both fields so the merge can join on it.

POST /fts/:schema/:field (x2), GET /fts/:schema/:field (x2), then merge
# Step 1: index the same row into two separate fields
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "doc_id": "p001", "text": "Wireless Headphones - Pro" }'

curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "doc_id": "p001", "text": "Over-ear wireless headphones with noise cancellation." }'

# Step 2: query each field, then merge client-side
curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/title" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"

curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=wireless headphones" --data-urlencode "mode=bm25" --data-urlencode "k=20"
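
The two GETs can also be issued from application code. The following is a minimal Python sketch using only the standard library; it mirrors the curl calls above. The helper names fts_url and fetch_field are hypothetical, and the response body is returned undecoded beyond JSON parsing because its exact shape is not documented here.

```python
import json
import urllib.parse
import urllib.request


def fts_url(host, tenant, schema, field, query, k=20, mode="bm25"):
    """Build the GET URL for one field, mirroring the curl calls above."""
    params = urllib.parse.urlencode({"q": query, "mode": mode, "k": k})
    return f"https://{host}/v1/tenants/{tenant}/fts/{schema}/{field}?{params}"


def fetch_field(host, tenant, token, schema, field, query, k=20):
    """Run one field query and return the parsed JSON body unchanged."""
    request = urllib.request.Request(
        fts_url(host, tenant, schema, field, query, k),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling fetch_field once with "title" and once with "description" (same schema, same query) yields the two result sets that the merge step consumes.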
what you get back
// merged, sorted client-side
[
  { "doc_id": "p001", "score": 28.92 },   // strong title + description hit
  { "doc_id": "p027", "score": 14.50 },   // description-only hit
  { "doc_id": "p014", "score":  9.10 }
]
how it works
  • Each field has its own inverted index and its own document statistics. BM25 on title is computed only against other titles, so term rarity and length normalisation reflect titles alone and a match in a 4-word title is not diluted by a long description.
  • The two result sets share the doc_id namespace because you used the same PK on both POSTs.
  • You merge by summing weighted scores per doc_id. Docs that match in both fields receive two contributions and tend to rise to the top; docs that match in only one field still appear, carrying just that field's weighted score.
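
The merge described above can be sketched in a few lines of Python. This is an illustration, not a client library: merge_weighted is a hypothetical helper, each per-field result is assumed to be a list of objects with doc_id and score keys, and the scores below are invented.

```python
from collections import defaultdict


def merge_weighted(results_by_field, weights):
    """Sum weight * score per doc_id across fields, highest total first."""
    totals = defaultdict(float)
    for field, hits in results_by_field.items():
        weight = weights[field]
        for hit in hits:
            totals[hit["doc_id"]] += weight * hit["score"]
    # Sort by descending total; break ties on doc_id for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [{"doc_id": doc_id, "score": score} for doc_id, score in ranked]


# Invented scores, for illustration only.
title_hits = [{"doc_id": "a", "score": 4.0}, {"doc_id": "b", "score": 1.0}]
description_hits = [{"doc_id": "b", "score": 3.0}, {"doc_id": "c", "score": 2.5}]

merged = merge_weighted(
    {"title": title_hits, "description": description_hits},
    {"title": 2.0, "description": 1.0},
)
```

Here "b" matches both fields and totals 2.0 × 1.0 + 1.0 × 3.0 = 5.0, which places it above the description-only "c" (2.5) but below the strong title-only "a" (8.0). Because the weights are an ordinary dictionary, they can be tuned per request.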
common mistakes
  • Trying to do the merge server-side. There is no multi-field query mode - merging is your app's responsibility. That's deliberate: it keeps the weights tunable without redeploying.
  • Concatenating fields into one big text blob. You lose the field-level statistics that make title boost work in the first place. Keep them separate.
  • Using too small a k. If a doc only matches one field, it must be inside that field's top-k to appear in the merge. Pull a larger k per field than you intend to surface.
  • Using different doc_ids across fields. The merge joins on doc_id. Use the row's primary key on every field's POST; otherwise scores for the same row never combine and the merged ranking is silently wrong.
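
The effect of too small a k can be shown with a short Python sketch. The rankings are invented, the helper top_n_after_merge is hypothetical, and slicing each list to its first k entries stands in for the server returning only that field's top-k.

```python
def top_n_after_merge(title_hits, description_hits, k, n, w_title=2.0, w_desc=1.0):
    """Keep each field's top k, sum weighted scores, return the best n doc_ids."""
    totals = {}
    for weight, hits in ((w_title, title_hits), (w_desc, description_hits)):
        for doc_id, score in hits[:k]:  # simulates the per-field top-k cutoff
            totals[doc_id] = totals.get(doc_id, 0.0) + weight * score
    return sorted(totals, key=lambda doc_id: (-totals[doc_id], doc_id))[:n]


# Invented per-field rankings, already sorted by score, as (doc_id, score).
title = [("p1", 5.0), ("p2", 4.0), ("p3", 3.5)]
description = [("p5", 1.0), ("p6", 0.5)]

tight = top_n_after_merge(title, description, k=2, n=3)
roomy = top_n_after_merge(title, description, k=6, n=3)
```

With k=2 the title-only match p3 never reaches the merge, so the weaker description hit p5 takes third place; with a larger k, p3 is fetched and its weighted score (7.0) puts it third.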