Schema for natural language.
NL queries compile to the SAME plan tree SQL produces — the executor doesn't know which surface generated it. So the schema requirements mirror SQL: a fully registered table per schema id passed in the request body.
Engine surface: POST /v1/tenants/:t/ask with body { "nl": "…", "schemas": ["…"], "show_plan": true }. The show_plan field returns the compiled Plan tree alongside the rows.
Required schema fields.
Without these, this query surface doesn't function at all.
| field | effect |
|---|---|
| namespace + table + primary_key + [[columns]] (per schema in 'schemas: [...]') | Every schema id passed in the body must be fully registered. The LLM uses these declarations as the prompt context for query compilation. |
Optional fields — what each one unlocks.
Add only the fields whose effect you need. Each one buys a specific capability — speed up a predicate, guard a write, or unlock a new query shape.
| field | type | default | effect |
|---|---|---|---|
| schemas: [...] (body array) | [string] | — | Limits which schemas the LLM considers when compiling the NL. Tighter list = less prompt drift, faster compilation, more predictable plans. |
| [[indexes]] on the underlying schema | object | — | Speeds the compiled SQL — NL doesn't pick indexes itself, but the executor uses them like any other query. |
| [[extractions]] on the underlying schema | object | — | Lets the LLM target derived columns by their declared name in compiled SQL. |
What you can call.
- POST /v1/tenants/:t/ask with { "nl": "…", "schemas": ["…"] }
- Returns either rows directly, or rows + the Plan tree when show_plan: true is set in the body
- Same Plan tree shape as SQL — plan cache is shared
- The LLM behind the compilation is currently a managed AI runtime
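The call shape listed above can be sketched as a pair of small shell helpers. This is an illustration only, not part of the engine or any SDK: `ask_body` and `ask` are hypothetical names, and `BASE`, `T`, and `BEARER` are the same environment variables the curl examples on this page use. The sketch assumes the prompt contains no newlines or control characters.

```shell
# Hypothetical helper: build the JSON body for POST /v1/tenants/:t/ask.
# Usage: ask_body <true|false> <nl text> <schema id>...
ask_body() {
  show_plan=$1
  # Escape backslashes and double quotes so the prompt stays valid JSON.
  nl=$(printf '%s' "$2" | sed 's/\\/\\\\/g; s/"/\\"/g')
  shift 2
  # Join the remaining arguments into a comma-separated list of JSON strings.
  schemas=""
  for id in "$@"; do
    schemas="${schemas:+$schemas,}\"$id\""
  done
  printf '{"nl":"%s","schemas":[%s],"show_plan":%s}\n' "$nl" "$schemas" "$show_plan"
}

# Hypothetical wrapper: send the body to the ask endpoint and print the reply.
# Takes the same arguments as ask_body.
ask() {
  curl -sS -X POST "$BASE/v1/tenants/$T/ask" \
    -H "Authorization: Bearer $BEARER" \
    -H "Content-Type: application/json" \
    -d "$(ask_body "$@")"
}
```

With these defined, `ask true "list customers from India" shop.customers` would send the request with show_plan set. The response format is not described on this page, so the sketch simply prints whatever the server returns.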
Abbreviation legend.
| token | meaning |
|---|---|
| NL | Natural language — the human-typed prompt that gets compiled to a SQL plan |
| ASK | The /v1/tenants/:t/ask endpoint family |
| Plan tree | The same oc_query::Plan AST that SQL parses into. Executor doesn't know if it came from SQL or NL |
| schemas: […] | Body array that scopes which schemas the LLM can target. Smaller list = more accurate compilation |
| show_plan: true | Optional body field that returns the Plan tree alongside the rows for debugging / SDK display |
Worked example.
Schema TOML — copy + register via POST /v1/tenants/:t/schemas with Content-Type: text/plain.
# Every schema in the ask request's schemas: [...] array must be
# fully registered. The LLM uses the column declarations as the
# semantic context — it cannot guess columns that aren't declared.
namespace = "shop"
table = "products"
primary_key = ["id"]
[[columns]]
name = "id"
ty = "str"
required = true
[[columns]]
name = "name"
ty = "str"
[[columns]]
name = "category"
ty = "str"
[[columns]]
name = "price_cents"
ty = "i64"
[[columns]]
name = "description"
ty = "str"
[[indexes]]
name = "by_category"
columns = ["category"]
Queries it enables.
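Before any of the queries below can run, the schema above has to be registered. A minimal sketch of that step follows, assuming the TOML was saved to a local file. The helper name `register_schema` and the file name are hypothetical, and sending the same Bearer token as the ask endpoint is an assumption this page does not confirm.

```shell
# Hypothetical helper: register a schema TOML file for tenant $T.
# Usage: register_schema <path to TOML file>
register_schema() {
  file=$1
  # Fail early, before any network call, if the file is missing or unreadable.
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  # --data-binary sends the file byte for byte; plain -d @file would strip
  # the newlines that TOML depends on.
  # The Authorization header is an assumption carried over from the ask examples.
  curl -sS -X POST "$BASE/v1/tenants/$T/schemas" \
    -H "Authorization: Bearer $BEARER" \
    -H "Content-Type: text/plain" \
    --data-binary "@$file"
}
```

For example, `register_schema products.toml` would post the file saved from the schema above.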
# Show me all electronics products
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
-H "Content-Type: application/json" \
-d '{ "nl": "show me all electronics products", "schemas": ["shop.products"] }'
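# Top spenders (compiles to a GROUP BY)
# Assumes shop.orders and shop.customers are also registered; the schema above only defines shop.products.
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
-H "Content-Type: application/json" \
-d '{ "nl": "which customers spent more than 500 dollars", "schemas": ["shop.orders","shop.customers"] }'
# Return the Plan tree alongside the rows (again assumes shop.customers is registered)
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
-H "Content-Type: application/json" \
-d '{ "nl": "list customers from India", "schemas": ["shop.customers"], "show_plan": true }'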