Schema for natural language.

Natural-language (NL) queries compile to the same Plan tree that SQL produces; the executor cannot tell which surface generated it. The schema requirements therefore mirror SQL: every schema id passed in the request body must refer to a fully registered table.

Engine surface: POST /v1/tenants/:t/ask with body { "nl": "…", "schemas": ["…"], "show_plan": true }. The show_plan field returns the compiled Plan tree alongside the rows.
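A minimal sketch of how the request body above could be assembled in a shell script. The helper name ask_body and its argument order are hypothetical, not part of the engine; it also assumes the prompt and schema ids contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helper (not part of the engine): builds the ask request body.
# Usage: ask_body "<nl prompt>" <true|false> <schema id> [<schema id> ...]
# Assumption: no double quotes or backslashes in the inputs (no JSON escaping).
ask_body() {
  nl=$1
  show_plan=$2
  shift 2
  schemas=""
  for s in "$@"; do
    # Append each schema id as a quoted JSON string, comma-separated.
    schemas="${schemas:+$schemas,}\"$s\""
  done
  printf '{"nl":"%s","schemas":[%s],"show_plan":%s}\n' "$nl" "$schemas" "$show_plan"
}

ask_body "list customers from India" true shop.customers
```

The printed string can be passed to curl with -d as the body of POST /v1/tenants/:t/ask.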

Required schema fields.

Without these, this query surface doesn't function at all.

field	effect
namespace + table + primary_key + [[columns]] (per schema in 'schemas: [...]')	Every schema id passed in the body must be fully registered. The LLM uses these declarations as the prompt context for query compilation.
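Because a missing required field makes this surface unusable, a quick local check before registering a schema may be useful. The sketch below is a hypothetical pre-flight helper (missing_schema_fields is not an engine command). It does a line-based text match, not real TOML parsing, so treat it as a rough guard only.

```shell
# Hypothetical pre-flight check (text match only, not a TOML parser):
# prints each required field that does not appear in the schema file.
# Usage: missing_schema_fields path/to/schema.toml
missing_schema_fields() {
  file=$1
  for key in namespace table primary_key; do
    # Look for a line of the form: key = ...
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" || echo "$key"
  done
  # At least one [[columns]] array-of-tables entry must be declared.
  grep -Eq '^[[:space:]]*\[\[columns\]\]' "$file" || echo "columns"
}
```

An empty output means all four required declarations were found in the file.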

Optional fields — what each one unlocks.

Add only the fields whose effect you need. Each one buys a specific capability — speed up a predicate, guard a write, or unlock a new query shape.

field	type	default	effect
schemas: [...] (body array)	[string]	—	Limits which schemas the LLM considers when compiling the NL. Tighter list = less prompt drift, faster compilation, more predictable plans.
[[indexes]] on the underlying schema	object	—	Speeds the compiled SQL — NL doesn't pick indexes itself, but the executor uses them like any other query.
[[extractions]] on the underlying schema	object	—	Lets the LLM target derived columns by their declared name in compiled SQL.

What you can call.

POST /v1/tenants/:t/ask with { "nl": "…", "schemas": ["…"] }
Returns the rows, or the rows plus the Plan tree when show_plan: true is set in the body
Same Plan tree shape as SQL; the plan cache is shared
The LLM behind compilation is currently a managed AI runtime
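The call above can be wrapped in a small shell function. This is a sketch, not an official client: the function name ask is hypothetical, and BASE, T, and BEARER are assumed to be the same environment variables used by the curl examples on this page.

```shell
# Hypothetical wrapper around the ask endpoint.
# $1 is a complete JSON request body.
# Assumes BASE, T and BEARER are set in the environment.
ask() {
  curl -sS -X POST "$BASE/v1/tenants/$T/ask" \
    -H "Authorization: Bearer $BEARER" \
    -H "Content-Type: application/json" \
    -d "$1"
}

# Example call (commented out so that sourcing this file makes no request):
# ask '{ "nl": "show me all electronics products", "schemas": ["shop.products"] }'
```

Setting Content-Type explicitly matters here because curl -d otherwise sends application/x-www-form-urlencoded.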

Abbreviation legend.

token	meaning
NL	Natural language — the human-typed prompt that gets compiled to a SQL plan
ASK	The /v1/tenants/:t/ask endpoint family
Plan tree	The same oc_query::Plan AST that SQL parses into. Executor doesn't know if it came from SQL or NL
schemas: […]	Body array that scopes which schemas the LLM can target. Smaller list = more accurate compilation
show_plan: true	Optional body field that returns the Plan tree alongside the rows for debugging / SDK display

Worked example.

Schema TOML — copy + register via POST /v1/tenants/:t/schemas with Content-Type: text/plain.

# Every schema in the ask request's schemas: [...] array must be
# fully registered. The LLM uses the column declarations as the
# semantic context — it cannot guess columns that aren't declared.

namespace   = "shop"
table       = "products"
primary_key = ["id"]

[[columns]]
name = "id"
ty = "str"
required = true

[[columns]]
name = "name"
ty = "str"

[[columns]]
name = "category"
ty = "str"

[[columns]]
name = "price_cents"
ty = "i64"

[[columns]]
name = "description"
ty = "str"

[[indexes]]
name    = "by_category"
columns = ["category"]
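Registering the TOML above could look like the following sketch. The function name register_schema is hypothetical, and sending the Authorization header to the schemas endpoint is an assumption carried over from the ask examples. It uses --data-binary rather than -d because curl -d strips newlines when reading from a file, which would break the TOML.

```shell
# Hypothetical helper: registers a schema TOML file for the tenant.
# $1 is the path to the TOML file.
# Assumes BASE, T and BEARER are set; the Authorization header is an assumption.
register_schema() {
  curl -sS -X POST "$BASE/v1/tenants/$T/schemas" \
    -H "Authorization: Bearer $BEARER" \
    -H "Content-Type: text/plain" \
    --data-binary "@$1"
}

# Example call (commented out so that sourcing this file makes no request):
# register_schema products.toml
```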

Queries it enables.

# Show me all electronics products
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
  -H "Content-Type: application/json" \
  -d '{ "nl": "show me all electronics products", "schemas": ["shop.products"] }'

# Top spenders (compiles to a GROUP BY); assumes shop.orders and shop.customers are also registered
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
  -H "Content-Type: application/json" \
  -d '{ "nl": "which customers spent more than 500 dollars", "schemas": ["shop.orders","shop.customers"] }'

# Return the Plan tree alongside the rows; assumes shop.customers is also registered
curl -X POST "$BASE/v1/tenants/$T/ask" -H "Authorization: Bearer $BEARER" \
  -H "Content-Type: application/json" \
  -d '{ "nl": "list customers from India", "schemas": ["shop.customers"], "show_plan": true }'