Depth, not breadth. On purpose.
Six months to 1.0. Every milestone is a measured outcome, not a marketing claim. Here's what's shipped, what's in flight, what's next, and what we have explicitly deferred.
In production today. Running on every managed instance.
Each item below is live in production — running against real customer workloads, observable in /metrics, and covered by the same recovery and replication paths as everything else. The foundation that the rest of the roadmap builds on.
Hash-keyed key-value store with a single write-ahead log. Atomic multi-shape writes — rows, secondary indexes, embeddings, BM25 postings, and graph edges all land on one frame. The foundational layer everything else compiles to.
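For the mechanically inclined, a minimal sketch of the atomicity guarantee, not the production engine: one logical write produces mutations for every shape, and all of them are appended to the log as a single frame. The names (Mutation, Frame, Wal) and the in-memory Vec standing in for the on-disk log are illustrative.

```rust
// Sketch only: every mutation produced by one logical write (the row, its
// secondary-index entry, its embedding, its BM25 postings, its graph edges)
// is grouped into a single frame and appended to the WAL as one unit, so
// recovery sees either all of them or none.

#[derive(Debug)]
enum Mutation {
    Row { key: String, bytes: Vec<u8> },
    IndexEntry { index: String, key: String },
    Embedding { key: String, vector: Vec<f32> },
    Posting { term: String, doc: String, positions: Vec<u32> },
    Edge { from: String, rel: String, to: String },
}

#[derive(Debug)]
struct Frame {
    seq: u64,
    mutations: Vec<Mutation>, // every shape for this write lands here
}

#[derive(Default)]
struct Wal {
    frames: Vec<Frame>, // stand-in for the on-disk log
    next_seq: u64,
}

impl Wal {
    /// Append one atomic multi-shape frame. A real log would fsync here;
    /// the point is that there is exactly one append per logical write.
    fn commit(&mut self, mutations: Vec<Mutation>) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.frames.push(Frame { seq, mutations });
        seq
    }
}

fn main() {
    let mut wal = Wal::default();
    let seq = wal.commit(vec![
        Mutation::Row { key: "doc:1".into(), bytes: b"{\"title\":\"hello\"}".to_vec() },
        Mutation::IndexEntry { index: "by_title".into(), key: "doc:1".into() },
        Mutation::Embedding { key: "doc:1".into(), vector: vec![0.1, 0.2, 0.3] },
        Mutation::Posting { term: "hello".into(), doc: "doc:1".into(), positions: vec![0] },
        Mutation::Edge { from: "doc:1".into(), rel: "cites".into(), to: "doc:0".into() },
    ]);
    println!("committed frame {seq}");
}
```

Because the frame is the unit of durability, crash recovery never has to reconcile a row that exists without its index entry or its postings.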
Typed, planner-driven SQL — no string-pasting, no client-side composition. INNER, LEFT, RIGHT, and FULL OUTER joins; GROUP BY with COUNT / SUM / AVG / MIN / MAX; HAVING filters; CTEs.
Two operating modes on 100k vectors: fast mode at p99 ~37 ms with recall@10 = 0.69, and high_recall mode at p99 ~109 ms with recall@10 = 0.96. Cosine, dot, and L2 are SIMD primitives over a deserialised graph cache.
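A rough sketch of how the two modes typically differ, assuming the mode toggles an HNSW-style search width; the width values are invented, and the scalar cosine kernel below stands in for the SIMD version.

```rust
// Illustrative only: a fast / high_recall switch mapped to a hypothetical
// search-width parameter, plus a scalar cosine kernel. The engine's actual
// parameters and SIMD dispatch are not shown here.

enum SearchMode {
    Fast,       // narrower search: ~37 ms p99, recall@10 ~0.69 on 100k vectors
    HighRecall, // wider search:   ~109 ms p99, recall@10 ~0.96
}

impl SearchMode {
    fn ef_search(&self) -> usize {
        match self {
            SearchMode::Fast => 32,        // hypothetical width
            SearchMode::HighRecall => 256, // hypothetical width
        }
    }
}

/// Scalar cosine similarity; the engine's kernels do the same reduction with SIMD.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    dot / (na.sqrt() * nb.sqrt())
}

fn main() {
    println!("ef = {}", SearchMode::HighRecall.ef_search());
    println!("cos = {}", cosine(&[1.0, 0.0], &[0.6, 0.8]));
}
```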
Lucene-default BM25 ranking, phrase queries via position-list intersection, UAX #29 Unicode tokenizer, and Snowball stemming for 18 languages.
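For concreteness, the Lucene-default BM25 term score (k1 = 1.2, b = 0.75) written out in plain Rust. This is the textbook formula, not the engine's scorer.

```rust
// Lucene-default BM25 for a single term: IDF times a length-normalised
// term-frequency saturation. The final document score sums this over query terms.

fn bm25_term_score(
    tf: f32,          // term frequency in the document
    doc_len: f32,     // tokens in the document
    avg_doc_len: f32, // average document length in the segment
    doc_count: f32,   // documents in the segment
    doc_freq: f32,    // documents containing the term
) -> f32 {
    let k1 = 1.2;
    let b = 0.75;
    let idf = (1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)).ln();
    let norm = tf / (tf + k1 * (1.0 - b + b * doc_len / avg_doc_len));
    idf * norm
}

fn main() {
    // One term appearing 3 times in a 120-token doc, corpus of 10_000 docs, df = 40.
    println!("{:.4}", bm25_term_score(3.0, 120.0, 100.0, 10_000.0, 40.0));
}
```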
BFS up to a configurable max depth, weighted shortest path (Dijkstra) with caller-supplied weight functions, and reverse traversal that stays correct on self-relations.
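A self-contained sketch of the two traversal shapes over a toy adjacency map, assuming a caller-supplied weight closure; the node and edge types are illustrative, not the engine's API.

```rust
// BFS bounded by max depth, and Dijkstra whose edge cost comes from a closure
// the caller supplies. Toy graph representation for illustration only.

use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap, HashSet, VecDeque};

type Node = u32;

struct Edge { to: Node, kind: &'static str }

fn bfs(adj: &HashMap<Node, Vec<Edge>>, start: Node, max_depth: usize) -> Vec<Node> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut out = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        out.push(node);
        if depth == max_depth { continue; }
        for e in adj.get(&node).into_iter().flatten() {
            if seen.insert(e.to) { queue.push_back((e.to, depth + 1)); }
        }
    }
    out
}

fn dijkstra(
    adj: &HashMap<Node, Vec<Edge>>,
    start: Node,
    goal: Node,
    weight: impl Fn(&Edge) -> u64, // caller decides what an edge costs
) -> Option<u64> {
    let mut dist: HashMap<Node, u64> = HashMap::from([(start, 0)]);
    let mut heap = BinaryHeap::from([Reverse((0u64, start))]);
    while let Some(Reverse((d, node))) = heap.pop() {
        if node == goal { return Some(d); }
        if d > *dist.get(&node).unwrap_or(&u64::MAX) { continue; }
        for e in adj.get(&node).into_iter().flatten() {
            let nd = d + weight(e);
            if nd < *dist.get(&e.to).unwrap_or(&u64::MAX) {
                dist.insert(e.to, nd);
                heap.push(Reverse((nd, e.to)));
            }
        }
    }
    None
}

fn main() {
    let adj = HashMap::from([
        (1, vec![Edge { to: 2, kind: "cites" }, Edge { to: 3, kind: "follows" }]),
        (2, vec![Edge { to: 3, kind: "cites" }]),
    ]);
    println!("{:?}", bfs(&adj, 1, 2));
    println!("{:?}", dijkstra(&adj, 1, 3, |e| if e.kind == "cites" { 1 } else { 5 }));
}
```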
Rule compiler with LLM fallback, plan-cached so the LLM is touched once per query shape. Cached /ask responses return without any model invocation; the executor runs the plan, not the model.
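A minimal sketch of the caching idea, with hypothetical names (PlanCache, normalise_shape): literals are stripped to form the shape key, the rule compiler gets the first look, the model is consulted only on a miss, and every later hit skips both.

```rust
// Sketch: cache compiled plans by query shape so the LLM is consulted at most
// once per shape. The normalisation and compiler stubs below are placeholders.

use std::collections::HashMap;

#[derive(Clone)]
struct Plan { steps: Vec<String> }

#[derive(Default)]
struct PlanCache { by_shape: HashMap<String, Plan> }

impl PlanCache {
    fn plan_for(&mut self, question: &str) -> Plan {
        let shape = normalise_shape(question);
        if let Some(plan) = self.by_shape.get(&shape) {
            return plan.clone(); // cached: no model call
        }
        // Rule compiler first; only fall back to the model when rules don't match.
        let plan = compile_rules(question).unwrap_or_else(|| compile_with_llm(question));
        self.by_shape.insert(shape, plan.clone());
        plan
    }
}

fn normalise_shape(q: &str) -> String {
    // Toy normalisation: lowercase and replace literals so "top 5" and "top 10"
    // share the same shape key.
    q.to_lowercase()
        .split_whitespace()
        .map(|w| if w.chars().all(|c| c.is_ascii_digit()) { "<n>" } else { w })
        .collect::<Vec<_>>()
        .join(" ")
}

fn compile_rules(_q: &str) -> Option<Plan> { None } // placeholder rule compiler
fn compile_with_llm(_q: &str) -> Plan { Plan { steps: vec!["scan".into(), "filter".into()] } }

fn main() {
    let mut cache = PlanCache::default();
    cache.plan_for("top 5 suppliers by revenue");
    let plan = cache.plan_for("top 10 suppliers by revenue"); // same shape, cache hit
    println!("{:?}", plan.steps);
}
```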
Sync replication on every paid tier — no acknowledged write is ever lost on writer failure. Sealed-segment archive plus continuous tail-shipper enables sub-second point-in-time recovery.
Panic injection at four WAL boundaries with the recovery invariant verified each run: recovered state equals some prefix of the op stream. Determinism is checked, not assumed.
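The invariant itself is easy to state in code. This is a toy model of the check, not the harness: after a simulated crash, the recovered state must equal the state produced by some prefix of the op stream.

```rust
// Model of the recovery invariant: recovered state == state after *some prefix*
// of the original op stream, never a torn or reordered middle.

use std::collections::BTreeMap;

#[derive(Debug)]
enum Op { Put(&'static str, i64), Del(&'static str) }

fn apply(state: &mut BTreeMap<&'static str, i64>, op: &Op) {
    match op {
        Op::Put(k, v) => { state.insert(*k, *v); }
        Op::Del(k) => { state.remove(k); }
    }
}

/// True iff `recovered` equals the state after some prefix of `ops`.
fn is_prefix_state(ops: &[Op], recovered: &BTreeMap<&'static str, i64>) -> bool {
    let mut state = BTreeMap::new();
    if recovered == &state { return true; } // empty prefix
    for op in ops {
        apply(&mut state, op);
        if recovered == &state { return true; }
    }
    false
}

fn main() {
    let ops = [Op::Put("a", 1), Op::Put("b", 2), Op::Del("a")];
    // Simulate a crash after the second frame was made durable.
    let mut recovered = BTreeMap::new();
    for op in &ops[..2] { apply(&mut recovered, op); }
    assert!(is_prefix_state(&ops, &recovered));
    println!("recovery invariant holds");
}
```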
Verified end-to-end on live infrastructure (drill v2 PASSED). Failover preserves full state via Frame::Snapshot transfer rather than replaying the entire log.
Dedicated host per tenant, region-isolated. Tenancy is physical, not logical — no shared load balancer, no shared disk, no shared memory between customers.
Full OpenAPI 3.1 description served at /openapi.json. First-party SDKs in TypeScript, Python, and Go — all typed against the same spec.
@originchain/mcp-server publishes the database surface as Model Context Protocol tools. Drop-in integration with Claude Desktop, Cursor, and any MCP-aware client.
USD pricing, monthly and annual cycles, prorated mid-cycle changes. Includes a 7-day Whisper-tier trial (one per account) and the full dunning workflow.
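Purely as a hypothetical illustration of what a prorated mid-cycle change means in arithmetic terms; the real billing rules and rounding live in the billing service, and the amounts below are made up.

```rust
// Illustrative proration: credit the unused share of the old plan, charge the
// remaining share of the new one, settle the difference now.

fn prorated_change(old_monthly_cents: i64, new_monthly_cents: i64,
                   days_in_cycle: i64, days_remaining: i64) -> i64 {
    let credit = old_monthly_cents * days_remaining / days_in_cycle;
    let charge = new_monthly_cents * days_remaining / days_in_cycle;
    charge - credit // amount due now (negative would be a credit)
}

fn main() {
    // Upgrade from $29 to $99 with 12 of 30 days left in the cycle.
    println!("due now: {} cents", prorated_change(2900, 9900, 30, 12));
}
```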
Decisions locked. Implementation in progress.
Each spec below has six locked design decisions written down before code. We post the decisions, not just the headlines, so you can read the trade-offs we made and predict how the feature will behave when it lands.
Six locked decisions covering how schemas evolve without downtime — version numbering, allowed shape transitions, backfill pacing, read-path compatibility, and the cutover protocol. A sketch of how they compose on the read path follows the list.
- monotonic version int
- four allowed shapes
- 10% backfill rate
- dual-read transform
- atomic cutover
- abort-only-pre-cutover
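The read-path sketch promised above, with illustrative types (Row, Migration) and a made-up transform; the shipped spec may differ in detail. Every row carries the monotonic schema version it was written at, reads apply the dual-read transform when a row is one version behind, and cutover flips the active version atomically once backfill completes.

```rust
// Sketch of the dual-read transform: rows written at the old schema version are
// lifted into the new shape on read while the paced backfill rewrites them.

use std::collections::HashMap;

#[derive(Clone, Debug)]
struct Row { schema_version: u32, fields: HashMap<String, String> }

struct Migration {
    from: u32,
    to: u32,
    // Dual-read transform: lift an old-shape row into the new shape on the fly.
    transform: fn(&Row) -> Row,
}

fn read(row: &Row, active_version: u32, migration: &Migration) -> Row {
    if row.schema_version == active_version {
        row.clone()
    } else {
        assert_eq!(row.schema_version, migration.from); // only one version behind is allowed
        (migration.transform)(row)
    }
}

fn split_name(old: &Row) -> Row {
    let mut fields = old.fields.clone();
    if let Some(name) = fields.remove("name") {
        let mut parts = name.splitn(2, ' ');
        fields.insert("first".into(), parts.next().unwrap_or("").into());
        fields.insert("last".into(), parts.next().unwrap_or("").into());
    }
    Row { schema_version: 2, fields }
}

fn main() {
    let migration = Migration { from: 1, to: 2, transform: split_name };
    let old = Row {
        schema_version: 1,
        fields: HashMap::from([("name".into(), "Ada Lovelace".into())]),
    };
    let seen = read(&old, migration.to, &migration);
    println!("{:?}", seen.fields.get("last"));
}
```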
First-class plan introspection across every query shape — SQL, vector, BM25, graph, natural language. Same auth path, same plan tree, same cost model. The tail-sampling decision is sketched after the list.
- reads cache
- ?explain=true param
- per-id kill only
- clean flush
- OTLP push
- tail-based sampling
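One of those decisions, tail-based sampling, made concrete as a sketch: the trace is buffered until it finishes and exported (for example over OTLP) only if it was slow or errored. The thresholds and span shape here are illustrative.

```rust
// Tail-based sampling in miniature: decide after the trace completes whether it
// is worth exporting, instead of sampling at the first span.

struct Span { name: &'static str, duration_ms: u64, error: bool }

/// Keep a finished trace if it crossed the latency threshold or contains an error.
fn keep_trace(spans: &[Span], slow_ms: u64) -> bool {
    let total: u64 = spans.iter().map(|s| s.duration_ms).sum();
    total >= slow_ms || spans.iter().any(|s| s.error)
}

fn main() {
    let trace = vec![
        Span { name: "parse", duration_ms: 1, error: false },
        Span { name: "plan", duration_ms: 2, error: false },
        Span { name: "execute", duration_ms: 340, error: false },
    ];
    // A fast, clean trace would be dropped; this one crosses the threshold.
    println!("export = {}", keep_trace(&trace, 250));
}
```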
Sequenced cost accounting per API key, calibrated against the engine's own cost model. The same SIMD kernels that run filters also weigh them. A minimal ledger sketch follows the list.
- sequencing
- engine cost-model
- own SIMD
- per-API-key limits
- mutex stager
- eager backfill
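The ledger sketch referenced above, under assumed units and limits: charges arrive in sequence order, each carrying a cost taken from the engine's cost model, and a request is refused once the key's budget for the window is gone.

```rust
// Sketch of sequenced, per-key cost accounting. Units and limits are illustrative.

use std::collections::HashMap;

#[derive(Default)]
struct CostLedger {
    spent: HashMap<String, u64>, // cost units consumed per API key this window
}

impl CostLedger {
    /// Charge `cost` units to `key`, honouring a per-key limit. Returns false if
    /// the charge would exceed the limit (the request should be rejected).
    fn charge(&mut self, key: &str, cost: u64, limit: u64) -> bool {
        let spent = self.spent.entry(key.to_string()).or_insert(0);
        if *spent + cost > limit { return false; }
        *spent += cost;
        true
    }
}

fn main() {
    let mut ledger = CostLedger::default();
    let limit = 1_000;
    assert!(ledger.charge("key_live_abc", 400, limit));
    assert!(ledger.charge("key_live_abc", 500, limit));
    assert!(!ledger.charge("key_live_abc", 200, limit)); // over budget: rejected
    println!("ledger checks passed");
}
```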
The shape of the commit window, the buckets the rate limiter uses, and the replication transport that makes it all observable to followers. The limiter's bucket-and-Retry-After behaviour is sketched after the list.
- 100 µs / 256-writer commit window
- last-writer-wins
- 4-dim buckets
- per-key config
- 429 + Retry-After
- single-writer WAL byte stream
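The limiter sketch referenced above, under assumptions: a token bucket keyed on four dimensions (which four dimensions the engine actually uses is a guess here), answering either "allowed" or "retry after N seconds", surfaced to clients as 429 + Retry-After.

```rust
// Token-bucket limiter keyed on a four-dimensional bucket key. Dimensions,
// capacities, and refill rates are illustrative.

use std::collections::HashMap;

#[derive(Hash, PartialEq, Eq, Clone)]
struct BucketKey {
    api_key: String,
    endpoint: String,
    method: String,
    region: String, // the four dimensions here are a guess for illustration
}

struct Bucket { tokens: f64, last_refill_s: f64 }

struct Limiter { capacity: f64, refill_per_s: f64, buckets: HashMap<BucketKey, Bucket> }

impl Limiter {
    /// Ok(()) if the request may proceed, Err(seconds) for the Retry-After header.
    fn check(&mut self, key: BucketKey, now_s: f64) -> Result<(), u64> {
        let b = self.buckets.entry(key)
            .or_insert(Bucket { tokens: self.capacity, last_refill_s: now_s });
        b.tokens = (b.tokens + (now_s - b.last_refill_s) * self.refill_per_s).min(self.capacity);
        b.last_refill_s = now_s;
        if b.tokens >= 1.0 {
            b.tokens -= 1.0;
            Ok(())
        } else {
            // Seconds until one token is available, rounded up.
            Err(((1.0 - b.tokens) / self.refill_per_s).ceil() as u64)
        }
    }
}

fn main() {
    let mut limiter = Limiter { capacity: 2.0, refill_per_s: 0.5, buckets: HashMap::new() };
    let key = BucketKey {
        api_key: "key_live_abc".into(), endpoint: "/ask".into(),
        method: "POST".into(), region: "eu-west-1".into(),
    };
    for i in 0..3 {
        match limiter.check(key.clone(), 0.0) {
            Ok(()) => println!("request {i}: allowed"),
            Err(after) => println!("request {i}: 429, Retry-After: {after}"),
        }
    }
}
```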
Six steps to 1.0. In sequence, not in parallel.
The path to 1.0 is depth-first. We finish one step before we open the next, because each milestone changes the assumptions the following one depends on. No breadth features for vectors, no new graph algorithms, no full-text rewrites — those come after 1.0.
Multi-region failover drills and longer-duration replicas. The 2026-04-30 drill verified the protocol; the next phase is verifying it under sustained load and across more topologies.
Randomised op streams, schema-shape generators, and multi-process race testing. The crash-injection harness finds torn frames; the fuzzer's job is to find everything else.
Replace the heuristic plan choice with a calibrated cost model. Per-segment histograms already exist; the optimiser is the layer that turns them into a ranking instead of a hint.
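A miniature of that step, using a simplified equi-width histogram: the per-segment counts become a selectivity estimate, and estimates are what let competing plans be ranked instead of guessed.

```rust
// Turn a per-segment histogram into an estimated row count for a range predicate,
// assuming values are uniform within a bucket. Layout and maths are simplified.

struct Histogram {
    min: f64,
    bucket_width: f64,
    counts: Vec<u64>, // rows per equi-width bucket in this segment
}

impl Histogram {
    /// Estimate how many rows satisfy `value > threshold`.
    fn estimate_gt(&self, threshold: f64) -> f64 {
        let mut total = 0.0;
        for (i, &count) in self.counts.iter().enumerate() {
            let lo = self.min + i as f64 * self.bucket_width;
            let hi = lo + self.bucket_width;
            if threshold <= lo {
                total += count as f64; // whole bucket qualifies
            } else if threshold < hi {
                total += count as f64 * (hi - threshold) / self.bucket_width; // partial bucket
            }
        }
        total
    }
}

fn main() {
    let h = Histogram { min: 0.0, bucket_width: 10.0, counts: vec![100, 80, 40, 10] };
    // Rank "scan + filter price > 25" against an index path by estimated rows.
    println!("estimated rows: {:.0}", h.estimate_gt(25.0));
}
```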
Plan trees with per-node estimated rows, actual rows, time, chunks read, and segments pruned — exposed to customers and to support engineers, not just to internal tooling.
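A minimal shape for such a node, with illustrative field names, plus the kind of check that makes estimate-versus-actual useful in practice.

```rust
// Sketch of a plan node carrying both the estimate and what actually happened.

#[derive(Debug)]
struct PlanNode {
    op: String,
    estimated_rows: u64,
    actual_rows: u64,
    elapsed_ms: f64,
    chunks_read: u64,
    segments_pruned: u64,
    children: Vec<PlanNode>,
}

/// Flag nodes whose estimate was off by more than 10x in either direction,
/// the usual signal that the cost model needs recalibrating for that shape.
fn misestimated(node: &PlanNode, out: &mut Vec<String>) {
    let (est, act) = (node.estimated_rows.max(1) as f64, node.actual_rows.max(1) as f64);
    if est / act > 10.0 || act / est > 10.0 {
        out.push(node.op.clone());
    }
    for child in &node.children {
        misestimated(child, out);
    }
}

fn main() {
    let plan = PlanNode {
        op: "hash_join".into(), estimated_rows: 1_200, actual_rows: 950,
        elapsed_ms: 4.2, chunks_read: 18, segments_pruned: 6,
        children: vec![PlanNode {
            op: "seq_scan(orders)".into(), estimated_rows: 40, actual_rows: 9_000,
            elapsed_ms: 3.1, chunks_read: 18, segments_pruned: 0, children: vec![],
        }],
    };
    let mut bad = Vec::new();
    misestimated(&plan, &mut bad);
    println!("misestimated nodes: {bad:?}");
}
```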
Flips the substrate from single-writer to multi-writer-with-Raft-consensus. The single-writer model is what makes today's atomic multi-shape commit cheap; multi-writer is what makes it scale horizontally.
Atomic cutover, dual-read transform, abort-only-pre-cutover. Spec 06 lands as a customer-facing capability with all six locked decisions in production.
Sequence matters. HA verification feeds the fuzzer's model of what "correct" means; the fuzzer feeds the optimiser's confidence that its plan choices are safe; the optimiser is what makes EXPLAIN useful; multi-writer Raft is the substrate change online migrations sit on top of.
Deliberately deferred. Not before 1.0.
The roadmap is what we will not ship as much as what we will. Items below are real and on the long-range plan — but every one of them depends on a 1.0 primitive that has not landed yet, so shipping them earlier would mean rebuilding them later.
For tables beyond 100M vectors. HNSW carries today's workloads; IVF-PQ is the path when index size becomes the dominant cost.
Graph operations are single-shard today. Distribution comes after multi-writer Raft, because it depends on the same consensus primitive.
Managed-cloud-only is by design — see /pricing. The substrate's recovery, replication, and snapshot guarantees depend on the operational envelope we control.
LangChain, LlamaIndex, and friends. Coming after MCP gets traction — the MCP server is the first integration, and the surface area informs the rest.
Want to see the substrate the roadmap is built on?
The architecture page walks through the hash-keyed substrate, the plan tree, the ingest path, replication, and recovery — the foundations every roadmap milestone depends on. The docs are where you turn it into a working integration.