OriginChain
roadmap

Depth, not breadth. On purpose.

Six months to 1.0. Every milestone is a measured outcome, not a marketing claim. Here's what's shipped, what's in flight, what's next, and what we have explicitly deferred.

status legend
done · in flight · next · deferred
01 · shipped

In production today. Running on every managed instance.

Each item below is live in production — running against real customer workloads, observable in /metrics, and covered by the same recovery and replication paths as everything else. The foundation that the rest of the roadmap builds on.

done
Substrate v1

Hash-keyed key-value store with a single write-ahead log. Atomic multi-shape writes — rows, secondary indexes, embeddings, BM25 postings, and graph edges all land on one frame. The foundational layer everything else compiles to.
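
To make "one frame" concrete, here is a minimal sketch of an atomic multi-shape commit. It is an illustration under stated assumptions, not the production frame format: the Store class, the hash_key scheme, the JSON payload, and the CRC32 checksum are all hypothetical stand-ins.

```python
import hashlib
import json
import zlib
from dataclasses import dataclass, field


def hash_key(shape: str, key: str) -> str:
    """Hypothetical key scheme: hash of the shape namespace plus the logical key."""
    return hashlib.sha256(f"{shape}:{key}".encode()).hexdigest()


@dataclass
class Store:
    wal: list[bytes] = field(default_factory=list)      # stand-in for the on-disk log
    data: dict[str, object] = field(default_factory=dict)

    def commit(self, ops: list[tuple[str, str, object]]) -> None:
        """Encode every op, whatever its shape, into ONE frame, then apply them."""
        payload = json.dumps(ops).encode()
        frame = zlib.crc32(payload).to_bytes(4, "big") + payload
        self.wal.append(frame)  # one append is the single durability point
        for shape, key, value in ops:
            self.data[hash_key(shape, key)] = value

    @staticmethod
    def replay(wal: list[bytes]) -> dict[str, object]:
        """Rebuild state from the log, stopping at the first frame that fails its checksum."""
        data: dict[str, object] = {}
        for frame in wal:
            crc, payload = frame[:4], frame[4:]
            if zlib.crc32(payload).to_bytes(4, "big") != crc:
                break  # torn frame: none of its ops are applied
            for shape, key, value in json.loads(payload):
                data[hash_key(shape, key)] = value
        return data
```

Because the row, its index entry, its embedding, and its edges share a frame, replay either sees all of them or none of them; there is no state in which the row exists but its index entry does not.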

done
SQL with JOINs, aggregates, and CTEs

Typed, planner-driven SQL — no string-pasting, no client-side composition. INNER, LEFT, RIGHT, and FULL OUTER joins; GROUP BY with COUNT / SUM / AVG / MIN / MAX; HAVING filters; CTEs.

done
HNSW vector search with f32 SIMD distance kernels

Two operating modes on 100k vectors: fast mode at p99 ~37 ms with recall@10 = 0.69, and high_recall mode at p99 ~109 ms with recall@10 = 0.96. Cosine, dot, and L2 are SIMD primitives over a deserialised graph cache.
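
For readers who want to reproduce a recall@10 figure on their own data, the sketch below shows how the metric is usually computed: compare the ids an approximate index returns against an exact brute-force top-k. It uses plain Python rather than SIMD kernels, and every function name here is illustrative rather than part of the OriginChain API.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    # Assumes neither vector is all zeros; a zero vector would divide by zero.
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def exact_top_k(query: Vector, vectors: Sequence[Vector], k: int,
                distance: Callable[[Vector, Vector], float]) -> list[int]:
    """Brute-force ground truth: ids of the k nearest vectors."""
    ranked = sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))
    return ranked[:k]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int) -> float:
    """Fraction of the true top-k that the approximate search also returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k
```

Averaging recall_at_k over a held-out query set gives a single number comparable to the recall figures quoted above, provided the same k and the same distance function are used.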

done
BM25 full-text search

Lucene-default BM25 ranking, phrase queries via position-list intersection, UAX #29 Unicode tokenizer, and Snowball stemming for 18 languages.
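
As a rough guide to how BM25 ranks documents, here is a small sketch of the textbook scoring formula. It assumes "Lucene-default" refers to the parameters k1 = 1.2 and b = 0.75, and it takes pre-tokenized input; tokenization, stemming, and phrase matching are out of scope. Implementations differ on constant factors, so absolute scores may not match the engine's, though relative ordering for a single query should.

```python
import math
from collections import Counter
from typing import Sequence

K1 = 1.2   # term-frequency saturation (assumed default)
B = 0.75   # document-length normalization (assumed default)


def bm25_scores(query_terms: Sequence[str], docs: Sequence[Sequence[str]]) -> list[float]:
    """Score each tokenized document against the query terms. Requires at least one non-empty doc."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = K1 * (1 - B + B * len(d) / avg_len)
            score += idf * tf[term] * (K1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

The two knobs do different jobs: k1 caps how much repeated occurrences of a term can add, and b controls how strongly long documents are penalized.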

done
Graph traversal

BFS up to a configurable max depth, weighted shortest path (Dijkstra) with caller-supplied weight functions, and reverse traversal that stays correct on self-relations.
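
The "caller-supplied weight function" part is easiest to see in code. Below is a minimal Dijkstra sketch in which the edge cost comes from a callback instead of being stored on the edge. The adjacency-dict representation and the function signature are assumptions made for the example, not the engine's traversal API.

```python
import heapq
import itertools
from typing import Callable, Hashable, Mapping, Optional, Sequence

Node = Hashable


def shortest_path(adj: Mapping[Node, Sequence[Node]], source: Node, target: Node,
                  weight: Callable[[Node, Node], float]) -> Optional[tuple[float, list[Node]]]:
    """Return (total cost, path) from source to target, or None if unreachable."""
    dist = {source: 0.0}
    prev: dict[Node, Node] = {}
    tie = itertools.count()  # tie-breaker so the heap never compares nodes directly
    heap = [(0.0, next(tie), source)]
    while heap:
        d, _, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist[node]:
            continue  # stale heap entry; a shorter route was already found
        for nxt in adj.get(node, ()):
            w = weight(node, nxt)
            if w < 0:
                raise ValueError("Dijkstra requires non-negative edge weights")
            candidate = d + w
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                prev[nxt] = node
                heapq.heappush(heap, (candidate, next(tie), nxt))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

Self-relations are harmless in this formulation: a self-loop with a non-negative weight can never shorten a path, so it is relaxed once and ignored.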

done
Natural-language /v1/ask

Rule compiler with LLM fallback, plan-cached so the LLM is touched once per query shape. Cached /ask responses return without any model invocation; the executor runs the plan, not the model.
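
A sketch of what "once per query shape" can look like in practice. The query_shape normalizer below (lowercase, literals replaced by placeholders) is a guess at what a shape might be, and compile_plan stands in for the rule compiler with its LLM fallback; neither is taken from the OriginChain implementation.

```python
import re
from typing import Callable


def query_shape(question: str) -> str:
    """Hypothetical shape function: strip literals so similar questions collide."""
    q = question.lower().strip()
    q = re.sub(r"'[^']*'|\"[^\"]*\"", "?", q)   # quoted strings become placeholders
    q = re.sub(r"\b\d+(\.\d+)?\b", "?", q)      # numbers become placeholders
    return re.sub(r"\s+", " ", q)


class PlanCache:
    def __init__(self, compile_plan: Callable[[str], object]) -> None:
        self._compile = compile_plan      # the expensive step (may invoke a model)
        self._plans: dict[str, object] = {}
        self.compiles = 0                 # how many times the expensive step ran

    def plan_for(self, question: str) -> object:
        shape = query_shape(question)
        if shape not in self._plans:
            self.compiles += 1
            self._plans[shape] = self._compile(shape)
        return self._plans[shape]
```

With this structure, "Top 5 customers in 'Berlin'" and "top 10 customers in 'Pune'" resolve to the same cached plan; a real executor would additionally bind the extracted literals as plan parameters, which this sketch leaves out.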

done
Active-passive replication, RPO = 0

Sync replication on every paid tier — no acknowledged write is ever lost on writer failure. Sealed-segment archive plus continuous tail-shipper enables sub-second point-in-time recovery.

done
Crash testing harness

Panic injection at four WAL boundaries with the recovery invariant verified each run: recovered state equals some prefix of the op stream. Determinism is checked, not assumed.
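
The recovery invariant is simple to state as a check. The sketch below models ops as put/del tuples over a dict, which is a simplification chosen for the example; the harness's real op format is not described here.

```python
from typing import Optional, Sequence

Op = tuple[str, str, object]   # (kind, key, value); kind is "put" or "del"


def apply(state: dict, op: Op) -> dict:
    """Return a new state with one op applied."""
    kind, key, value = op
    new_state = dict(state)
    if kind == "put":
        new_state[key] = value
    elif kind == "del":
        new_state.pop(key, None)
    else:
        raise ValueError(f"unknown op kind: {kind}")
    return new_state


def matching_prefix(ops: Sequence[Op], recovered: dict) -> Optional[int]:
    """Length of the shortest op-stream prefix that reproduces `recovered`.

    Returns None when no prefix matches, i.e. the invariant is violated.
    """
    state: dict = {}
    if state == recovered:
        return 0
    for n, op in enumerate(ops, start=1):
        state = apply(state, op)
        if state == recovered:
            return n
    return None
```

A crash test then reduces to one assertion per run: after injecting a panic and recovering, matching_prefix(ops, recovered_state) must not be None.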

done
2026-04-30
HA snapshot bootstrap

Verified end-to-end on live infrastructure (drill v2 PASSED). Failover preserves full state via Frame::Snapshot transfer rather than replaying the entire log.

done
Single-tenant managed cloud

Dedicated host per tenant, region-isolated. Tenancy is physical, not logical — no shared load balancer, no shared disk, no shared memory between customers.

done
OpenAPI 3.1 spec + SDKs

Full OpenAPI 3.1 description served at /openapi.json. First-party SDKs in TypeScript, Python, and Go — all typed against the same spec.

done
MCP server

@originchain/mcp-server publishes the database surface as Model Context Protocol tools. Drop-in integration with Claude Desktop, Cursor, and any MCP-aware client.

done
Per-account Razorpay billing

USD pricing, monthly and annual cycles, prorated mid-cycle changes. Includes a 7-day Whisper-tier trial (one per account) and the full dunning workflow.

02 · in flight

Decisions locked. Implementation in progress.

Each spec below has six locked design decisions written down before code. We post the decisions, not just the headlines, so you can read the trade-offs we made and predict how the feature will behave when it lands.

in flight
Spec 06
Online schema migrations

Six locked decisions covering how schemas evolve without downtime: version numbering, allowed shape transitions, backfill pacing, read-path compatibility, the cutover protocol, and when a migration can still be aborted.

locked decisions
  • monotonic version int
  • four allowed shapes
  • 10% backfill rate
  • dual-read transform
  • atomic cutover
  • abort-only-pre-cutover
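
One plausible reading of "monotonic version int" plus "dual-read transform" is sketched below: every row carries the schema version it was written under, and the read path upgrades old rows on the fly until backfill catches up. The SchemaRegistry class and its methods are hypothetical, and the spec itself may define these terms differently.

```python
from typing import Callable

Row = dict
Upgrade = Callable[[Row], Row]


class SchemaRegistry:
    def __init__(self) -> None:
        self.current = 1                          # versions only ever increase
        self._upgrades: dict[int, Upgrade] = {}   # version n -> transform to n + 1

    def add_version(self, upgrade: Upgrade) -> int:
        """Register the transform from the current version to the next one."""
        self._upgrades[self.current] = upgrade
        self.current += 1
        return self.current

    def read(self, row: Row, written_at: int) -> Row:
        """Present a row in the current shape, whatever version it was written under."""
        if not 1 <= written_at <= self.current:
            raise ValueError(f"unknown schema version: {written_at}")
        version = written_at
        while version < self.current:
            row = self._upgrades[version](row)
            version += 1
        return row
```

Under this reading, readers never observe a mix of shapes during backfill: rows not yet rewritten are transformed at read time, and rows already rewritten pass through unchanged.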
in flight
Spec 05
EXPLAIN endpoint

First-class plan introspection across every query shape — SQL, vector, BM25, graph, natural language. Same auth path, same plan tree, same cost model.

locked decisions
  • reads cache
  • ?explain=true param
  • per-id kill only
  • clean flush
  • OTLP push
  • tail-based sampling
in flight
Spec 03
Per-API-key rate limits + quota

Sequenced cost accounting per API key, calibrated against the engine's own cost model. The same SIMD kernels that run filters also weigh them.

locked decisions
  • sequencing
  • engine cost-model
  • own SIMD
  • per-API-key limits
  • mutex stager
  • eager backfill
in flight
Spec 04
Single-writer WAL byte-stream replication

The shape of the commit window, the buckets the rate limiter uses, and the replication transport that makes it all observable to followers.

locked decisions
  • 100 µs / 256-writer commit window
  • last-writer-wins
  • 4-dim buckets
  • per-key config
  • 429 + Retry-After
  • single-writer WAL byte stream
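
The "429 + Retry-After" decision can be illustrated with a single token bucket. This is a sketch under assumptions: the source does not say the limiter is a token bucket, the four bucket dimensions are not modelled, and the Bucket class is invented for the example. The only standard behaviour relied on is that Retry-After may carry a whole number of seconds.

```python
import math
from dataclasses import dataclass


@dataclass
class Bucket:
    capacity: float        # maximum tokens the bucket can hold
    refill_per_sec: float  # tokens added per second
    tokens: float          # tokens currently available
    updated: float         # timestamp of the last refill, in seconds

    def try_spend(self, cost: float, now: float) -> tuple[int, dict[str, str]]:
        """Return (HTTP status, headers) for a request that costs `cost` tokens."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return 200, {}
        # Seconds until enough tokens have accumulated, rounded up to a whole second.
        wait = (cost - self.tokens) / self.refill_per_sec
        return 429, {"Retry-After": str(math.ceil(wait))}
```

A request that costs more than the bucket's capacity can never succeed in this sketch; a real limiter would reject it outright rather than advertise a retry delay.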
03 · next six months

Six steps to 1.0. In sequence, not in parallel.

The path to 1.0 is depth-first. We finish one step before we open the next, because each milestone changes the assumptions the following one depends on. No vector breadth-features, no graph algorithms, no full-text rewrites — those come after 1.0.

01 next
HA verification at scale

Multi-region failover drills and longer-running replicas. The 2026-04-30 drill verified the protocol; the next phase is verifying it under sustained load and across more topologies.

02 next
Fuzzing campaign

Randomised op streams, schema-shape generators, and multi-process race testing. The crash-injection harness finds torn frames; the fuzzer's job is to find everything else.

03 next
Cost-based optimiser

Replace the heuristic plan choice with a calibrated cost model. Per-segment histograms already exist; the optimiser is the layer that turns them into a ranking instead of a hint.

04 next
EXPLAIN as a first-class debugging surface

Plan trees with per-node estimated rows, actual rows, time, chunks read, and segments pruned — exposed to customers and to support engineers, not just to internal tooling.
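
To show why estimated and actual rows belong side by side, here is a small sketch of a plan tree carrying the per-node fields listed above, with a helper that finds the worst misestimate. The field names, operator names, and text rendering are illustrative only; the EXPLAIN response format is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    op: str                   # operator name (hypothetical values in the usage below)
    est_rows: int
    actual_rows: int
    time_ms: float
    chunks_read: int = 0
    segments_pruned: int = 0
    children: list["PlanNode"] = field(default_factory=list)


def render(node: PlanNode, depth: int = 0) -> list[str]:
    """Indented one-line-per-node view of the plan tree."""
    line = (f"{'  ' * depth}{node.op} est={node.est_rows} actual={node.actual_rows} "
            f"time={node.time_ms:.1f}ms chunks={node.chunks_read} "
            f"pruned={node.segments_pruned}")
    lines = [line]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def misestimate(node: PlanNode) -> float:
    """Ratio between the larger and smaller of estimated and actual rows (>= 1)."""
    low, high = sorted((node.est_rows, node.actual_rows))
    return high / max(1, low)


def worst_estimate(node: PlanNode) -> PlanNode:
    """The node in the tree whose row estimate is furthest from reality."""
    candidates = [node] + [worst_estimate(c) for c in node.children]
    return max(candidates, key=misestimate)
```

In a tree like this, the node returned by worst_estimate is usually the first place to look when a query is slower than expected, since a bad row estimate there tends to distort every choice made above it.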

05 next
Multi-writer Raft

Flips the substrate from single-writer to multi-writer-with-Raft-consensus. The single-writer model is what makes today's atomic multi-shape commit cheap; multi-writer is what makes it scale horizontally.

06 next
Online schema migrations GA

Atomic cutover, dual-read transform, abort-only-pre-cutover. Spec 06 lands as a customer-facing capability with all six locked decisions in production.

Sequence matters. HA verification feeds the fuzzer's model of what "correct" means; the fuzzer feeds the optimiser's confidence that its plan choices are safe; the optimiser is what makes EXPLAIN useful; multi-writer Raft is the substrate change online migrations sit on top of.

04 · beyond 1.0

Deliberately deferred. Not before 1.0.

The roadmap is as much about what we will not ship as what we will. The items below are real and on the long-range plan, but every one of them depends on a 1.0 primitive that has not landed yet, so shipping them earlier would mean rebuilding them later.

deferred
IVF-PQ vector index

For tables beyond 100M vectors. HNSW carries today's workloads; IVF-PQ is the path when index size becomes the dominant cost.

deferred
Distributed graph algorithms

Graph operations are single-shard today. Distribution comes after multi-writer Raft, because it depends on the same consensus primitive.

deferred
Self-host and open-source

Managed-cloud-only is by design — see /pricing. The substrate's recovery, replication, and snapshot guarantees depend on the operational envelope we control.

deferred
Integrations marketplace

LangChain, LlamaIndex, and friends. Coming after MCP gets traction — the MCP server is the first integration, and the surface area informs the rest.

Want to see the substrate the roadmap is built on?

The architecture page walks through the hash-keyed substrate, the plan tree, the ingest path, replication, and recovery — the foundations every roadmap milestone depends on. The docs are where you turn it into a working integration.