Insert data into OriginChain
Every write in OriginChain commits atomically across rows, secondary indexes, vector embeddings, full-text postings, and graph relations on a single WAL frame. Here's how to insert each shape — first independently, then all four together as the differentiator.
Insert a row.
The most common write. Either send SQL or POST a typed JSON body. Both paths hit the same atomic write through the substrate — choose by ergonomics.
SQL
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/sql" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "INSERT INTO shop.customers (id, email, region) VALUES ('\''c_1'\'', '\''ada@example.com'\'', '\''IN'\'')"
}'
Typed HTTP + Python SDK
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/rows/shop.customers" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: $(uuidgen)" \
-d '{
"id": "c_1",
"email": "ada@example.com",
"region": "IN"
}'
db.rows.put(
    "shop.customers",
    { "id": "c_1", "email": "ada@example.com", "region": "IN" },
    idempotency_key="customer-c_1-create",
)
{ "ok": true, "lsn": { "segment": 4, "offset": 8421007 } }
The row, every secondary index entry it touches, every vector or full-text extraction declared on the manifest, and every relation it points at land on a single WAL frame. The fsync acks once. On crash, either the whole row appears or none of it does. The index can never lag the data. See core concepts → substrate.
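The ack carries an lsn as a (segment, offset) pair. Assuming those pairs order lexicographically (an illustration, not a documented guarantee), a client can compare two acks to see which write became durable first. The lsn_tuple helper below is hypothetical:

```python
def lsn_tuple(ack):
    """Extract a comparable (segment, offset) pair from a write ack.

    Sketch only: assumes LSNs order by segment, then offset within
    a segment. Not a documented API; names here are illustrative."""
    lsn = ack["lsn"]
    return (lsn["segment"], lsn["offset"])

a = {"ok": True, "lsn": {"segment": 4, "offset": 8421007}}
b = {"ok": True, "lsn": {"segment": 5, "offset": 12}}
assert lsn_tuple(a) < lsn_tuple(b)  # a was durable before b
```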
Bulk insert.
Three ways to land many rows fast. Pick by source shape: a hand-batched SQL list, a JSON array body, or an NDJSON stream. The 8 MiB body cap is lifted on the streaming batch route — chunk size controls how many rows go in one WAL frame.
Multi-row SQL — up to 1000 rows per call
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/sql" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "INSERT INTO shop.customers (id, email, region) VALUES ('\''c_1'\'', '\''ada@example.com'\'', '\''IN'\''), ('\''c_2'\'', '\''hopper@example.com'\'', '\''US'\''), ('\''c_3'\'', '\''lovelace@example.com'\'', '\''GB'\'')"
}'
JSON-array batch — atomic across the whole array
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/rows/shop.customers/_batch" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: bulk-2026-05-02-01" \
-d '[
{ "id": "c_1", "email": "ada@example.com", "region": "IN" },
{ "id": "c_2", "email": "hopper@example.com", "region": "US" },
{ "id": "c_3", "email": "lovelace@example.com", "region": "GB" }
]'
NDJSON stream — millions of rows, chunked
# customers.ndjson — one row per line
{"id":"c_1","email":"ada@example.com","region":"IN"}
{"id":"c_2","email":"hopper@example.com","region":"US"}
{"id":"c_3","email":"lovelace@example.com","region":"GB"}
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/rows/shop.customers/_batch?chunk=1000" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/x-ndjson" \
--data-binary @customers.ndjson
Python — chunked generator
def gen_customers():
    for i in range(50_000):
        yield { "id": f"c_{i}", "email": f"u{i}@example.com", "region": "IN" }

inserted = db.rows.put_batch(
    "shop.customers",
    gen_customers(),
    chunk=1_000,  # 1k rows per atomic WAL frame
    idempotency_key="bulk-2026-05-02-01",
)
print(f"{inserted} rows accepted")
When to switch to streaming inserts. Past a few hundred MB of source data, send NDJSON. Each chunk lands as one atomic WAL frame; partial-failure retries pick up from the chunk boundary the server confirmed. See ops → observability for backfill metrics during heavy ingest.
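The chunk parameter cuts the stream into per-frame batches before anything hits the wire. As a minimal sketch of that splitting, here is a hypothetical chunked helper built on itertools.islice; the SDK's internals may differ:

```python
from itertools import islice

def chunked(rows, size=1_000):
    """Yield lists of at most `size` rows from any iterable.

    Illustrative only: mirrors how a client might cut a generator
    into per-WAL-frame chunks before streaming them as NDJSON."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

batches = list(chunked(({"id": f"c_{i}"} for i in range(2_500)), size=1_000))
# three chunks: 1000 + 1000 + 500 rows
```

Each yielded batch corresponds to one atomic frame, which is why retries can resume cleanly at a chunk boundary.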
Insert a vector.
Vectors live in the same instance as rows — different key prefix, same WAL, same backups. Send the embedding plus optional metadata; the HNSW graph segment for that table mutates on the same fsync. Per-table dimensionality is enforced.
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/vector/shop.products/put" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "sku-9281",
"embedding": [0.0124, -0.0883, 0.0451, /* ... 768 floats ... */],
"dim": 768,
"metric": "cosine",
"metadata": { "category": "running-shoes", "price": 129.0 }
}'
db.vector.put(
    table="shop.products",
    id="sku-9281",
    embedding=embedding_768d,
    metric="cosine",
    metadata={ "category": "running-shoes", "price": 129.0 },
)
Inserting a vector during a row write is also supported — declare an extraction on the manifest and the embedding lands on the same WAL frame as the row. See core concepts → atomic writes.
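Because dimensionality is enforced per table, a mismatched embedding is rejected server-side. A client-side pre-check saves the round trip; the validate_embedding helper below is a hypothetical guard, not part of the SDK:

```python
def validate_embedding(embedding, dim=768):
    """Reject embeddings whose length doesn't match the table's
    declared dim. Hypothetical client-side guard: the server
    enforces the same constraint on every vector put."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} floats, got {len(embedding)}")
    return embedding

validate_embedding([0.0] * 768)    # passes
# validate_embedding([0.0] * 512)  # would raise ValueError
```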
Insert a full-text document.
Index a text field for BM25 retrieval. Re-indexing the same doc_id retires stale postings in the same frame — no ghost matches. Tokenizer and analyzer pipeline are declared on the manifest.
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/fts/shop.products/description" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"doc_id": "sku-9281",
"text": "Lightweight road runner with a carbon plate, designed for marathon pace."
}'
db.fts.index(
    table="shop.products",
    field="description",
    doc_id="sku-9281",
    text="Lightweight road runner with a carbon plate, designed for marathon pace.",
)
Tokenizer options (unicode, ascii) and the analyzer pipeline live in the manifest. Browse them on schemas → full-text fields.
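To make the analyzer-pipeline idea concrete, here is a toy sketch of how a declared step list might be applied to a field. It covers only a lowercase step; real stemming ("stem:english") and OriginChain's actual tokenizers are out of scope, and the analyze name is hypothetical:

```python
import re

def analyze(text, steps=("lowercase",)):
    """Toy analyzer: split on word characters, then run each declared
    pipeline step over the token list. Illustrates the shape of the
    manifest's analyzer list only; not the server's implementation."""
    tokens = re.findall(r"\w+", text)
    for step in steps:
        if step == "lowercase":
            tokens = [t.lower() for t in tokens]
    return tokens

analyze("Lightweight road runner with a carbon plate")
# → ['lightweight', 'road', 'runner', 'with', 'a', 'carbon', 'plate']
```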
Insert a graph relationship.
Edges aren't a separate write. Declare a [[relations]] block on the manifest and the row that holds the foreign-key column emits the forward + reverse edge automatically when you put it. Self-relations work — direction tags resolve the collision. Overwriting the row retires the old edges in the same frame.
curl -X POST "https://acme.ap-south-1.db.originchain.ai/v1/tenants/$T/rows/shop.products" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"id": "sku-9281",
"name": "Carbon Marathon",
"supplier_id": "sup-44",
"price": 129.0
}'
db.rows.put(
    "shop.products",
    {
        "id": "sku-9281",
        "name": "Carbon Marathon",
        "supplier_id": "sup-44",  # declared as a relation in the manifest
        "price": 129.0,
    },
)
With a relation declared on supplier_id, the same row write lands the row plus a forward edge under rel|fwd|shop.products|supplied_by|sku-9281|sup-44 and the reverse edge under rel|rev|shop.suppliers|supplied_by|sup-44|sku-9281. Walk them with neighbors / BFS / Dijkstra.
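The two keys follow a regular pipe-delimited shape. Assuming the layout shown above holds for any relation, building both keys for a declared edge looks like this (edge_keys is a hypothetical helper, not an SDK call):

```python
def edge_keys(src_table, relation, src_id, target_table, target_id):
    """Build the forward and reverse edge keys for a declared relation,
    assuming the pipe-delimited key layout shown above (illustrative)."""
    fwd = f"rel|fwd|{src_table}|{relation}|{src_id}|{target_id}"
    rev = f"rel|rev|{target_table}|{relation}|{target_id}|{src_id}"
    return fwd, rev

fwd, rev = edge_keys("shop.products", "supplied_by", "sku-9281",
                     "shop.suppliers", "sup-44")
# fwd == "rel|fwd|shop.products|supplied_by|sku-9281|sup-44"
# rev == "rel|rev|shop.suppliers|supplied_by|sup-44|sku-9281"
```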
Atomic multi-shape insert.
The differentiator. A single product enters the catalog as a row, a 768-dimensional embedding, a BM25-indexed description, and a supplier relationship. One database, one bearer, one endpoint — no ETL between Postgres, Pinecone, Elasticsearch, and Neo4j. Declare the shape once on the manifest; every write that touches the row keeps all four projections in lockstep.
The manifest
# manifest.toml — one product. Row, vector, full-text, graph all declared once.
id = "shop.products"
[primary_key]
columns = ["id"]
[[columns]]
name = "id"
type = "str"
[[columns]]
name = "name"
type = "str"
[[columns]]
name = "supplier_id"
type = "str"
[[columns]]
name = "price"
type = "f64"
[[columns]]
name = "description"
type = "str"
[[indexes]]
columns = ["supplier_id"]
[[relations]]
name = "supplied_by"
column = "supplier_id"
target = "shop.suppliers"
[[extractions.fts]]
field = "description"
analyzer = ["lowercase", "stem:english"]
[[extractions.vector]]
field = "description"
dim = 768
metric = "cosine"
Ingest a product
# One product → row + 768d embedding + BM25 description + supplier edge.
# Each call lands in one WAL frame; ordered against the same instance.
product_id = "sku-9281"
description = "Lightweight road runner with a carbon plate, designed for marathon pace."
# 1. Row + the supplier graph edge (declared as a relation on the manifest)
db.rows.put("shop.products", {
    "id": product_id,
    "name": "Carbon Marathon",
    "supplier_id": "sup-44",
    "price": 129.0,
    "description": description,
})
# 2. Vector embedding of the description
db.vector.put(
    table="shop.products",
    id=product_id,
    embedding=embed(description),  # 768d f32
    metric="cosine",
    metadata={ "category": "running-shoes", "price": 129.0 },
)
# 3. BM25 inverted index on description
db.fts.index(
    table="shop.products",
    field="description",
    doc_id=product_id,
    text=description,
)
The same product is now visible to a SQL SELECT, a vector topk, a BM25 search, and a graph neighbors walk — the moment the WAL fsyncs. No reconciliation jobs, no eventual consistency, no second-system staleness.