Storage API¶

Storage endpoints handle ingestion, retrieval, reconstruction, deletion, and listing of compressed data.

PUT — Ingest data¶

Compress and store a tensor or binary blob. The engine runs QTT-SVD decomposition, evaluates fidelity via the Oracle, and persists TT-cores to disk.

POST /v1/put

Request body¶

Field	Type	Required	Description
`key`	string	Yes	Unique key within the namespace
`namespace`	string	No	Target namespace (default: `default`)
`data_b64`	string	Yes	Base64-encoded NumPy array or raw bytes
`metadata`	object	No	Arbitrary JSON metadata attached to the entry
`domain`	string	No	Domain hint for Oracle (default: `structured_data`)

Example¶

curlPython

curl -X POST https://gate.holonomx.com/v1/put \
  -H "X-Api-Key: $HX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "sensor_reading_001",
    "namespace": "telemetry",
    "data_b64": "'$(python3 -c "import numpy as np, base64; print(base64.b64encode(np.random.randn(1024).tobytes()).decode())")'",
    "metadata": {"source": "turbine_7", "unit": "Pa"}
  }'

result = hx.put(
    "sensor_reading_001",
    np.random.randn(1024),
    metadata={"source": "turbine_7", "unit": "Pa"},
    namespace="telemetry",
)
print(result.compression_ratio)  # e.g., 12.5

Response — `200 OK`¶

{
  "key": "sensor_reading_001",
  "namespace": "telemetry",
  "version": 1,
  "compressed": true,
  "verdict": "exact",
  "compression_ratio": 12.5,
  "original_bytes": 8192,
  "stored_core_bytes": 655,
  "n_cores": 10
}

Field	Description
`version`	Auto-incremented version; each PUT to the same key creates a new version
`verdict`	Oracle quality assessment: `exact`, `safe`, `weak`, or `passthrough`
`compressed`	`false` if the data fell below the compression threshold
`compression_ratio`	Original bytes ÷ stored core bytes

CU cost: 1.0

PUT Cores — Direct TT-core upload¶

Upload pre-decomposed TT-cores (skip engine-side SVD). Useful when the client has already performed decomposition.

POST /v1/put_cores

CU cost: 1.0

POST /v1/put_cores_batch

CU cost: 2.0

GET — Retrieve metadata¶

Fetch compression fingerprint and metadata for a stored key. This does not reconstruct the dense array.

GET /v1/get/{namespace}/{key}

Path parameters¶

Parameter	Type	Description
`namespace`	string	Target namespace
`key`	string	Entry key

Query parameters¶

Parameter	Type	Default	Description
`version`	int	latest	Specific version to retrieve

Example¶

curlPython

curl https://gate.holonomx.com/v1/get/telemetry/sensor_reading_001 \
  -H "X-Api-Key: $HX_API_KEY"

info = hx.get("sensor_reading_001", namespace="telemetry")
print(info.n_cores, info.ranks)

Response — `200 OK`¶

{
  "key": "sensor_reading_001",
  "namespace": "telemetry",
  "version": 1,
  "oracle_verdict": "exact",
  "compression_ratio": 12.5,
  "n_cores": 10,
  "ranks": [1, 4, 8, 16, 16, 16, 8, 4, 2, 1],
  "max_rank": 16,
  "total_core_bytes": 655,
  "original_bytes": 8192,
  "metadata": {"source": "turbine_7", "unit": "Pa"},
  "fingerprint": {
    "frobenius_error": 1.2e-14,
    "max_abs_error": 3.1e-15,
    "relative_error": 1.5e-15
  },
  "created_at": 1712345678.123
}

CU cost: 0.1

SERVE — Reconstruct data¶

Decompress and return the full dense array, or load TT-cores directly into GPU VRAM for zero-copy serving.

POST /v1/serve

Request body¶

Field	Type	Required	Description
`key`	string	Yes	Entry key
`namespace`	string	No	Namespace (default: `default`)
`version`	int	No	Specific version (default: latest)
`target`	string	No	`dense` (default) or `gpu`
`device`	string	No	GPU device for `target=gpu` (default: `cuda:0`)

Dense reconstruction¶

curlPython

curl -X POST https://gate.holonomx.com/v1/serve \
  -H "X-Api-Key: $HX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "sensor_reading_001", "namespace": "telemetry"}'

result = hx.serve("sensor_reading_001", namespace="telemetry")
print(result.data.shape)  # (1024,)

CU cost: 0.5

GPU serving¶

Load TT-cores directly into VRAM without dense materialization. The response confirms placement; subsequent operations reference the on-device cores.

gpu_result = hx.serve(
    "sensor_reading_001",
    target="gpu",
    device="cuda:0",
    namespace="telemetry",
)
print(gpu_result.gpu_core_bytes)  # bytes in VRAM

CU cost: 2.0

DELETE — Remove entry¶

Soft-delete an entry. Tombstone is retained for compaction retention period (default 30 days).

DELETE /v1/delete/{namespace}/{key}

CU cost: 0.1

LIST — Enumerate keys¶

List keys in a namespace, optionally filtered by prefix.

GET /v1/list/{namespace}

Query parameters¶

Parameter	Type	Default	Description
`prefix`	string	`""`	Key prefix filter
`limit`	int	1000	Maximum results
`offset`	int	0	Pagination offset

CU cost: 0.1

Storage API¶

PUT — Ingest data¶

Request body¶

Example¶

Response — 200 OK¶

PUT Cores — Direct TT-core upload¶

GET — Retrieve metadata¶

Path parameters¶

Query parameters¶

Example¶

Response — 200 OK¶

SERVE — Reconstruct data¶

Request body¶

Dense reconstruction¶

GPU serving¶

DELETE — Remove entry¶

LIST — Enumerate keys¶

Query parameters¶

Response — `200 OK`¶

Response — `200 OK`¶