Concepts¶
Core concepts behind HX-SDP's architecture and compression pipeline.
QTT compression¶
HX-SDP stores all data in Quantized Tensor Train (QTT) format. Instead of storing raw arrays, values are decomposed into a chain of small 3D tensors called TT-cores:
$$ A(i_1, i_2, \ldots, i_d) = G_1(i_1) \cdot G_2(i_2) \cdots G_d(i_d) $$
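Each core $G_k$ is a 3D tensor of size $r_{k-1} \times 2 \times r_k$, where $r_k$ is the bond dimension (rank) at position $k$ and $r_0 = r_d = 1$. Fixing the binary index $i_k$ selects an $r_{k-1} \times r_k$ matrix $G_k(i_k)$, so the product above evaluates to a single scalar.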
Why this matters:
- A 1,000,000-element array stored dense = 8 MB (float64)
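- The same array in QTT is about 5 KB when the effective ranks stay near 4, a 1,600× compression ratio. With every bond at a maximum rank of 16, the storage formula under TT-cores gives about 55 KB, still roughly 150× smaller than dense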
- All operations (similarity, search, reconstruction) work directly on the compressed cores
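The decomposition above can be illustrated with a textbook TT-SVD in NumPy. This is a minimal sketch under stated assumptions, not HX-SDP's implementation: the function names `tt_svd`, `tt_to_dense`, and `tt_element`, the `tol` and `max_rank` parameters, and the test signal are all hypothetical. It shows a length-$2^d$ array being split into $d$ cores and a single element being read back from the cores without rebuilding the array.

```python
import numpy as np

def tt_svd(vec, tol=1e-10, max_rank=16):
    """Split a length-2**d vector into d cores of shape (r_prev, 2, r_next)."""
    vec = np.asarray(vec, dtype=np.float64)
    d = int(vec.size).bit_length() - 1
    if vec.ndim != 1 or d < 1 or 2 ** d != vec.size:
        raise ValueError("expected a 1D array whose length is a power of two (>= 2)")
    cores = []
    r_prev = 1
    rest = vec.reshape(1, -1)
    for _ in range(d - 1):
        # Rows combine (previous bond, current bit); columns hold the remaining bits.
        mat = rest.reshape(r_prev * 2, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # Keep singular values above the relative tolerance, capped at max_rank.
        keep = max(1, min(max_rank, int(np.sum(s > tol * s[0]))))
        cores.append(u[:, :keep].reshape(r_prev, 2, keep))
        rest = s[:keep, None] * vt[:keep]
        r_prev = keep
    cores.append(rest.reshape(r_prev, 2, 1))
    return cores

def tt_to_dense(cores):
    """Contract all cores back into the full array (for checking only)."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(-1)

def tt_element(cores, index):
    """Evaluate A(i_1, ..., i_d) as a product of small matrices, one per bit."""
    d = len(cores)
    row = np.ones(1)
    for k, core in enumerate(cores):
        bit = (index >> (d - 1 - k)) & 1  # i_1 is the most significant bit
        row = row @ core[:, bit, :]
    return float(row[0])

d = 10
x = np.linspace(0.0, 1.0, 2 ** d)
signal = np.sin(2 * np.pi * x)  # smooth signals tend to have low QTT rank
cores = tt_svd(signal)
approx = tt_to_dense(cores)
rel_err = np.linalg.norm(approx - signal) / np.linalg.norm(signal)
stored = sum(core.size for core in cores)
print([core.shape for core in cores])
print(f"stored values: {stored} of {signal.size}, relative error: {rel_err:.1e}")
```

How much is saved depends entirely on the ranks the data needs: smooth or structured signals compress well, while noise-like data may not compress at all.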
TT-cores¶
A TT-core is a 3D tensor of shape $r_{k-1} \times 2 \times r_k$. The full collection of cores for one entry represents the complete data to within the compression tolerance.
Key properties:
- Bond dimension (rank): Controls compression vs. fidelity trade-off. Higher rank = higher fidelity, more storage.
- Number of cores ($d$): For an array of length $2^d$, there are $d$ cores. A 1024-element array → 10 cores.
- Storage: Total bytes = $\sum_{k=1}^d r_{k-1} \times 2 \times r_k \times 8$ (float64). Scales as $O(d \cdot r^2)$, not $O(2^d)$.
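The storage formula can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `capped_ranks` and `qtt_storage_bytes` are hypothetical, and it assumes each bond rank is the smaller of the user-chosen cap and the structural limit $\min(2^k, 2^{d-k})$ that any TT decomposition obeys.

```python
def capped_ranks(d, max_rank):
    """Ranks r_0..r_d: bond k can never exceed 2**k or 2**(d-k), nor the cap."""
    return [min(2 ** k, 2 ** (d - k), max_rank) for k in range(d + 1)]

def qtt_storage_bytes(ranks, bytes_per_value=8):
    """Sum of r_{k-1} * 2 * r_k * bytes_per_value over all d cores."""
    return sum(ranks[k - 1] * 2 * ranks[k] * bytes_per_value
               for k in range(1, len(ranks)))

d = 20                                   # 2**20 = 1,048,576 elements
dense_bytes = (2 ** d) * 8               # float64 dense storage
rank16_bytes = qtt_storage_bytes(capped_ranks(d, 16))
rank4_bytes = qtt_storage_bytes(capped_ranks(d, 4))

print(dense_bytes)    # 8388608
print(rank16_bytes)   # 54592
print(rank4_bytes)    # 4416
```

The same helpers show when compression does not pay off: for $d = 10$ and a cap of 16, the worst-case core storage exceeds the 8,192 bytes of the dense array.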
Oracle¶
Every PUT operation passes through the Oracle — a fidelity evaluator that compares the compressed representation against the original:
| Verdict | Meaning |
|---|---|
| `exact` | Relative error < $10^{-12}$: effectively lossless |
| `safe` | Relative error < $10^{-6}$: suitable for most applications |
| `weak` | Relative error < $10^{-3}$: lossy but structure-preserving |
| `passthrough` | Compression ratio below threshold: stored uncompressed |
The Oracle verdict is returned with every PUT response and stored in metadata.
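The verdict table can be read as a simple threshold ladder. The sketch below is one plausible reading, not the Oracle's actual logic: the function names, the L2 relative-error definition, the `min_ratio` parameter, the order of the checks, and the fallback for errors of $10^{-3}$ or more are all assumptions that the table does not specify.

```python
import math

def relative_error(original, reconstructed):
    """L2 relative error (assumed norm; the source does not say which is used)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den if den > 0 else num

def oracle_verdict(rel_err, compression_ratio, min_ratio=1.0):
    """Map an error and a compression ratio to one of the four verdicts."""
    # Assumption: a poor compression ratio is checked first.
    if compression_ratio < min_ratio:
        return "passthrough"
    if rel_err < 1e-12:
        return "exact"
    if rel_err < 1e-6:
        return "safe"
    if rel_err < 1e-3:
        return "weak"
    # Assumption: errors of 1e-3 or more are not covered by the table;
    # this sketch falls back to storing the data uncompressed.
    return "passthrough"

print(oracle_verdict(3e-9, compression_ratio=150.0))
```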
Namespaces¶
Namespaces provide logical isolation within a single HX-SDP instance:
- Each tenant can access one or more namespaces
- Keys are unique within a namespace
- Cross-namespace queries are not supported; this is by design, to guarantee isolation
- Default namespace: `default`
Versions¶
Every PUT to the same key creates a new version. Versions are immutable — the engine never overwrites data:
sensor_001 v1 → Original reading
sensor_001 v2 → Updated reading (v1 still accessible)
sensor_001 v3 → Latest reading
GET and SERVE return the latest version by default. Pass version=N to retrieve a specific version.
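The versioning and namespace rules can be modeled with an append-only map. This is an in-memory sketch for illustration, not the engine's storage code: the `VersionedStore` class and its method signatures are hypothetical, and it assumes version numbers start at 1 and increase by one per PUT.

```python
class VersionedStore:
    """Append-only store: every put adds a version, nothing is overwritten."""

    def __init__(self):
        # (namespace, key) -> list of values; list index i holds version i + 1
        self._data = {}

    def put(self, key, value, namespace="default"):
        versions = self._data.setdefault((namespace, key), [])
        versions.append(value)
        return len(versions)  # the version number just created

    def get(self, key, version=None, namespace="default"):
        versions = self._data[(namespace, key)]  # KeyError if the key is unknown
        if version is None:
            return versions[-1]  # latest version by default
        if not 1 <= version <= len(versions):
            raise KeyError(f"{key} has no version {version}")
        return versions[version - 1]

store = VersionedStore()
store.put("sensor_001", "Original reading")
store.put("sensor_001", "Updated reading")
latest_version = store.put("sensor_001", "Latest reading")

print(latest_version)                      # 3
print(store.get("sensor_001"))             # Latest reading
print(store.get("sensor_001", version=1))  # Original reading
```

Keying the map on the `(namespace, key)` pair means the same key can exist independently in two namespaces, which mirrors the isolation described under Namespaces.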
Tenants¶
In multi-tenant deployments (via HX-Gate), each tenant has:
- A unique API key (SHA-256 hashed at rest)
- A namespace ACL controlling which namespaces they can access
- A billing tier with CU quota and rate limits
- An audit trail of all operations
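The first two items, hashed API keys and namespace ACLs, can be sketched with the standard library. This is a hypothetical illustration rather than HX-Gate's code: the `Tenant` record, the `hash_key` and `is_authorized` helpers, and the example key are invented for the example. Only the digest of the key is stored, and digests are compared in constant time.

```python
import hashlib
import hmac
from dataclasses import dataclass

@dataclass(frozen=True)
class Tenant:
    name: str
    api_key_sha256: str    # hex digest; the plaintext key is never stored
    namespaces: frozenset  # namespace ACL

def hash_key(api_key):
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

def is_authorized(tenants, api_key, namespace):
    """True only if the key matches a tenant whose ACL includes the namespace."""
    digest = hash_key(api_key)
    for tenant in tenants:
        # compare_digest avoids leaking match position through timing
        if hmac.compare_digest(tenant.api_key_sha256, digest):
            return namespace in tenant.namespaces
    return False

tenants = [
    Tenant("acme", hash_key("example-key-123"), frozenset({"default", "lab"})),
]

print(is_authorized(tenants, "example-key-123", "lab"))      # True
print(is_authorized(tenants, "example-key-123", "finance"))  # False
print(is_authorized(tenants, "wrong-key", "lab"))            # False
```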
Compute Units (CUs)¶
Every API operation costs a defined number of Compute Units. CUs are metered per tenant per billing period. See Billing & Usage for the complete cost table.
Architecture overview¶
Client → HX-Gate (proxy) → HX-Engine (GPU worker) → QTT Store (disk)
         │                 │
         ├─ Auth + ACL     ├─ TT-SVD decomposition
         ├─ Rate limiting  ├─ Oracle fidelity check
         ├─ CU metering    ├─ Core-native similarity
         └─ Audit logging  └─ GPU VRAM serving
- HX-Gate (port 8080): Stateless reverse proxy. Handles authentication, rate limiting, CU metering, and routing. Scales horizontally.
- HX-Engine (port 8000): GPU-accelerated compute worker. Runs TT-SVD, core operations, and PDE solvers. Scales vertically (GPU count).
- Redis (port 6379): Optional shared state for multi-instance Gate deployments.
- QTT Store: On-disk storage of TT-cores. One directory per namespace, one file per key-version.
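The store layout (one directory per namespace, one file per key-version) could map to paths as in the sketch below. The naming scheme is purely an assumption for illustration: the `core_file_path` helper, the `.v<N>.qtt` suffix, and the example root directory do not come from HX-SDP.

```python
from pathlib import PurePosixPath

def core_file_path(root, namespace, key, version):
    """Build <root>/<namespace>/<key>.v<version>.qtt (hypothetical naming)."""
    if version < 1:
        raise ValueError("versions are assumed to start at 1")
    for part in (namespace, key):
        # Reject names that would escape the namespace directory.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root) / namespace / f"{key}.v{version}.qtt"

print(core_file_path("/var/lib/hx", "default", "sensor_001", 2))
# /var/lib/hx/default/sensor_001.v2.qtt
```

Putting the version in the file name keeps each key-version a separate file, so writing a new version never touches an existing one.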