Billing & Usage

HX-SDP meters all API operations in Compute Units (CUs). Each tenant has a monthly CU quota determined by their billing tier.


CU cost table

| Operation | Endpoint | CU cost | Description |
|---|---|---|---|
| put | POST /v1/put | 1.0 | Ingest + QTT-SVD compression |
| put_cores | POST /v1/put_cores | 1.0 | Direct TT-core upload |
| put_cores_batch | POST /v1/put_cores_batch | 2.0 | Batch TT-core upload |
| get | GET /v1/get/{ns}/{key} | 0.1 | Metadata retrieval (no decompress) |
| serve | POST /v1/serve (dense) | 0.5 | Dense array reconstruction |
| serve_gpu | POST /v1/serve (gpu) | 2.0 | Load TT-cores to GPU VRAM |
| search | POST /v1/search | 0.5 | Metadata-filtered search |
| delete | DELETE /v1/delete/{ns}/{key} | 0.1 | Soft-delete entry |
| list | GET /v1/list/{ns} | 0.1 | List keys in namespace |
| query_similarity | POST /v1/query/similarity | 1.0 | Pairwise core inner product |
| query_topk | POST /v1/query/topk | 1.0 | Top-K similarity search |
| query_vector | POST /v1/query/vector | 1.0 | External vector top-K search |
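
As a rough illustration of how the table translates into a budget, the sketch below totals the CUs for a planned workload. The `CU_COST` mapping and the `estimate_cus` helper are hypothetical names introduced here, not part of any HX-SDP client, and the sketch assumes each cost is a flat charge per request.

```python
# CU costs copied from the table above; keys are the operation names.
CU_COST = {
    "put": 1.0,
    "put_cores": 1.0,
    "put_cores_batch": 2.0,
    "get": 0.1,
    "serve": 0.5,
    "serve_gpu": 2.0,
    "search": 0.5,
    "delete": 0.1,
    "list": 0.1,
    "query_similarity": 1.0,
    "query_topk": 1.0,
    "query_vector": 1.0,
}


def estimate_cus(request_counts):
    """Return total CUs for a mapping of operation name -> request count.

    Assumes a flat per-request charge, as listed in the table.
    """
    unknown = set(request_counts) - set(CU_COST)
    if unknown:
        raise KeyError(f"unknown operations: {sorted(unknown)}")
    return sum(CU_COST[op] * count for op, count in request_counts.items())


# Hypothetical monthly workload.
workload = {"put": 5000, "query_topk": 3200, "serve": 4000}
print(estimate_cus(workload))  # 10200.0
```

Comparing the result against a tier's monthly CU quota gives a first estimate of which tier a workload fits into.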

The X-HX-CUs response header reports the cost charged for each request.
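
A client can keep its own running total by reading this header from each response. The following is a minimal sketch that assumes the header value is a plain decimal number such as `0.5`; `charged_cus` is a hypothetical helper, and the header dictionaries stand in for real HTTP responses.

```python
def charged_cus(headers):
    """Return the CU cost reported in response headers, or None if absent.

    Assumes the X-HX-CUs value is a plain decimal string (an assumption,
    not confirmed by the documentation).
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-hx-cus")
    return float(raw) if raw is not None else None


# Stand-ins for the headers of three responses; the last one has no CU header.
responses = [
    {"X-HX-CUs": "1.0"},
    {"x-hx-cus": "0.5"},
    {"Content-Type": "application/json"},
]
costs = [charged_cus(h) for h in responses]
total = sum(c for c in costs if c is not None)
print(total)  # 1.5
```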


Billing tiers

Managed (SaaS)

| Tier | Price | Monthly CUs | Rate limit | Overage |
|---|---|---|---|---|
| Starter | Free | 1,000 | 10 req/min | |
| Builder | $49/mo | 50,000 | 100 req/min | $0.05/CU |
| Pro | $299/mo | 500,000 | 1,000 req/min | $0.03/CU |
| Enterprise | Custom | Custom | Custom | Custom |
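
To make the tier arithmetic concrete, here is a sketch of a monthly bill estimate. It assumes overage is charged linearly per CU above the monthly quota and added to the base price; the page does not spell out the exact formula, so treat this as an approximation. `TIERS` and `estimate_monthly_bill` are hypothetical names.

```python
from decimal import Decimal

# Figures copied from the table above. Starter lists no overage rate and
# Enterprise is custom, so both are left out of this sketch.
TIERS = {
    "builder": {"price": Decimal("49"), "quota": 50_000, "overage": Decimal("0.05")},
    "pro": {"price": Decimal("299"), "quota": 500_000, "overage": Decimal("0.03")},
}


def estimate_monthly_bill(tier, cus_used):
    """Base price plus per-CU overage for usage beyond the monthly quota.

    Assumption: overage = (CUs above quota) * (overage rate).
    """
    plan = TIERS[tier]
    # Decimal avoids binary floating-point rounding in currency amounts.
    excess = max(Decimal(0), Decimal(str(cus_used)) - plan["quota"])
    return plan["price"] + excess * plan["overage"]


print(estimate_monthly_bill("builder", 60_000))  # 549.00
print(estimate_monthly_bill("pro", 450_000))     # 299
```

Under this assumption, a Builder tenant using 60,000 CUs pays the $49 base plus 10,000 CUs of overage at $0.05 each.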

Self-hosted

Self-hosted deployments have no CU limits by default. Enable metering via HX_GATE_BILLING_ENABLED=true if you want to track usage across internal tenants.


Checking usage

Via API

# All tenants (admin)
curl https://gate.holonomx.com/gate/admin/usage \
  -H "Authorization: Bearer $SERVICE_KEY"

# Single tenant (admin)
curl https://gate.holonomx.com/gate/admin/usage/acme-corp \
  -H "Authorization: Bearer $SERVICE_KEY"
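
The same calls can be made from a script. The sketch below mirrors the curl commands above using only the Python standard library; `build_usage_request` and `fetch_usage` are hypothetical helper names, and the response is assumed to be a JSON body.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://gate.holonomx.com"


def build_usage_request(service_key, tenant_id=None):
    """Build a GET request for all tenants, or for one tenant if given."""
    path = "/gate/admin/usage"
    if tenant_id:
        # Percent-encode the tenant ID so it is safe as a path segment.
        path += "/" + urllib.parse.quote(tenant_id, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {service_key}"},
    )


def fetch_usage(service_key, tenant_id=None):
    """Send the request and decode the JSON body (requires network access)."""
    request = build_usage_request(service_key, tenant_id)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `fetch_usage(os.environ["SERVICE_KEY"], "acme-corp")` would correspond to the single-tenant curl command.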

Via Console

The Console dashboard shows a real-time usage bar for the authenticated tenant.

Response example

{
  "tenant_id": "acme-corp",
  "period": "2026-01",
  "cus_used": 12450.5,
  "monthly_quota": 500000,
  "utilization": 0.0249,
  "breakdown": {
    "put": 5000.0,
    "get": 250.0,
    "query_topk": 3200.0,
    "serve": 2000.0,
    "search": 1500.0,
    "delete": 500.5
  }
}
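
The fields in this response are enough to derive remaining quota and the heaviest operation. The sketch below assumes, based only on the sample above, that `utilization` is `cus_used / monthly_quota` rounded to four decimals and that the `breakdown` values sum to `cus_used`. `summarize_usage` is a hypothetical helper.

```python
def summarize_usage(report):
    """Derive a few convenience figures from a usage report dictionary."""
    used = report["cus_used"]
    quota = report["monthly_quota"]
    breakdown = report["breakdown"]
    return {
        "remaining": quota - used,
        "utilization": round(used / quota, 4),
        # Sanity check: per-operation figures should add up to the total.
        "breakdown_matches": abs(sum(breakdown.values()) - used) < 1e-6,
        "top_operation": max(breakdown, key=breakdown.get),
    }


# The sample response from above.
sample = {
    "tenant_id": "acme-corp",
    "period": "2026-01",
    "cus_used": 12450.5,
    "monthly_quota": 500000,
    "utilization": 0.0249,
    "breakdown": {
        "put": 5000.0,
        "get": 250.0,
        "query_topk": 3200.0,
        "serve": 2000.0,
        "search": 1500.0,
        "delete": 500.5,
    },
}
print(summarize_usage(sample))
```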

Quota enforcement

When a tenant exceeds their monthly CU quota:

  1. Further requests are rejected with 503 Service Unavailable and the following response body:
    {"detail": "Monthly CU quota exhausted for tenant 'acme-corp'"}
    
  2. The tenant can upgrade their tier immediately via the Console or API.
  3. Enterprise tenants can configure soft limits with overage billing.
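
Because 503 is also used for ordinary outages, a client may want to tell quota exhaustion apart from a transient failure before retrying. The sketch below does this by inspecting the `detail` field; it assumes the message always begins with the wording shown above, which is not guaranteed. `is_quota_exhausted` is a hypothetical helper.

```python
import json

# Assumed stable prefix of the quota error message shown above.
QUOTA_PREFIX = "Monthly CU quota exhausted"


def is_quota_exhausted(status_code, body_text):
    """Return True if a response looks like the quota-exhausted error."""
    if status_code != 503:
        return False
    try:
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        return False
    return isinstance(detail, str) and detail.startswith(QUOTA_PREFIX)


body = '{"detail": "Monthly CU quota exhausted for tenant \'acme-corp\'"}'
print(is_quota_exhausted(503, body))           # True
print(is_quota_exhausted(503, "Bad Gateway"))  # False
```

A client that sees `True` here should stop retrying and surface the upgrade path instead, since retries will keep failing until the quota resets or the tier changes.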

Stripe integration

Managed deployments use Stripe for subscription management:

  • Checkout: POST /gate/checkout creates a Stripe Checkout session.
  • Provisioning: On successful payment, a webhook (POST /webhooks/stripe) automatically provisions the tenant with the corresponding tier.
  • Upgrades/downgrades: Call POST /gate/onboard/update-tier, or let the customer manage their subscription through the Stripe Customer Portal.

Required environment variables:

| Variable | Description |
|---|---|
| HX_GATE_STRIPE_SECRET_KEY | Stripe secret key (sk_live_... or sk_test_...) |
| HX_GATE_STRIPE_PUBLISHABLE_KEY | Stripe publishable key (pk_live_...) |
| HX_GATE_STRIPE_WEBHOOK_SECRET | Webhook signing secret (whsec_...) |
| HX_GATE_STRIPE_PRICE_BUILDER | Stripe Price ID for Builder tier |
| HX_GATE_STRIPE_PRICE_PRO | Stripe Price ID for Pro tier |
| HX_GATE_BILLING_ENABLED | Set to true to enforce billing |
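
A deployment script can check that these variables are present before starting the service. The following is a minimal sketch of such a pre-flight check; `missing_billing_settings` is a hypothetical helper and says nothing about how HX-SDP itself validates its configuration.

```python
import os

# Variable names copied from the table above.
REQUIRED_VARS = [
    "HX_GATE_STRIPE_SECRET_KEY",
    "HX_GATE_STRIPE_PUBLISHABLE_KEY",
    "HX_GATE_STRIPE_WEBHOOK_SECRET",
    "HX_GATE_STRIPE_PRICE_BUILDER",
    "HX_GATE_STRIPE_PRICE_PRO",
    "HX_GATE_BILLING_ENABLED",
]


def missing_billing_settings(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_billing_settings()
if missing:
    print("Missing billing settings:", ", ".join(missing))
```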

Billing period

  • CU counters reset on the 1st of each calendar month (UTC).
  • Overage charges are calculated at the end of the billing period.
  • Unused CUs do not roll over.
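
For dashboards or alerts it can help to know when the counters next reset. The sketch below computes that moment, assuming the reset happens at 00:00 UTC on the 1st (the page states the date and time zone but not the time of day). `next_reset` and `period_key` are hypothetical helpers; `period_key` follows the `"2026-01"` format seen in the usage response.

```python
from datetime import datetime, timezone


def next_reset(now):
    """Return the start of the next calendar month in UTC.

    `now` should be a timezone-aware datetime; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


def period_key(now):
    """Return the billing period label, e.g. '2026-01', for a moment in time."""
    return now.astimezone(timezone.utc).strftime("%Y-%m")


moment = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
print(period_key(moment), next_reset(moment).isoformat())
```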