A workload is a unit of compute you run on the satellite fleet. There is one submission surface, POST /v1/workloads, and the workload’s class decides how it executes:
parallelism.class | Runs as
single (default)  | one satellite (the CAE places it)
data-parallel     | sharded across satellites, recombined on the ground
spatial           | each satellite processes its footprint of an area-of-interest
federated         | a global model trained from on-board data (only deltas leave orbit)
model-parallel    | a pipeline split across satellites; activations flow stage → stage
split-learning    | a head on the data-owning satellite, the heavy tail offloaded
Base URL: https://api.rotastellar.com/v1 · Auth: Authorization: Bearer rs_...
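As a minimal sketch of how the base URL and bearer token fit together, the helper below builds an authenticated request object without sending it. It uses only the Python standard library; the function name build_request is hypothetical, and reading the key from the RS_API_KEY environment variable is an assumption borrowed from the curl example in this page.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.rotastellar.com/v1"


def build_request(method, path, body=None, api_key=None):
    """Build (but do not send) an authenticated request.

    `build_request` is an illustrative helper, not part of any SDK.
    """
    key = api_key or os.environ.get("RS_API_KEY")
    if not key:
        raise ValueError("no API key: pass api_key or set RS_API_KEY")
    headers = {"Authorization": f"Bearer {key}"}
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 bytes with a matching content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request(
    "POST", "/workloads/preview",
    {"parallelism": {"class": "single"}},
    api_key="rs_example",  # placeholder key for illustration only
)
print(req.get_method(), req.full_url)
# POST https://api.rotastellar.com/v1/workloads/preview
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs offline.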

Lifecycle

preview (optional) → submit → poll GET /v1/workloads/{id} → inspect stages / artifacts; cancel (any time).

Preview — the plan

POST /v1/workloads/preview returns the CAE’s verdict — feasibility, the model cut/partition, whether it’s ISL-bound — and a cost estimate, without running anything (read-only; no quota, no billing). It’s the “plan” to submit’s “apply”.
curl -X POST https://api.rotastellar.com/v1/workloads/preview \
  -H "Authorization: Bearer $RS_API_KEY" -H "content-type: application/json" -d '{
    "placement": { "satellites": [25544, 48274, 53807] },
    "parallelism": { "class": "model-parallel", "stages": 3 },
    "model": { "layers": [{"name":"embed","compute_fwd":2,"activation_mb":4}] }
  }'
{
  "kind": "model_parallel",
  "candidates": 3,
  "plan": { "feasible": true, "stages": 3, "partition": [/* ... */], "isl_bound": false },
  "cost_estimate": { "ocu": 450, "cents": 5400, "per_stage_ocu": 150, "stages": 3, "price_per_ocu_cents": 12 }
}
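The numbers in the sample cost_estimate above are internally consistent: 150 OCU per stage × 3 stages = 450 OCU, and 450 OCU × 12 cents = 5400 cents. The sketch below checks that relation on a preview response. The two formulas are inferred from this one sample, not from a documented contract, and check_cost_estimate is a hypothetical helper name.

```python
def check_cost_estimate(est):
    """Return True if the totals agree with the per-stage figures.

    Assumed relations (inferred from the sample response only):
      ocu   == per_stage_ocu * stages
      cents == ocu * price_per_ocu_cents
    """
    ocu_ok = est["ocu"] == est["per_stage_ocu"] * est["stages"]
    cents_ok = est["cents"] == est["ocu"] * est["price_per_ocu_cents"]
    return ocu_ok and cents_ok


sample = {
    "ocu": 450,
    "cents": 5400,
    "per_stage_ocu": 150,
    "stages": 3,
    "price_per_ocu_cents": 12,
}
print(check_cost_estimate(sample))        # True
print(f"${sample['cents'] / 100:.2f}")    # $54.00
```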
distributed-training is ISL-bound (its gradient AllReduce runs over sparse inter-satellite links), so it is preview-only: preview returns the verdict, but the workload is not executable.
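A client can avoid submitting a preview-only class by checking the class name first. The sketch below is one way to do that; the class names come from this page, while the helper names and the choice to treat a missing parallelism block as single are assumptions.

```python
# Classes this page lists as executable.
EXECUTABLE_CLASSES = {
    "single", "data-parallel", "spatial",
    "federated", "model-parallel", "split-learning",
}
# Classes this page describes as preview-only.
PREVIEW_ONLY_CLASSES = {"distributed-training"}


def workload_class(body):
    """Return the body's parallelism.class.

    Assumption: a body with no parallelism block falls back to "single",
    the documented default class.
    """
    return body.get("parallelism", {}).get("class", "single")


def can_submit(body):
    """True if the class is executable, False if it is preview-only."""
    cls = workload_class(body)
    if cls in PREVIEW_ONLY_CLASSES:
        return False
    if cls not in EXECUTABLE_CLASSES:
        raise ValueError(f"unknown parallelism.class: {cls!r}")
    return True


print(can_submit({"parallelism": {"class": "model-parallel", "stages": 3}}))  # True
print(can_submit({"parallelism": {"class": "distributed-training"}}))         # False
```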

Submit — run

POST /v1/workloads takes the same body and executes it. It returns the workload id and a status_url.
{ "id": "wl_9a0b…", "status": "scheduling", "status_url": "/v1/workloads/wl_9a0b…" }
A frontier workload (model-parallel / split-learning) runs as a coordinated pipeline across the constellation: the CAE places each stage on a satellite, and the stages run in sequence.
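Because preview and submit take the same body, a plan-then-apply flow is easy to wire up. The sketch below submits only when the preview's plan is feasible. The preview and submit arguments are caller-supplied callables (for example, thin wrappers around the two HTTP endpoints), and preview_then_submit is a hypothetical helper name; the plan.feasible field is taken from the sample preview response.

```python
def preview_then_submit(body, preview, submit):
    """Submit `body` only if the preview reports a feasible plan.

    `preview` and `submit` are injected callables that take the request
    body and return the parsed JSON response as a dict.
    Returns the submit response, or None if the plan is not feasible.
    """
    verdict = preview(body)
    plan = verdict.get("plan", {})
    if not plan.get("feasible", False):
        return None
    return submit(body)


# Offline demonstration with stand-in callables (no network involved).
def fake_preview(body):
    return {"plan": {"feasible": True, "isl_bound": False}}


def fake_submit(body):
    return {"id": "wl_example", "status": "scheduling"}


result = preview_then_submit({"parallelism": {"class": "single"}},
                             fake_preview, fake_submit)
print(result["status"])  # scheduling
```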

Track status

Poll GET /v1/workloads/{id} and read phase: submitted → scheduling → dispatching → awaiting_report → done (or failed / timed_out / canceled). workflow_status is the coarse pill (RUNNING / COMPLETED / FAILED / …). The response also carries placement, the structured outcome, the result manifest, and cost once billable.
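A polling loop over the phases above might look like the following sketch. The terminal phases are the ones listed in this section; the function name, the polling interval, and the client-side timeout are assumptions, and fetch is a caller-supplied callable that performs GET /v1/workloads/{id} and returns the parsed JSON.

```python
import time

# Phases after which the workload will not change state again.
TERMINAL_PHASES = {"done", "failed", "timed_out", "canceled"}


def wait_for_workload(fetch, workload_id, interval_s=5.0, timeout_s=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the workload reaches a terminal phase, then return it.

    Raises TimeoutError if no terminal phase is seen within `timeout_s`
    (a client-side limit, separate from the server's timed_out phase).
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch(workload_id)
        if status.get("phase") in TERMINAL_PHASES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"{workload_id} still in phase {status.get('phase')!r}")
        sleep(interval_s)


# Offline demonstration: a stand-in fetch that walks through the phases.
_phases = iter(["submitted", "scheduling", "dispatching", "awaiting_report", "done"])


def fake_fetch(workload_id):
    return {"id": workload_id, "phase": next(_phases)}


final = wait_for_workload(fake_fetch, "wl_example", interval_s=0.0)
print(final["phase"])  # done
```

Injecting sleep and clock keeps the loop testable without real waiting.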

Stages — per-satellite execution

GET /v1/workloads/{id}/stages — for a frontier / scatter-gather workload, which satellite ran each stage and its status.
{ "id": "wl_9a0b…", "stages": [
  { "index": 0, "satellite": "48274", "satellite_name": "CSS (TIANHE)", "status": "done", "role": "stage" },
  { "index": 1, "satellite": "25544", "satellite_name": "ISS (ZARYA)",  "status": "done", "role": "stage" }
] }
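To turn a stages response like the one above into a quick placement summary, something along these lines could work. The field names match the sample response; summarize_stages is a hypothetical helper, and the id wl_example is a placeholder.

```python
def summarize_stages(resp):
    """Return (placement, all_done) for a stages response.

    placement is a list of (index, satellite, satellite_name) tuples in
    stage order; all_done is True only if there is at least one stage
    and every stage has status "done".
    """
    stages = sorted(resp.get("stages", []), key=lambda s: s["index"])
    placement = [
        (s["index"], s["satellite"], s.get("satellite_name", ""))
        for s in stages
    ]
    all_done = bool(stages) and all(s["status"] == "done" for s in stages)
    return placement, all_done


sample_stages = {
    "id": "wl_example",  # placeholder id
    "stages": [
        {"index": 1, "satellite": "25544", "satellite_name": "ISS (ZARYA)",
         "status": "done", "role": "stage"},
        {"index": 0, "satellite": "48274", "satellite_name": "CSS (TIANHE)",
         "status": "done", "role": "stage"},
    ],
}
placement, all_done = summarize_stages(sample_stages)
for index, sat, name in placement:
    print(index, sat, name)
print(all_done)  # True
```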

Results & cancel

  • GET /v1/workloads/{id}/artifacts — the result manifest (the outputs + how to fetch each: custodied contentId or external uri).
  • POST /v1/workloads/{id}/cancel — stop a queued or running workload; its queued / in-flight stages are dropped.
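The manifest says each output is fetched either by a custodied contentId or from an external uri. A minimal sketch of dispatching on those two cases follows. Only the contentId and uri field names come from this page; the entry shape, the name field, the sample values, and the fetch_instruction helper are all assumptions made for illustration.

```python
def fetch_instruction(entry):
    """Describe how to fetch one manifest entry.

    Returns ("custody", contentId) for custodied outputs and
    ("external", uri) for externally hosted ones.
    """
    if "contentId" in entry:
        return ("custody", entry["contentId"])
    if "uri" in entry:
        return ("external", entry["uri"])
    raise ValueError("entry has neither contentId nor uri")


# Hypothetical entries; the real manifest layout may differ.
entries = [
    {"name": "weights", "contentId": "cid_example"},
    {"name": "logs", "uri": "https://example.com/logs.txt"},
]
for entry in entries:
    print(entry["name"], fetch_instruction(entry))
```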

SDKs & CLI

Every operation is in the SDKs and the CLI:
rs.workloads.preview(placement={...}, parallelism={"class": "model-parallel", "stages": 3}, model={...})
wl = rs.workloads.submit(type="fine-tune", placement={...}, parallelism={...}, model={...})
rs.workloads.get(wl["id"]); rs.workloads.stages(wl["id"]); rs.workloads.cancel(wl["id"])
rotastellar jobs preview --file workload.json
rotastellar jobs submit  --file workload.json
rotastellar jobs get <id>;  rotastellar jobs stages <id>;  rotastellar jobs cancel <id>
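The CLI commands above read the workload from a file. Assuming that file holds the same JSON body the HTTP endpoints accept (this page does not state the file format), a workload.json can be generated like this. The body reuses the preview example from earlier on this page, and write_workload_file is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def write_workload_file(body, path):
    """Write a workload body as pretty-printed JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(body, indent=2) + "\n", encoding="utf-8")
    return path


workload = {
    "placement": {"satellites": [25544, 48274, 53807]},
    "parallelism": {"class": "model-parallel", "stages": 3},
    "model": {"layers": [{"name": "embed", "compute_fwd": 2, "activation_mb": 4}]},
}

# Written to a temporary directory here so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    out = write_workload_file(workload, Path(tmp) / "workload.json")
    print(out.name)  # workload.json
    print(json.loads(out.read_text(encoding="utf-8")) == workload)  # True
```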