Federated Learning
Overview
Train machine learning models across distributed Earth and orbital infrastructure. Each node trains locally on its data, then synchronizes compressed gradients during ground station passes.
Key Components
| Component | Description |
|---|---|
| FederatedClient | Local training client for Earth or orbital nodes |
| GradientAggregator | Central coordinator for gradient synchronization |
| CompressionConfig | Gradient compression settings (TopK + quantization) |
| Error Feedback | Lossless compression via error accumulation |
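A compact sketch of how these components fit together over one training round. The class names come from the table above, but the module path, constructor arguments, and method names are assumptions for illustration, not a confirmed API.

```python
# Illustrative flow only: module path, arguments, and method names are assumed.
from federated import FederatedClient, GradientAggregator, CompressionConfig

compression = CompressionConfig(top_k_ratio=0.01, quantization_bits=8, error_feedback=True)
client = FederatedClient(node_id="orbital-07", compression=compression)
aggregator = GradientAggregator(strategy="async_fedavg")

grads = client.train_local_epoch()              # backprop on the node's own data; data never leaves the node
update = client.compress(grads)                 # TopK + 8-bit quantization, dropped mass kept as local residual
aggregator.submit_update("orbital-07", update)  # transmitted during the next ground station pass
client.load_weights(aggregator.aggregate())     # pull the new global model once aggregation completes
```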
Gradient Compression
Bandwidth between orbital and ground nodes is extremely limited. Raw gradient synchronization is infeasible for large models. Our compression pipeline reduces update size by roughly 400x (100x from TopK sparsification, a further 4x from quantization) with minimal accuracy loss:
Compression Pipeline
1. Original Gradients (4.2 MB): Raw gradient tensor from backpropagation, e.g., ∇ = [0.12, -0.08, 0.003, ...]
2. TopK Sparsification (42 KB): Keep only the top 1% of gradients by magnitude. Reduces size by 100x while preserving the most important updates.
3. 8-bit Stochastic Quantization (10.5 KB): Convert Float32 to Int8 with a scale factor. Further 4x reduction with minimal precision loss.
4. Error Feedback: Accumulate dropped gradients for the next round. Guarantees eventual convergence despite aggressive compression.
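A self-contained NumPy sketch of steps 1-4. The function names and the exact rounding scheme are illustrative; they show the idea, not the library's internal implementation.

```python
import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Step 2: keep only the largest-magnitude entries; everything else becomes residual."""
    k = max(1, int(ratio * grad.size))
    idx = np.argpartition(np.abs(grad), -k)[-k:]   # indices of the top-k entries
    values = grad[idx]
    residual = grad.copy()
    residual[idx] = 0.0                            # dropped mass, held back for error feedback
    return idx, values, residual

def quantize_int8(values):
    """Step 3: stochastic 8-bit quantization with a single scale factor."""
    scale = max(np.abs(values).max(), 1e-12) / 127.0
    scaled = values / scale
    low = np.floor(scaled)
    round_up = np.random.rand(values.size) < (scaled - low)   # round up with the fractional probability
    return (low + round_up).astype(np.int8), scale

# One round over a ~4 MB Float32 gradient (step 1), with error feedback (step 4).
# A fuller version would also fold the quantization error into the residual.
grad = np.random.randn(1_000_000).astype(np.float32)
residual = np.zeros_like(grad)

idx, values, residual = topk_sparsify(grad + residual)   # fold previously dropped gradients back in
q, scale = quantize_int8(values)
print(f"raw: {grad.nbytes / 1e6:.1f} MB -> int8 values: {q.nbytes / 1e3:.1f} KB (indices add overhead)")
```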
Configuration
Federated Client
The FederatedClient runs on each participating node (Earth or orbital).
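A configuration sketch, assuming a Python client API. The argument names and defaults below are assumptions chosen to match the concepts above (compression settings, local epochs), not documented values.

```python
# Hypothetical API: argument names and defaults are illustrative assumptions.
from federated import FederatedClient, CompressionConfig

client = FederatedClient(
    node_id="orbital-07",                 # unique identifier for this node
    model=model,                          # local model instance, created elsewhere on the node
    data_loader=local_loader,             # the node's private data; it never leaves the node
    compression=CompressionConfig(
        top_k_ratio=0.01,                 # keep the top 1% of gradients by magnitude
        quantization_bits=8,              # Float32 -> Int8 with a per-tensor scale factor
        error_feedback=True,              # accumulate dropped gradients for the next round
    ),
    local_epochs=1,                       # local passes over the data between synchronizations
)
client.run()                              # train locally, upload compressed updates at each pass
```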
Gradient Aggregator
The GradientAggregator runs on a ground station or in the cloud, coordinating updates from all nodes.
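A matching sketch for the coordinator side. Again, the argument names are assumptions; the strategy strings follow the table below, and staleness_limit mirrors the convergence table at the end of this page.

```python
# Hypothetical API: argument names are illustrative assumptions.
from federated import GradientAggregator

aggregator = GradientAggregator(
    strategy="weighted_fedavg",           # one of the strategies listed in the table below
    min_updates=3,                        # start aggregating once 3 node updates have arrived
    staleness_limit=10,                   # ignore updates more than 10 rounds out of date
    checkpoint_dir="/var/fl/checkpoints", # where aggregated global models are saved
)
aggregator.serve()                        # accept compressed updates as ground station passes occur
```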
Aggregation Strategies
| Strategy | Description | Best For |
|---|---|---|
| sync_fedavg | Wait for all nodes before aggregating | Reliable connectivity |
| async_fedavg | Aggregate as updates arrive | Intermittent connectivity |
| weighted_fedavg | Weight by dataset size | Heterogeneous data |
| momentum_fedavg | Add momentum to updates | Faster convergence |
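As a concrete example of the weighting logic, weighted_fedavg scales each node's update by its share of the total training samples. A minimal NumPy sketch (illustrative helper, not library code):

```python
import numpy as np

def weighted_fedavg(updates, sample_counts):
    """Average node updates, weighting each node by its local dataset size."""
    w = np.asarray(sample_counts, dtype=np.float64)
    w /= w.sum()
    return (w[:, None] * np.stack(updates)).sum(axis=0)

# Three nodes holding 10%, 30%, and 60% of the total data
updates = [np.full(4, 1.0), np.full(4, 2.0), np.full(4, 4.0)]
print(weighted_fedavg(updates, sample_counts=[1000, 3000, 6000]))  # [3.1 3.1 3.1]
```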
Handling Connectivity
Orbital nodes experience intermittent connectivity. The client handles this automatically.
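A conceptual, self-contained sketch of the behavior: keep training through link outages, queue compressed updates in a bounded buffer, and flush them (tagged with their staleness) on the next contact. The names here are illustrative, not the client's actual internals.

```python
import random
from collections import deque

pending = deque(maxlen=20)                         # bounded buffer of (round_produced, update) pairs

def in_contact():
    """Stand-in for real pass prediction: roughly 1 in 5 rounds has a usable downlink."""
    return random.random() < 0.2

for round_id in range(50):
    update = f"compressed-update-{round_id}"       # produced by local training + compression
    pending.append((round_id, update))             # training never blocks on the link
    if in_contact():
        while pending:                             # flush everything queued during the outage
            produced, upd = pending.popleft()
            staleness = round_id - produced        # aggregator can down-weight or drop stale updates
            print(f"round {round_id}: sending {upd} (staleness={staleness})")
```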
Convergence Guarantees
Despite compression and asynchronous updates, training converges to accuracy comparable to centralized training:

| Property | Guarantee |
|---|---|
| Compression loss | Under 0.5% final accuracy vs uncompressed |
| Staleness impact | Under 1% accuracy loss with staleness_limit=10 |
| Error feedback | Mathematically lossless over time |
| Convergence rate | 1.2-1.5x more rounds than centralized |
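The "lossless over time" row can be checked directly: with error feedback, the sum of compressed updates sent so far differs from the sum of true gradients only by the residual still held on the node, so the per-round error averages out as training proceeds. A small NumPy check (illustrative, not library code):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rounds, k = 1000, 200, 10                      # transmit only the top 1% of entries each round

residual = np.zeros(dim)
sent_total = np.zeros(dim)
true_total = np.zeros(dim)

for _ in range(rounds):
    grad = rng.normal(size=dim)
    true_total += grad
    corrected = grad + residual                     # fold previously dropped mass back in
    idx = np.argpartition(np.abs(corrected), -k)[-k:]
    sent = np.zeros(dim)
    sent[idx] = corrected[idx]                      # the sparse update actually transmitted
    residual = corrected - sent                     # dropped entries wait for a later round
    sent_total += sent

# Everything not yet transmitted is exactly the current residual, which stays bounded,
# so the average error per round shrinks toward zero.
gap = np.linalg.norm(true_total - sent_total)
print(f"||sum(true) - sum(sent)|| = {gap:.2f}, ||residual|| = {np.linalg.norm(residual):.2f}")
```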

