GPU Placement Advisor for PyTorch/NCCL Workloads
Dystrio analyzes your PyTorch distributed training communication patterns and generates Kubernetes pod affinity rules to co-locate GPUs that talk the most.
Add this to your training script:
from torch.profiler import profile, ProfilerActivity

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    record_shapes=True,
    with_stack=True
) as prof:
    # Your training step here
    model(inputs)

prof.export_chrome_trace("trace.json")
Upload the resulting trace.json file.
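If you want a quick sanity check before uploading, the exported file is a standard Chrome trace: a JSON object whose "traceEvents" list contains one entry per profiled op or kernel, with a "name" and a duration "dur" in microseconds. A minimal sketch of inspecting it yourself (the `summarize_comm_events` helper is illustrative, not part of Dystrio):

```python
import json
from collections import defaultdict

def summarize_comm_events(trace_events):
    """Total duration (us) per kernel name for NCCL-related events.

    trace_events: the "traceEvents" list from a Chrome trace exported
    by torch.profiler.
    """
    totals = defaultdict(float)
    for ev in trace_events:
        name = ev.get("name", "")
        if "nccl" in name.lower():
            totals[name] += ev.get("dur", 0)
    return dict(totals)

# Tiny synthetic event list for illustration; with a real file you would
# use json.load(open("trace.json"))["traceEvents"] instead.
events = [
    {"name": "ncclKernel_AllReduce_RING_LL_Sum_float", "dur": 1200},
    {"name": "ncclKernel_AllReduce_RING_LL_Sum_float", "dur": 800},
    {"name": "aten::mm", "dur": 5000},
]
print(summarize_comm_events(events))
# {'ncclKernel_AllReduce_RING_LL_Sum_float': 2000.0}
```

If the summary comes back empty, the trace likely captured no collective communication (e.g. a single-GPU run), and there is nothing for placement analysis to work with.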
Single run: Leave Session ID empty. You'll get recommendations based on one trace.
Multi-run (recommended): Use the same Session ID across multiple uploads. Dystrio tracks which communication patterns are stable vs noisy, giving you higher-confidence recommendations.
Example: Upload 3 traces from different training runs with Session ID "llama-70b-training" → Dystrio identifies consistent patterns and escalates confidence from LOW → HIGH.
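The intuition behind the confidence escalation can be sketched with a simple stability test: a GPU pair whose communication volume varies little across runs is a reliable co-location candidate, while a pair that only talks heavily in one run may be noise. This is an illustrative heuristic (coefficient of variation with assumed thresholds), not Dystrio's actual scoring logic:

```python
from statistics import mean, pstdev

def confidence_for_pair(volumes):
    """Classify a GPU pair's link from per-run communication volumes (bytes).

    Illustrative only: thresholds and the coefficient-of-variation metric
    are assumptions, not Dystrio's implementation.
    """
    if len(volumes) < 2:
        return "LOW"  # a single run cannot demonstrate stability
    cv = pstdev(volumes) / mean(volumes)  # coefficient of variation
    if cv < 0.1:
        return "HIGH"  # volumes consistent across runs
    return "MEDIUM" if cv < 0.5 else "LOW"

print(confidence_for_pair([10e9, 10.2e9, 9.9e9]))  # consistent across runs -> HIGH
print(confidence_for_pair([1e9]))                  # single trace -> LOW
```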
Copy the generated affinity: block into your Pod spec.
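The output uses standard Kubernetes inter-pod affinity. An illustrative shape of such a block (the label key, label value, and topology key here are placeholders, not Dystrio's exact output):

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            training-group: heavy-comm-0   # placeholder label for a high-traffic GPU group
        topologyKey: kubernetes.io/hostname  # co-locate matching pods on the same node
```

The topologyKey controls the co-location granularity: kubernetes.io/hostname keeps matching pods on one node, while a zone- or rack-level key relaxes that to the same zone or rack.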