A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs

Published on arXiv (arXiv:2305.13525), 2024

Recommended citation: Siddharth Singh, Prajwal Singhania, Aditya Ranjan, Zack Sating, Abhinav Bhatele, "A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs." arXiv 2305.13525, 2024. https://arxiv.org/abs/2305.13525