Principal Engineer, Cluster Orchestration

🇺🇸 Sunnyvale, CA
$206K - $303K Annual
Posted 3 months ago
Expires June 9, 2026
Full Time, On-site, Engineering, Operations

CoreWeave is seeking a Principal Engineer to lead the design and evolution of its cluster orchestration systems, which support large-scale GPU clusters for AI training and inference. This role is integral to CoreWeave's mission of delivering high-performance cloud solutions tailored for AI workloads.

The Principal Engineer will be responsible for defining the long-term architecture of CoreWeave's orchestration platforms, including Kubernetes, Slurm, SUNK, and related systems. Key responsibilities include acting as a technical authority on scheduling, quota enforcement, and multi-tenant GPU isolation, as well as making design decisions that balance performance, reliability, cost, and operational complexity.

Candidates should have over 15 years of experience in building and operating large-scale distributed systems, with deep knowledge of Kubernetes and Slurm internals. Experience with GPU-heavy platforms for AI training or HPC workloads, proficiency in Go and cloud-native systems development, and a proven ability to set technical direction across teams are essential. A Bachelor's or Master's degree in a relevant field, or equivalent experience, is required.

The base salary for this position ranges from $206,000 to $303,000, determined based on job-related knowledge, skills, experience, and market location. The total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program.

CoreWeave fosters a culture focused on innovative disruption, offering a casual work environment and opportunities for professional growth. Employees are encouraged to take ownership, collaborate across teams, and contribute to the company's mission of accelerating AI breakthroughs through superior infrastructure performance.
