Reliability Engineer, Supercomputing
🇺🇸 San Francisco, CA
$4K - $5K Annual
Posted 2 days ago
Expires August 31, 2026
Thinking Machines Lab is seeking a Reliability Engineer to ensure the dependability of its GPU supercomputing infrastructure. This role involves managing the interface between hardware, firmware, and operating systems to maintain optimal performance for large-scale AI research. The engineer will be responsible for diagnosing and resolving hardware-related issues, collaborating with vendors, and implementing solutions that support the lab's advanced AI experiments.
More Jobs at Thinking Machines Lab
Network Engineer, Supercomputing
Thinking Machines Lab
🇺🇸 San Francisco, CA
$4K - $5K Annual
Full Time, On-site, Engineering
2 days ago
Software Engineer, Research Acceleration
Thinking Machines Lab
🇺🇸 San Francisco, CA
$4K - $5K Annual
Full Time, On-site, Engineering, Product, +1
2 days ago
Strategic Finance Director
Thinking Machines Lab
🇺🇸 San Francisco, CA
$3K - $3K Annual
Full Time, On-site, Consulting, Operations, +1
3 days ago
Software Engineer, Platform, Tinker
Thinking Machines Lab
🇺🇸 San Francisco, CA
$4K - $5K Annual
Full Time, On-site, Engineering, Product
5 days ago