Member of Technical Staff - ML Infrastructure Engineer

🇩🇪 Freiburg, Baden-Württemberg
$180K - $300K Annual
Posted 20 months ago
Expires July 11, 2026
Full Time, Hybrid, Engineering, Operations

Black Forest Labs is seeking a Member of Technical Staff - ML Infrastructure Engineer to join our team. We are the creators of Latent Diffusion, Stable Diffusion, and FLUX—technologies that have transformed image and video creation. Our mission is to develop generative models that empower millions of creators, developers, and businesses worldwide. With headquarters in Freiburg, Germany, and a growing presence in San Francisco, we are expanding rapidly while maintaining our commitment to research excellence, open science, and enhancing human creativity.

In this role, you will design, deploy, and maintain the machine learning infrastructure that underpins our cutting-edge AI research. Your work will directly influence the success of extensive training runs, the efficiency of inference processes, and the agility of our research iterations. Responsibilities include managing cloud-based ML training clusters using Slurm, overseeing inference clusters with Kubernetes, implementing network-based cloud file systems optimized for large-scale ML workloads, and developing Infrastructure as Code (IaC) for resource provisioning. Additionally, you will optimize CI/CD pipelines for ML workflows, design custom autoscaling solutions, enforce security best practices, and provide tools that streamline ML operations.

The ideal candidate has substantial experience in building and managing ML infrastructure at scale, with a deep understanding of the unique challenges in supporting AI research. Proficiency in cloud platforms such as AWS, Azure, or GCP, with a focus on ML/AI services, is essential. Extensive experience with Kubernetes and Slurm cluster management in production environments is required, along with expertise in IaC tools like Terraform or Ansible. A proven track record in managing and optimizing network-based cloud file systems and object storage for ML workloads is crucial. Familiarity with CI/CD tools, security principles in cloud environments, monitoring and observability tools, ML workflows, and GPU infrastructure management is also expected. The ability to handle complex migrations and implement breaking changes in production environments without compromising data integrity is vital.

We offer a competitive base annual salary ranging from $180,000 to $300,000 USD. Our hybrid work model includes at least two days per week on-site in Freiburg or San Francisco, or one full week every other week, with monthly in-person weeks to foster team cohesion. We cover reasonable travel costs to facilitate this arrangement. Our company culture emphasizes research excellence, open science, and building technology that expands human creativity. We value low ego, boldness, and kindness, fostering an environment where the best ideas prevail, ambitious bets are taken, and genuine warmth and empathy are prioritized.

If you are passionate about building and maintaining the infrastructure that drives frontier AI research and want to be part of a team that values innovation and collaboration, we encourage you to apply.
