Staff Software Engineer, ML Performance & Systems

🇺🇸 San Francisco, CA
$180K - $250K Annual
Posted 10 months ago
Expires July 12, 2026
Full Time, On-site, Engineering, Data Science

As a Staff Software Engineer specializing in Machine Learning (ML) Performance and Systems at fal, you will play a pivotal role in advancing the performance of generative media models. Fal is a leading generative media ecosystem that provides infrastructure, tools, and model access to facilitate the development of AI products. This position is based in downtown San Francisco, where you will collaborate closely with the Applied ML team and clients to enhance model serving architectures.

Your primary responsibilities will include designing and implementing innovative model serving architectures atop fal's proprietary inference engine, with a focus on maximizing throughput while minimizing latency and resource consumption. You will develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities. Additionally, you will work closely with the Applied ML team and customers to ensure their workloads benefit from fal's accelerator.

The ideal candidate will have a strong foundation in systems programming, with expertise in identifying and resolving performance bottlenecks. A deep understanding of current ML infrastructure stacks, including tools like PyTorch, TensorRT, TransformerEngine, and Nsight, is essential. Proficiency in Triton or comparable experience in lower-level accelerator programming is required. Familiarity with NVIDIA-based systems and the ability to go deeper into the stack to address bottlenecks, such as developing custom GEMM kernels with CUTLASS, is highly desirable. Experience with multi-dimensional model parallelism and knowledge of the internals of Ring Attention, FlashAttention-3 (FA3), and FusedMLP implementations are also beneficial.

Fal offers a competitive salary ranging from $180,000 to $250,000, along with equity and a comprehensive benefits package. Benefits include health, dental, and vision insurance, as well as regular team events and offsite activities. The company provides relocation assistance to San Francisco and offers visa sponsorship for eligible candidates.

Joining fal means becoming part of a dynamic team at the forefront of AI and generative media. The company fosters a collaborative and innovative culture, providing ample opportunities for learning and professional growth. If you are passionate about pushing the boundaries of ML performance and systems, this role offers a unique opportunity to make a significant impact in the field.
