ABOUT THE ROLE

We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity's products and APIs, serving millions of users with state-of-the-art AI capabilities.

You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes. You will help architect and scale the deployment of machine learning models behind Perplexity's Comet, Sonar, Search, and Deep Research products.

WHY PERPLEXITY?

- Build SOTA systems that are the fastest in the industry with cutting-edge technology

- High-impact work on a smaller team with significant ownership and autonomy

- Opportunity to build 0-to-1 infrastructure rather than maintaining legacy systems

- Work on the full spectrum: reducing cost, scaling traffic, and pushing the boundaries of inference

- Direct influence on technical roadmap and team culture at a rapidly growing company

RESPONSIBILITIES

- Lead and grow a high-performing team of AI inference engineers

- Develop APIs for AI inference used by both internal and external customers

- Architect and scale our inference infrastructure for reliability and efficiency

- Benchmark and eliminate bottlenecks throughout our inference stack

- Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models

- Push the frontier by building inference systems that support sparse attention, disaggregated prefill/decode serving, etc.

- Improve the reliability and observability of our systems and lead incident response

- Own technical decisions around batching, throughput, latency, and GPU utilization

- Partner with ML research teams on model optimization and deployment

- Recruit, mentor, and develop engineering talent

- Establish team p...

Engineering Manager - Inference
