Principal ML Engineer, ML Platform Engineering
Xometry is seeking a Principal Machine Learning Engineer to join our core machine learning platform engineering team. In this role, you will partner closely with the AI/MLE leadership team to deliver the vision and technical implementation for the foundational infrastructure leveraged by Xometry’s AI/ML solutions, including the Instant Quoting Engine® and other AI/ML products powering the Xometry marketplace.
This is a high-visibility, hands-on role delivering a core part of the Xometry ecosystem. You will have the opportunity to continually challenge yourself, drive innovation, own your work, and play a crucial role in the Xometry platform.
Responsibilities:
- Hands-On Technical Leadership: Adopt a 'lead by example' approach by actively coding and troubleshooting, as well as creating documentation and technical diagrams.
- Teaching & Mentorship: Serve as a mentor and guide to engineers across the organization, helping them grow their skills.
- Code Review: Review code and coach others across the organization on ML engineering best practices.
- Operational Excellence: Guarantee delivery of superior infrastructure and software that not only meets but exceeds customer expectations while aligning with strategic business timelines.
- Collaborative Strategy: Forge strong partnerships with product managers, data scientists, and company leadership to promote a culture of open communication and integrated team dynamics.
- Guide Innovation: Champion the adoption of cutting-edge technologies, methodologies, and practices to enhance problem-solving efficiency and effectiveness across the AI/ML organization.
Qualifications:
- At least 7 years of experience in machine learning engineering, software engineering, data science, or a similar technical role.
- A bachelor’s degree is required, but an advanced degree (M.S. or PhD) in computer science, machine learning, AI, or a related field is preferred and may substitute for some years of experience.
- Demonstrated experience designing and deploying cloud infrastructure (AWS preferred) that supports machine learning workloads and models, with consideration for scale, reliability, and security.
- Deep understanding of the machine learning lifecycle and related infrastructure needs: feature stores, A/B testing, model registration, drift detection, automated retraining, etc.
- Strong technical expertise. You will need to have, or demonstrate the ability to quickly build, technical expertise in the following:
  - Software engineering principles, including parallel and distributed computing, version control, reproducibility, and continuous integration
  - Machine learning techniques and algorithms, with emphasis on their impact on infrastructure implementation
    - Including large-scale language and vision models (Transformers, GPT, VLMs, LLMs) and deep learning frameworks (PyTorch, TensorFlow)
  - Infrastructure as Code (IaC), especially Terraform
  - REST API design and implementation
  - Object-oriented and functional programming in Python
  - Multimodal data processing (e.g., combining text, image, and 3D data)
  - AWS services, including SageMaker, Service Catalog, IAM, Lambda, CloudWatch, ECR, EKS, and Kinesis
  - Containerization technologies (Docker and Kubernetes)
- Demonstrated ability to interact and communicate effectively at all levels of the organization, from executives to product managers and a wide variety of stakeholders and contributors.
- Experience in the manufacturing, supply chain, or similar industries is a plus.
- Must be a US Citizen or Green Card holder (ITAR)
The estimated base salary range for new hires into this role is $140,000 to $182,000 annually, plus an annual bonus, depending on factors such as job-related skills, relevant experience, and location. We also offer a competitive benefits package, including 401(k) match; medical, dental, and vision insurance; life and disability insurance; generous paid time off, including vacation, sick leave, floating and fixed holidays, and maternity and bonding leave; EAP and other wellbeing resources; and much more.