Infrastructure and MLOps Engineer
Graphcore is seeking an Infrastructure and MLOps Engineer to join its Software Infrastructure team. The role focuses on developing and maintaining tools and services that support AI research and engineering teams, improving how machine learning software components are built, tested, deployed, and productized. The position offers the opportunity to work with High-Performance Computing (HPC) AI platforms and gain experience in distributed systems.
Key responsibilities include developing, owning, and maintaining tools and services for AI research and engineering teams; deploying and maintaining services with Kubernetes and Docker; and managing cloud infrastructure with tools such as Terraform.
The ideal candidate has knowledge of Python and familiarity with cloud services such as AWS. Experience managing or developing in Linux environments, an understanding of CI/CD principles, and proficiency with Kubernetes are essential. Experience in at least one of the following areas is also required: maintaining machine learning applications, deploying ML orchestration tools (e.g., NV Ray, KFP, SkyPilot), or managing ML accelerator hardware (e.g., with DCGM).
Graphcore offers a competitive salary along with flexible working arrangements. Benefits include a generous annual leave policy, private medical insurance, a health cash plan, a dental plan, pension contributions matched up to 5%, life assurance, and income protection. The company also provides a generous parental leave policy and an employee assistance program covering health, mental wellbeing, and bereavement support. The central Bristol office offers a range of healthy food and snacks and features a barista bar.
Graphcore is committed to building an inclusive work environment that welcomes people from diverse backgrounds and experiences. The company fosters a culture of continuous learning and constant innovation, providing opportunities for employees to make a significant impact on the company's products and the future of artificial intelligence.