Doctolib is seeking a Senior/Staff Machine Learning Engineer to join our AI Teams, focusing on Health Evaluation. In this role, you will be instrumental in designing, implementing, and scaling evaluation frameworks to ensure our AI Health Companion operates safely, reliably, and effectively for millions of patients and healthcare practitioners. You will collaborate with a cross-functional team of Machine Learning Engineers, Product Engineers, and Medical Experts to develop robust evaluation pipelines for agentic AI systems capable of reasoning, planning, and interacting with complex healthcare data.

Your primary responsibilities will include defining and owning the evaluation strategy for our agentic AI system, encompassing metrics, protocols, datasets, and tooling. You will implement and maintain automated evaluation pipelines to monitor model quality, safety, and alignment across iterations. You will conduct systematic experiments to assess reasoning, factuality, robustness, and user experience, and collaborate closely with model developers and research scientists to provide insights and drive iterative improvements. Additionally, you will contribute to research and internal knowledge sharing on large language model (LLM) evaluation methodologies and best practices.

The ideal candidate will possess an MSc or PhD in Computer Science, Machine Learning, Data Science, or a related field, along with over seven years of hands-on experience working with large language models such as GPT, Claude, Llama, or BERT-like architectures. Proven experience in evaluating agentic or reasoning systems, including autonomous agents, tool-using LLMs, dialogue systems, or task-oriented assistants, is essential. A strong track record in experiment design, metric definition, and evaluation automation is required, as well as the ability to bridge research and production, influencing modeling and product decisions. Excellent communication skills and a collaborative mindset are also crucial.

We offer a comprehensive benefits package, including free health insurance for you and your children, and a Parent Care Program that provides an additional month of leave on top of the legal parental leave. Free mental health and coaching services are available through our partner Moka.care. For caregivers and workers with disabilities, we offer a package that includes adaptation of the remote policy, extra days off for medical reasons, and psychological support. Our flexibility days policy allows you to work from EU countries and the UK for up to 10 days per year. Additionally, we provide a Work Council subsidy that refunds part of the cost of sports club memberships or creative classes, up to 14 days of RTT, and lunch vouchers with a Swile card.

At Doctolib, we are committed to improving access to healthcare for everyone. We evaluate candidates based solely on qualifications and motivation, without any form of discrimination. We believe that the more diverse ideas are heard, the more our product will truly improve healthcare for all. You are welcome to apply to Doctolib, regardless of your gender, religion, age, sexual orientation, ethnicity, or disability. To ensure equal opportunities, we invite you to exclude personal information such as pictures or age from your application. If you require any accommodation during the hiring process, please let us know.

Senior/Staff Machine Learning Engineer - Health Evaluation - AI Teams (x/f/m)
