Research Engineer - Environments, Data and Post-Training
ABOUT MERCOR
Mercor is defining the future of work. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development.
Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day.
Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced, and deeply committed team. You’ll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society.
Mercor is a profitable Series C company valued at $10 billion. We work in-person five days a week in our San Francisco, NYC, or London offices.
ABOUT THE ROLE
As a Research Engineer at Mercor, you’ll work at the intersection of engineering and applied AI research. You’ll contribute directly to post-training and reinforcement learning with verifiable rewards (RLVR), synthetic data generation, and large-scale evaluation workflows that meaningfully impact frontier language models.
Your work will be used to train large language models to master tool use, agentic behavior, and real-world reasoning in production environments. You’ll shape rewards, run post-training experiments, and build scalable systems that improve model performance. You’ll help design and evaluate datasets, create data augmentation pipelines, and build rubrics and evaluators that push the boundaries of what LLMs can learn.
WHAT YOU’LL DO
- Work on post-training and RLVR pipelines to understand how datasets, rewards, and training strategies impact model performance.
- Design and run reward-shaping experiments and algorithmic improvements (e.g., GRPO, DAPO) to improve LLM tool-use, agentic behavior, and real-world reasoning.
- Quantify data usability, quality, and performance uplift on key benchmarks.
- Build a...