As an AI QA Trainer specializing in Large Language Model (LLM) evaluation, you will play a pivotal role in enhancing the reasoning and reliability of advanced AI systems. This freelance position offers the opportunity to work remotely with a dynamic team dedicated to advancing AI capabilities across various applications.

Your primary responsibilities will include designing and executing test plans, developing clear evaluation rubrics, and conducting comprehensive assessments of LLMs. Typical tasks include detecting hallucinations, checking factual consistency, testing prompt robustness, and evaluating bias and fairness. You will also document failure modes and suggest improvements to prompt engineering and system guardrails.

The ideal candidate will hold a bachelor's, master's, or doctoral degree in computer science, data science, computational linguistics, statistics, or a related field. Experience in quality assurance for machine learning or AI systems, familiarity with test automation frameworks like pytest, and hands-on experience with LLM evaluation tools are highly desirable. Strong skills in evaluation rubric design, adversarial testing, regression testing, and prompt engineering are essential, as is clear and effective communication.

Compensation for this role ranges from $6 to $65 per hour, depending on experience, expertise, and geographic location. As a contractor, you will need to provide a secure computer and high-speed internet connection. Please note that company-sponsored benefits such as health insurance and paid time off do not apply to this position.

This role offers the chance to contribute significantly to the development of cutting-edge AI technologies. By joining our team, you will help shape the future of AI applications in a collaborative environment that values innovation and quality.

AI QA Trainer - LLM Evaluation - Freelance Project