About the job
We’re building a TMX expert group to shape and improve cutting-edge AI models. If you’re a native-level English speaker with strong judgment, you’ll complete a short skills test and, if successful, join TMX’s paid expert network to work on high-impact AI evaluation tasks.
We are now looking for a Senior LLM Data Annotation Specialist (1+ year of experience preferred).
Type: Contract
Location: Remote (UK-based, long-term)
Commitment: 20+ hours/week
✅ Application Process
- Click Apply to complete the short information form
- Complete a brief qualification task if requested (sent via email after we receive your form)
- We will follow up within 1-2 weeks with the next steps.
----------------------------------------------------------
Role Responsibilities
- Interpret user prompts to identify underlying intent, goals, and expected model capabilities.
- Apply detailed annotation guidelines while exercising strong linguistic and contextual judgment in complex or ambiguous cases.
- Compare, evaluate, and annotate responses from multiple LLMs, assessing whether outputs meet user needs and identifying gaps or weaknesses, in accordance with established evaluation frameworks and quality standards.
- Classify model responses by intent and task type to support structured analysis and model performance insights.
- Perform detailed quality reviews of annotated outputs, ensuring scoring consistency, labeling accuracy, and alignment with evaluation guidelines.
- Participate in disagreement analysis (diff review) and feedback loops to improve evaluation consistency and overall data quality.
- Manage tasks independently with flexibility, meeting accuracy and delivery expectations across multiple assignments.
- Collaborate with the point of contact (POC) on task guidance and ramp-up for each assignment.
Qualifications Must-Have
- Native-level English proficiency.
- Strong reading comprehension and attention to detail.
- Ability to interpret complex guidelines and make well-reasoned quality judgments independently.
- Familiarity with LLMs, chatbots, or AI-generated content; experience with model evaluation or comparison is a plus.
- 1+ year of experience in data annotation, AI evaluation, content review, QA, or related roles is preferred.
Compensation & Legal
- Competitive hourly rates, adjusted for geography.
- Classified as an independent contractor.
PS: Our team reviews applications daily. Please complete your application steps to be considered for this opportunity.