Remote | Generalist – English & Brazilian Portuguese — $19.95/hr

San Francisco, California, United States
Full-Time
Remote

Job Description:

We are sharing a specialised remote opportunity for bilingual professionals fluent in English and Brazilian Portuguese to support a leading AI research initiative focused on improving general-purpose conversational AI systems.

This project focuses on evaluating and improving large language model (LLM) responses across a wide range of topics. The role involves reviewing AI-generated responses, fact-checking information, and providing structured feedback to improve accuracy, reasoning quality, and conversational clarity.

Key Responsibilities

Evaluate AI-generated responses based on their ability to effectively answer user queries
Conduct fact-checking using trusted public sources and external tools
Provide structured annotations identifying strengths, weaknesses, and factual inaccuracies
Assess reasoning quality, clarity, tone, and completeness of responses
Ensure outputs align with expected conversational standards and system guidelines
Apply consistent evaluation methods using established taxonomies, benchmarks, and evaluation frameworks

Ideal Profile

Strong candidates may have:

Native or near-native fluency in Brazilian Portuguese and professional fluency in English
Bachelor's degree or equivalent educational background
Strong writing skills and the ability to provide clear, structured feedback
Experience using large language models and an understanding of common user interaction patterns
Strong attention to detail and the ability to identify subtle reasoning or factual issues
Analytical thinking skills and comfort working across diverse topics and domains

Educational Background:

Bachelor's degree in a field requiring structured analytical thinking, such as research, policy, analytics, linguistics, engineering, or a related area

Nice to Have

Prior experience with RLHF, AI model evaluation, or data annotation
Experience writing or editing high-quality content
Experience comparing multiple outputs and making detailed qualitative judgments
Familiarity with evaluation rubrics, benchmarks, or structured quality scoring systems

Why This Opportunity

Contribute directly to improving conversational AI systems used by millions of users
Help ensure AI responses are accurate, clear, and aligned with real human expectations
Participate in human-in-the-loop AI development at the frontier of language model evaluation
Flexible remote work with the opportunity to contribute to high-impact AI research initiatives

Contract Details

Independent contractor role
Fully remote with flexible scheduling
Full-time or part-time engagement depending on availability
Rate: $19.95 per hour
Weekly payments via Stripe or Wise
Projects may extend or adjust depending on scope and performance
No access required to confidential or proprietary information from employers or institutions

About the Platform

This opportunity is available through a leading AI-driven work platform.