AI Red Teaming Lead | 3-5 Years of Experience | Remote (US-based) | High-Ownership, High-Impact

About 10a Labs: 10a Labs is an applied research and AI security company trusted by AI unicorns, Fortune 10 companies, and U.S. tech leaders. We combine proprietary technology, deep expertise, and multilingual threat intelligence to detect high-risk content at scale. We also deliver state-of-the-art red teaming across high-impact security and safety challenges.

About the Role: We’re looking for an applied data scientist with strong engineering instincts to lead the technical development and red teaming strategy for a mission-critical classification system. This role requires deep analytical thinking and technical execution. You’ll lead the design of evaluation frameworks, own quality metrics, and run adversarial testing initiatives to strengthen system performance. You’ll also coordinate with ML engineers and infrastructure teams to ensure end-to-end product readiness and robustness.

In This Role, You Will:

  • Design and oversee the technical implementation of a robust red teaming project.
  • Develop evaluation frameworks, performance metrics, and model validation strategies aligned with safety goals.
  • Lead adversarial testing efforts (e.g., red teaming, evasion probes, jailbreak simulation).
  • Work with researchers and domain experts to define labeling schemas and edge-case tests.
  • Partner with ML and infrastructure engineers to ensure production readiness, observability, and performance targets.
  • Communicate technical strategy and tradeoffs clearly across internal and client teams.

We’re Looking for Someone Who:

  • Has 3-5 years of experience in applied data science, ML product work, or security-focused AI, including technical leadership or staff-level ownership.
  • Has designed and evaluated real-world ML systems with a focus on model behavior, error analysis, and continuous improvement.
  • Can design red teaming workflows to surface model blind spots and failure modes.
  • Operates effectively across ML, infrastructure, and policy/strategy contexts.
  • Thinks like a builder, analyst, and communicator all in one.

Requirements:

  • Background in data science, applied ML, or ML engineering, with proven experience in production-grade systems.
  • Strong analytical toolkit (Python, SQL, Jupyter, scikit-learn, Pandas, etc.) and familiarity with modern ML tooling (e.g., PyTorch, Hugging Face, LangChain).
  • Experience working with LLMs or embedding-based classification systems.
  • Excellent communication skills and ability to guide teams across strategy and technical domains.

Nice to Have Experience With:

  • Safety evaluation, red teaming, or adversarial content testing in LLMs.
  • Trust & safety or risk-focused classification systems.
  • Annotation ops, feedback loops, or evaluation pipeline design.
  • Open-source model evaluation tools (e.g., Promptfoo, DeepEval).

Salary: $120,000 - $180,000 per year

Location: United States (Remote)

Job Type: Full Time
