AI Safety & Adversarial Testing Contract - 3 months

Company Description

T3 partners with organizations deploying production AI systems in high-risk environments where failures can have significant regulatory, operational, or safety implications. With a team instrumental in shaping global AI standards and governance frameworks, T3 provides AI assurance services to Big Tech companies and complex enterprises.

This is a three-month contract with the opportunity to extend.

Candidates must have direct experience working with frontier labs or large technology companies on safety evaluation, red teaming, or adversarial testing. Experience designing and operationalising testing frameworks for production-grade generative models is essential.

Role Description

Support the design and execution of a structured adversarial testing framework across LLM, image, and video generation models. The role is responsible for developing the standard operating procedure (SOP), adversarial methodology, and prompt-expansion strategy, and for delivering formal testing reports aligned to client policy.

The role requires deep safety domain expertise combined with hands-on testing capability, and will report to a strategic lead.

Core Responsibilities

1. Adversarial Testing Framework Design

  • Define what constitutes a truly adversarial prompt
  • Develop a taxonomy of attack types and failure modes
  • Translate real user behaviour into structured attack vectors
  • Define evaluation methodology across LLM, image, and video models
  • Create severity and risk classification frameworks

2. Test Set Development & Expansion

  • Augment existing prompt libraries with adversarial variants
  • Design systematic prompt mutation strategies
  • Develop model-specific adversarial patterns
  • Define coverage metrics for risk domains
  • Identify blind spots in current test libraries

3. Evaluation & Execution

  • Run adversarial tests against selected model versions
  • Classify and analyse failure types
  • Map outputs against internal policy requirements
  • Produce structured evaluation findings
  • Support reuse of the client's internal evaluation platform

4. Reporting & Stakeholder Communication

  • Produce formal testing reports
  • Present findings to technical and policy audiences
  • Clearly distinguish methodology from execution
  • Define remediation pathways and improvement loops

Required Skills & Experience

Technical

  • Strong background in AI safety, red teaming, or adversarial ML
  • Experience testing LLMs and generative models
  • Familiarity with prompt injection, jailbreaks, and boundary attacks
  • Understanding of multimodal models (text, image, video)
  • Experience defining structured evaluation frameworks
  • Knowledge of benchmark design and failure taxonomy creation

Domain Knowledge

  • Responsible AI principles
  • Risk classification frameworks
  • Safety policy interpretation
  • Alignment evaluation concepts
  • Model evaluation lifecycle

Analytical

  • Ability to design rigorous testing methodology
  • Strong failure analysis capability
  • Quantitative and qualitative evaluation skills
  • Ability to convert abstract risks into concrete test cases

Soft Skills

  • Comfortable operating in environments with ambiguous scope
  • Clear communicator across technical and policy stakeholders
  • Able to work without creating single-point dependency risk
  • Structured thinker

Ideal Background

  • AI safety researcher
  • Red teaming lead for generative AI systems
  • AI evaluation specialist
  • Experience in frontier or production generative systems
  • Experience with model benchmarking and structured evaluation

Job Details

Company: T3
Location: City of London, London, United Kingdom