support relations for AI-generated statements. They will contribute definitions, examples, and decision rules that make the taxonomy operational for both human annotators and LLM-as-judge evaluators. The intern will design a benchmark: selecting suitable source corpora (including recent groundedness datasets), constructing statement–source pairs, and writing clear annotation …, potentially crowdsourcing. Where applicable, they will help prepare bespoke annotation tooling. The intern will evaluate frontier models' ability to classify grounding categories and compare LLM-as-judge performance to human raters. They will co-author an academic paper describing the taxonomy, dataset, and findings.

Qualifications

Required/Minimum Qualifications
...