Reciprocal Teaching: Dynamic Multi-Model Teacher-Student Learning for Multiple Noisy Annotations
Abstract
As datasets grow in size, expert-based annotation becomes increasingly impractical, making crowdsourcing a scalable and cost-effective alternative. In crowdsourcing, samples are typically annotated by multiple workers and aggregated via majority voting, a process that overlooks annotator-specific biases and introduces noisy labels that can impair downstream models. Traditional multi-rater methods attempt to model annotator biases (e.g., with transition matrices) but often overfit when faced with many classes or few, noisy annotators. By contrast, Learning with Noisy Labels (LNL) assumes a single noisy label per sample and has demonstrated that robust strategies (e.g., semi-supervised and multi-model learning) usually outperform bias-estimation methods; however, these strategies remain underexplored in multi-annotator settings. To bridge this gap, we propose Reciprocal Teacher-student Learning from Multi-rater Noisy Annotation (RETINA), a framework that integrates LNL techniques into multi-rater learning. RETINA trains annotator-specific models to capture individual labeling patterns and employs a dynamic teacher–student process in which the teacher identifies clean and noisy samples to guide the student. Experiments on synthetic and real-world benchmarks, including our proposed SynMRL benchmark, show that RETINA outperforms existing multi-rater methods, particularly in high-noise, low-annotator, many-class settings.