From Cognitive Priors to Instance Semantics: A Unified Framework for Multi-task Affective Computing
Abstract
Understanding human affect via Valence-Arousal (VA) estimation, Expression recognition, and Action Unit (AU) detection is essential for human-machine interaction. While recent multi-task learning (MTL) methods aim to unify these tasks, they often overlook three key challenges: (i) the assumption of complete annotations for all tasks, which leads to underutilization of single-task datasets with disjoint labels; (ii) task conflicts arising from noisy gradients, Negative Transfer (NT), and Task-specific Performance Misalignment (TPM); and (iii) the absence of unified modeling across all three affective task types: regression, detection, and classification. We introduce COIN, a novel two-stage MTL framework that bridges Cognitive Priors and Instance Semantics for robust MTL training. First, we propose a cognitively guided cross-task label induction strategy that propagates supervision under sparse annotations and mitigates NT, yielding strong task-specific experts. Second, we propose two complementary branches to tackle TPM: (i) a branch that transfers knowledge from task-optimal experts and jointly optimizes task-specific objectives under partial supervision; and (ii) a branch that enforces visual-language consistency using Class-Conditioned Prompts and Instance-Adaptive Prompts. Experiments on six diverse datasets demonstrate robust cross-task and generalization performance. Code and models will be released upon acceptance.
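The "partial supervision" idea above can be illustrated with a minimal sketch. This is our own toy example, not the paper's implementation: a masked multi-task loss that evaluates only the tasks a sample is annotated for, so single-task datasets with disjoint labels (VA-only, expression-only, or AU-only) can all contribute to one training objective. All function and key names here are hypothetical.

```python
import math

def mse(pred, target):
    # Mean squared error for VA regression (valence, arousal).
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(logits, label):
    # Softmax cross-entropy for expression classification.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def bce(probs, targets):
    # Binary cross-entropy for multi-label AU detection.
    eps = 1e-7
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(probs, targets)) / len(probs)

def partial_multitask_loss(outputs, labels):
    """Average per-task losses, skipping tasks with no annotation."""
    total, n_tasks = 0.0, 0
    if labels.get("va") is not None:
        total += mse(outputs["va"], labels["va"]); n_tasks += 1
    if labels.get("expr") is not None:
        total += cross_entropy(outputs["expr_logits"], labels["expr"]); n_tasks += 1
    if labels.get("au") is not None:
        total += bce(outputs["au_probs"], labels["au"]); n_tasks += 1
    return total / max(n_tasks, 1)

# A sample from a VA-only dataset: only the regression term is active,
# even though the model also produces expression and AU predictions.
outputs = {"va": [0.2, -0.1],
           "expr_logits": [1.0, 0.5, 0.1],
           "au_probs": [0.8, 0.3]}
loss = partial_multitask_loss(outputs, {"va": [0.25, 0.0],
                                        "expr": None, "au": None})
```

In practice such masking is usually done per batch element on the GPU (e.g. with label masks and a reduction over valid entries), but the control flow is the same: missing annotations simply contribute no gradient for their task.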