AnyBald: Toward Realistic Diffusion-Based Hair Removal In-The-Wild
Abstract
We present AnyBald, a novel framework for realistic hair removal from portrait images captured under diverse in-the-wild conditions. A key challenge in this task is the lack of high-quality paired data: existing datasets are often low quality, with limited viewpoint variation and overall diversity, making it difficult to handle real-world cases. To address this, we construct a scalable data augmentation pipeline that synthesizes high-quality hair and non-hair image pairs covering diverse real-world scenarios, providing scalable supervision and enabling effective generalization. With this enriched dataset, we present a new hair removal framework that reformulates pretrained latent diffusion inpainting with learnable text prompts, removing the need for explicit masks at inference. In doing so, our model achieves natural hair removal with semantic preservation via implicit localization. To further improve spatial precision, we introduce a regularization loss that guides the model's attention specifically toward hair regions. Extensive experiments demonstrate that AnyBald outperforms existing methods in removing hairstyles while preserving identity and background semantics across diverse in-the-wild domains.