Diverse Sketch Colorization with Content-Enhanced Style Representation and Recolorization Distillation
Abstract
Sketch colorization is in high demand in the art field, as it offers artists, designers, and illustrators a valuable tool for exploring novel possibilities and expressing their creativity. A given sketch can be colorized in various styles, for example rendering a facial sketch with different hair colors. To achieve this, previous works often resorted to sampling style codes from a simple Gaussian distribution to produce varied images. However, due to the inherent scarcity of information in sketches, the resulting colorized images often exhibit noticeable artifacts and lack diversity. In this paper, we aim to generate informative style representations that compensate for the missing content information in sketches while enabling diverse generation. We improve style granularity by extracting style information from CLIP, and we disentangle content and style by establishing semantic correspondence between sketches and color images in the CLIP space. Furthermore, to alleviate the artifacts and color-blending issues caused by semantic deficiency, we simultaneously train a recolorization model in an end-to-end manner. The recolorization model shares the same style space as the colorization model, which allows us to construct multiple pseudo sketch-image pairs for each sketch; these pairs provide pixel-level supervision for the colorization model and thus significantly facilitate the learning of semantic correspondence. Experiments demonstrate that our method effectively mitigates artifacts in the colorized results and produces semantically richer colors.
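The joint training scheme summarized above can be illustrated with a minimal sketch, assuming hypothetical toy modules (StyleEncoder, Generator) and an L1 pixel loss; it is not the authors' implementation, but it shows how a recolorization model sharing the colorization model's style space can mint pseudo sketch-image pairs that give the colorization model pixel-level supervision.

```python
# Illustrative sketch (hypothetical modules and sizes, not the paper's code):
# a colorization model and a recolorization model share one style space, and
# recolorized images serve as pseudo ground truth for the colorization model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleEncoder(nn.Module):
    """Maps a color image to a style code; shared by both models."""
    def __init__(self, style_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, style_dim)

    def forward(self, img):
        return self.fc(self.net(img).flatten(1))

class Generator(nn.Module):
    """Toy image-to-image generator conditioned on a style code."""
    def __init__(self, in_ch, style_dim=64):
        super().__init__()
        self.proj = nn.Linear(style_dim, 16)
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + 16, 32, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, 1, 1), nn.Tanh(),
        )

    def forward(self, x, style):
        # Broadcast the style code spatially and concatenate as extra channels.
        s = self.proj(style)[:, :, None, None].expand(-1, -1, *x.shape[2:])
        return self.net(torch.cat([x, s], dim=1))

style_enc = StyleEncoder()              # shared style space
colorizer = Generator(in_ch=1)          # sketch -> color image
recolorizer = Generator(in_ch=3)        # color image -> color image

params = (list(style_enc.parameters()) + list(colorizer.parameters())
          + list(recolorizer.parameters()))
opt = torch.optim.Adam(params, lr=2e-4)

sketch = torch.rand(4, 1, 64, 64)       # stand-in training batch
color_gt = torch.rand(4, 3, 64, 64)

# Sample a style code from a reference image in the shared style space.
style = style_enc(color_gt.roll(1, dims=0))

# Recolorize the ground-truth image to build a pseudo pair for the sketch:
# (sketch, pseudo_gt) then provides pixel-level supervision under `style`.
pseudo_gt = recolorizer(color_gt, style).detach()
pred = colorizer(sketch, style)

# The recolorization branch is also trained to reconstruct the image from
# its own style code, keeping the shared style space meaningful.
recon = recolorizer(color_gt, style_enc(color_gt))

loss = F.l1_loss(pred, pseudo_gt) + F.l1_loss(recon, color_gt)
opt.zero_grad()
loss.backward()
opt.step()
```

Because the pseudo ground truth is produced by a model that lives in the same style space, each sampled style code yields a new pixel-aligned target for the same sketch, which is what allows dense supervision without collecting multiple colorized references per sketch.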