Oral Session 3A: Low-level and Physics-based Vision
BrightRate: Quality Assessment for User-Generated HDR Videos
Shreshth Saini ⋅ Bowen Chen ⋅ Yilin Wang ⋅ Neil Birkbeck ⋅ Balu Adsumilli ⋅ Alan Bovik
High Dynamic Range (HDR) videos offer superior luminance and color fidelity compared to Standard Dynamic Range (SDR) content. The rapid growth of User-Generated Content (UGC) on platforms such as YouTube, Instagram, and TikTok has brought a significant increase in the volume of streamed and shared UGC videos. This newer category of videos brings new challenges to the development of effective No-Reference (NR) video quality assessment (VQA) models specialized to HDR UGC, because of the extreme variety and severity of distortions arising from diverse capture, editing, and processing outcomes. Towards addressing this issue, we introduce BrightVQ, a sizeable new psychometric data resource. It is the first large-scale subjective video quality database dedicated to the quality modelling of HDR UGC videos. BrightVQ comprises 2,100 videos, on which we collected 73,794 perceptual quality ratings. Using this dataset, we also developed BrightRate, a novel video quality prediction model designed to capture both UGC-specific distortions and the HDR-specific artifacts that coexist with them. Extensive experimental results demonstrate that BrightRate achieves state-of-the-art performance across HDR databases. Project page: https://brightvqa.github.io/BrightVQ/
Reviving Unsupervised Optical Flow: Concept Reevaluation, Multi-Scale Advances and Full Open-Source Release
Azin Jahedi ⋅ Marc Rivinius ⋅ Noah Senn ⋅ Andres Bruhn
Unsupervised optical flow methods have become more popular in the last decade, enabling the training of models across domains without ground truth data. Although RAFT and its successors have achieved significant success in supervised settings, many unsupervised approaches continue to use older backbones such as PWC-Net. One reason for this architectural stagnation is that the current RAFT-based SOTA approach has proven challenging for the community to reproduce. In this paper, we revive and advance unsupervised optical flow. First, we introduce Sun-RAFT: a simple unsupervised RAFT. Second, building on Sun-RAFT, we present Muun-RAFT: a novel multi-scale unsupervised RAFT, where we propose a gradual context-based upsampling to refine the flow, further improving both accuracy and preservation of details. Third, we reexamine previously recommended unsupervised strategies to identify effective training settings. In terms of results, both our methods demonstrate strong generalization capabilities and set a new SOTA for unsupervised two-frame approaches on MPI-Sintel, with Muun-RAFT surpassing even the current multi-frame SOTA by up to 28%. Finally, we open-source our PyTorch code, enabling further developments in the field: https://cv-stuttgart.github.io/Reviving-Unsupervised-OpticalFlow.
UniCoRN: Latent Diffusion-based Unified Controllable Image Restoration Network across Multiple Degradations
Debabrata Mandal ⋅ Soumitri Chattopadhyay ⋅ Guansen Tong ⋅ Praneeth Chakravarthula
Image restoration is essential for enhancing degraded images across computer vision tasks. However, most existing methods address only a single type of degradation (e.g., blur, noise, or haze) at a time, limiting their real-world applicability where multiple degradations often occur simultaneously. In this paper, we propose UniCoRN, a unified image restoration approach capable of handling multiple degradation types simultaneously using a multi-head diffusion model. Specifically, we uncover the potential of low-level visual cues extracted from images in guiding a controllable diffusion model for real-world image restoration, and design a multi-head control network adaptable via a mixture-of-experts strategy. We train our model without any prior assumption of specific degradations, through a carefully designed curriculum learning recipe. Additionally, we introduce MetaRestore, a metalens imaging benchmark containing images with multiple degradations and artifacts. Extensive evaluations on several challenging datasets, including our benchmark, demonstrate that our method achieves significant performance gains and can robustly restore images with severe degradations. Our code and datasets will be open-sourced upon acceptance.
DRWKV: Focusing on Object Edges for Low-Light Image Enhancement
Xuecheng Bai ⋅ Yuxiang Wang ⋅ Boyu Hu ⋅ Qinyuan Jie ⋅ Chuanzhi Xu ⋅ Kechen Li ⋅ Hongru Xiao ⋅ Yuk Chung
Low-light image enhancement remains a challenging task, particularly in preserving object edge continuity and fine structural details under extreme illumination degradation. In this paper, we propose a novel model, DRWKV (Detailed Receptance Weighted Key Value). First, it integrates our proposed Global Edge Retinex (GER) theory, enabling effective decoupling of illumination and edge structures for enhanced edge fidelity. Second, we introduce Evolving WKV Attention, a spiral-scanning mechanism that captures spatial edge continuity and models irregular structures more effectively. Third, we design the Bilateral Spectrum Aligner (Bi-SAB) and a tailored MS²-Loss to jointly align luminance and chrominance features, improving visual naturalness and mitigating artifacts. Extensive experiments on five LLIE benchmarks demonstrate that DRWKV achieves leading performance in PSNR, SSIM, and NIQE while maintaining low computational complexity. Furthermore, DRWKV enhances downstream performance in low-light multi-object tracking tasks, validating its generalization capabilities.