Abstract: Pre-trained vision-language models (VLMs) and language models (LMs) have recently garnered significant attention due to their remarkable ability to represent textual concepts, opening up new ...
No-Reference Image Quality Assessment (NR-IQA) focuses on designing methods to measure image quality in alignment with human perception when a high-quality reference image is unavailable. Most ...
Abstract: Recently, the accuracy of image-text matching has been greatly improved by multimodal pretrained models, all of which use millions or billions of paired images and texts for supervised model ...