KZIAA - an image aesthetic assessment approach
We create a zero-shot approach for image aesthetic assessment.
This work was accepted at ICASSP 2024; the paper is available here: https://ieeexplore.ieee.org/document/10447301/
The code will be released in this GitHub repository. This post gives a brief walkthrough of our work.
DURATION
9 months
ROLE
Code Lead
PROJECT TYPE
Research
MODELS
CLIP, BERT
MEMBERS
Guolong Wang∗ (Advisor),
Yike Tan∗, Hangyu Lin, Chuchun Zhang (Students)
Problem Space
Image aesthetic assessment is an important problem in multimedia, but most existing studies use supervised learning methods that rely on large-scale annotated data, and aesthetic score annotations are difficult to obtain in large quantities.
Our Solution
To address this, our paper explores zero-shot image aesthetic assessment. We predict aesthetic scores by introducing knowledge of different aesthetic attributes (e.g., Focus).
First, we use prompt tuning to obtain a unique prompt for each aesthetic attribute as external knowledge.
Second, we leverage image relations considering sentiment polarity as internal knowledge.
Specifically, we obtain aesthetic attribute representations from pre-trained models via prompt learning, then select anchor images for specific attributes by sentiment polarity, and finally compute the aesthetic scores. Notably, no annotated aesthetic scores are used in the process.
Fig. 1. Overview of our research design. It contains three components: (1) continuous prompt, (2) relation with polarity, and (3) aesthetic score inference. The white rounded rectangle denotes data, and the white rectangle denotes a component. The blue and green rectangles denote the image and text, respectively. The dashed arrow means data input/output to the model, and the arrow means data flow in the model.
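To make the anchor idea above more concrete, here is a minimal Python sketch. It is not the implementation from the paper. It assumes image and attribute-prompt embeddings are already available as NumPy arrays (in practice they would come from pre-trained models such as CLIP, with the prompts obtained by prompt tuning). The function names, the cosine-similarity polarity measure, and the final score formula are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def polarity_scores(image_emb, pos_prompt, neg_prompt):
    """Hypothetical polarity for one attribute: similarity to a positive
    prompt minus similarity to a negative prompt, computed per image."""
    img = l2_normalize(image_emb)
    return img @ l2_normalize(pos_prompt) - img @ l2_normalize(neg_prompt)


def select_anchors(polarity, k):
    """Return indices of the k most positive and k most negative images."""
    order = np.argsort(polarity)  # ascending polarity
    return order[-k:], order[:k]


def aesthetic_score(query_emb, image_emb, pos_idx, neg_idx):
    """Assumed scoring rule (not from the paper): mean similarity to positive
    anchors minus mean similarity to negative anchors, rescaled to [0, 1]."""
    q = l2_normalize(query_emb)
    img = l2_normalize(image_emb)
    pos_sim = (img[pos_idx] @ q).mean()
    neg_sim = (img[neg_idx] @ q).mean()
    return float((pos_sim - neg_sim + 2.0) / 4.0)


# Toy data standing in for real embeddings: the first axis plays the role
# of one attribute (e.g., Focus), from clearly negative to clearly positive.
t = np.linspace(-1.0, 1.0, 10)
images = np.stack([t, np.ones_like(t), np.zeros_like(t), np.zeros_like(t)], axis=1)
pos_prompt = np.array([1.0, 0.0, 0.0, 0.0])
neg_prompt = np.array([-1.0, 0.0, 0.0, 0.0])

pol = polarity_scores(images, pos_prompt, neg_prompt)
pos_idx, neg_idx = select_anchors(pol, k=2)
scores = [aesthetic_score(img, images, pos_idx, neg_idx) for img in images]
print(scores)
```

No ground-truth aesthetic score appears anywhere in the sketch: the only supervision-like signal is the positive/negative prompt pair, which matches the zero-shot setting described above.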
Here are the presentation slides, which include more details: