Evolving to the Aesthetics of a Vision-Language Model
Title: Adapting to the Visual Qualities of Vision-Language Models
Abstract: Evolutionary algorithms have been applied successfully in creative fields such as music composition, graphic design, and generative typography, but a persistent challenge is crafting fitness functions that accurately reflect the aesthetic qualities of abstract results. This study investigates two approaches for assessing the aesthetic value of a population of designs using Vision-Language Models (VLMs). The first uses CLIP-IQA to assign an aesthetic score to each individual design. The second runs a tournament in which candidates are compared in pairs and a VLM, guided by a user-defined custom prompt, picks the winner of each pair; the pairwise outcomes are then aggregated into a ranking of the whole population using the Glicko rating system. We illustrate both methods in a case study with a bespoke generative framework, comparing the resulting rankings against those derived from an artist's subjective evaluation and against other established aesthetic metrics. We also critically examine the artist's practical experience with these evolutionary techniques, highlighting the advantages and limitations of each assessment method.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC
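
Illustrative sketch (not from the paper): the abstract describes aggregating pairwise VLM verdicts into a population ranking with the Glicko rating system. The Python below shows one way this could look, assuming the original Glicko-1 update applied once over a single round-robin rating period. The abstract does not say which Glicko variant, pairing scheme, or starting values the authors use, so those choices are assumptions. The names `rank_population` and `judge` are hypothetical; `judge` stands in for the VLM call and is expected to return whichever of its two arguments wins.

```python
import math
from itertools import combinations

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Attenuation factor: discounts results against opponents with uncertain ratings."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_j, rd_j):
    """Expected score of a player rated r against an opponent (r_j, rd_j)."""
    return 1.0 / (1.0 + 10 ** (-g(rd_j) * (r - r_j) / 400.0))


def glicko_update(r, rd, results):
    """One Glicko-1 rating-period update.

    results: list of (opponent_rating, opponent_rd, score), score 1.0 = win, 0.0 = loss.
    Returns the new (rating, rd).
    """
    if not results:
        return r, rd
    inv_d2 = 0.0  # accumulates 1 / d^2
    delta = 0.0   # accumulates g(RD_j) * (s_j - E_j)
    for r_j, rd_j, s in results:
        e = expected(r, r_j, rd_j)
        inv_d2 += Q**2 * g(rd_j) ** 2 * e * (1.0 - e)
        delta += g(rd_j) * (s - e)
    denom = 1.0 / rd**2 + inv_d2
    return r + (Q / denom) * delta, math.sqrt(1.0 / denom)


def rank_population(candidates, judge, r0=1500.0, rd0=350.0):
    """Round-robin tournament: judge(a, b) returns the winner (assumed VLM stand-in).

    Candidates must be hashable (e.g. file names). Starting values r0 and rd0
    are the conventional Glicko defaults, assumed here for illustration.
    """
    ratings = {c: (r0, rd0) for c in candidates}
    results = {c: [] for c in candidates}
    for a, b in combinations(candidates, 2):
        winner = judge(a, b)
        (ra, rda), (rb, rdb) = ratings[a], ratings[b]
        results[a].append((rb, rdb, 1.0 if winner == a else 0.0))
        results[b].append((ra, rda, 1.0 if winner == b else 0.0))
    # All updates use pre-period ratings, as Glicko-1 prescribes.
    updated = {c: glicko_update(*ratings[c], results[c]) for c in candidates}
    ranking = sorted(candidates, key=lambda c: updated[c][0], reverse=True)
    return ranking, updated
```

In use, `judge` would wrap a prompt to the VLM showing both rendered designs; for a dry run it can be replaced by any deterministic comparison, such as `lambda a, b: a if score[a] >= score[b] else b` over a dictionary of placeholder scores.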





