How Well Does BiasPainter Uncover Hidden Biases in Image Generation?

DATE POSTED: August 6, 2024

:::info Authors:

(1) Wenxuan Wang, The Chinese University of Hong Kong, Hong Kong, China;

(2) Haonan Bai, The Chinese University of Hong Kong, Hong Kong, China;

(3) Jen-tse Huang, The Chinese University of Hong Kong, Hong Kong, China;

(4) Yuxuan Wan, The Chinese University of Hong Kong, Hong Kong, China;

(5) Youliang Yuan, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China;

(6) Haoyi Qiu, University of California, Los Angeles, Los Angeles, USA;

(7) Nanyun Peng, University of California, Los Angeles, Los Angeles, USA;

(8) Michael Lyu, The Chinese University of Hong Kong, Hong Kong, China.

:::

Table of Links

Abstract

1 Introduction

2 Background

3 Approach and Implementation

3.1 Seed Image Collection and 3.2 Neutral Prompt List Collection

3.3 Image Generation and 3.4 Properties Assessment

3.5 Bias Evaluation

4 Evaluation

4.1 Experimental Setup

4.2 RQ1: Effectiveness of BiasPainter

4.3 RQ2: Validity of Identified Biases

4.4 RQ3: Bias Mitigation

5 Threats to Validity

6 Related Work

7 Conclusion, Data Availability, and References

4.2 RQ1: Effectiveness of BiasPainter

In this RQ, we investigate whether BiasPainter can effectively trigger and measure social bias in image generation models.

Image Bias. We input the (seed image, prompt) pairs and let the image generation software products and models edit each seed image under different prompts. We then use the resulting (seed image, generated image) pairs to evaluate bias in the generated images. In particular, we adopt BiasPainter to calculate image bias scores, and we find a large number of generated images that are highly biased. We show some examples in Figure 1.
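The per-image scoring described above can be sketched as follows. This is a minimal illustration, not the paper's exact formula: it assumes a property assessor has already mapped each image's gender (or age, or race) onto a signed numeric code, and treats the image bias score as the shift in that code between the seed image and the generated image.

```python
# Hypothetical sketch of an image bias score: the signed change in an
# assessed attribute between a seed image and its edited counterpart.
# The numeric coding (-1 = male, 1 = female for the gender dimension)
# is an assumption for illustration.

def image_bias_score(seed_attr: int, generated_attr: int) -> int:
    """Return the attribute shift introduced by the edit.

    A non-zero score means the prompt changed a property (gender, age,
    or race) that a neutral prompt should have left untouched.
    """
    return generated_attr - seed_attr

# A male seed image (-1) edited into a female-presenting image (1)
# yields a positive shift toward female.
print(image_bias_score(-1, 1))  # 2
print(image_bias_score(0, 0))   # 0 -> the edit preserved the property
```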

Word Bias. We adopt BiasPainter to calculate the word bias score for each prompt based on the image bias scores. For each model and each topic, Table 3 lists the top three prompt words that are most biased with respect to gender, age, and race, respectively. BiasPainter can thus provide insight into which biases a model has, and to what extent. For example, regarding the gender bias of personality words, words like brave, loyal, patient, friendly, and sympathetic tend to convert male to female, while words like arrogant, selfish, clumsy, grumpy, and rude tend to convert female to male. For professions, words like secretary, nurse, cleaner, and receptionist tend to convert male to female, while entrepreneur, CEO, lawyer, and president tend to convert female to male. For activities, words like cooking, knitting, washing, and sewing tend to convert male to female, while words like fighting, thinking, and drinking tend to convert female to male.
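Aggregating per-image scores into a per-word score could look like the sketch below. The averaging scheme and function name are assumptions for illustration; the key idea from the text is that one prompt word's score summarizes the shifts it causes across many seed images.

```python
# Hypothetical sketch: a word bias score aggregates the image bias
# scores of every (seed image, prompt) pair generated for one prompt
# word, here by simple averaging.

def word_bias_score(image_scores: list) -> float:
    """Average the per-image shifts for one prompt word.

    A score near 0 means the prompt rarely flips the property; a large
    magnitude means a consistent directional shift (e.g. "secretary"
    consistently converting male seeds to female).
    """
    if not image_scores:
        return 0.0
    return sum(image_scores) / len(image_scores)

# A prompt that flips three of four male seeds to female:
print(word_bias_score([1, 1, 0, 1]))  # 0.75
```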

In addition, BiasPainter can visualize the distribution of word bias scores across all prompt words. For example, we use BiasPainter to visualize the distribution of word bias scores for profession words in Stable Diffusion 1.5. As shown in Figure 5, the model is more biased toward younger rather than older, and toward lighter skin tones rather than darker skin tones.

Figure 5: Visualization of Profession Word Bias Scores in Stable Diffusion 1.5
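A distribution summary like the one Figure 5 plots can be approximated by bucketing word bias scores by direction. The bucket boundaries and labels below are illustrative assumptions, not the paper's plotting code.

```python
# Hypothetical sketch: grouping word bias scores into coarse buckets
# to summarize how many prompt words lean each way.
from collections import Counter

def bucket_scores(scores) -> Counter:
    """Label each score as leaning negative, neutral, or positive."""
    def label(s):
        if s < -0.25:
            return "negative"   # e.g. shift toward older / darker skin tone
        if s > 0.25:
            return "positive"   # e.g. shift toward younger / lighter skin tone
        return "neutral"
    return Counter(label(s) for s in scores)

# A distribution skewed positive, as the text reports for age and skin tone:
print(bucket_scores([-0.8, 0.1, 0.6, 0.9, -0.1]))
```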

Model Bias. BiasPainter can also calculate model bias scores to evaluate the fairness of each image generation model. Table 4 shows the results, where we can see that different models are biased to different degrees and on different topics. For example, Stable Diffusion 2.1 is the most biased model on age, while Pix2pix shows less bias on age and gender.
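One plausible way to roll word-level scores up into a single model-level number is to take the mean magnitude of the word bias scores, so that opposite-direction biases do not cancel out. This aggregation is an assumption for illustration, not the paper's exact definition.

```python
# Hypothetical sketch: a model bias score as the mean absolute word
# bias score; larger values indicate a more biased model overall.

def model_bias_score(word_scores: list) -> float:
    """Mean magnitude of word bias scores across all prompt words."""
    if not word_scores:
        return 0.0
    return sum(abs(s) for s in word_scores) / len(word_scores)

# Two mildly biased words and one strongly biased word. Note that the
# +0.5 and -0.5 words do not cancel, because magnitudes are averaged.
print(model_bias_score([0.5, -0.5, 1.0]))  # ~0.667
```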


:::info This paper is available on arxiv under CC0 1.0 DEED license.

:::
