Men Also Do Laundry: Multi-Attribute Bias Amplification
Interpretable metrics for measuring bias amplification from multiple attributes
ICML 2023

Dora Zhao, Sony AI
Jerone Andrews, Sony AI
Alice Xiang, Sony AI


Background and related work

As computer vision systems become more widely deployed, there is increasing concern from both the research community and the public that these systems are not only reproducing but amplifying harmful social biases. The phenomenon of bias amplification, which is the focus of our work, refers to models amplifying inherent training set biases at test time1.

There are two main approaches to quantifying bias amplification in computer vision models:

  • Leakage-based metrics2, 3, which measure the change in a classifier's ability to predict group membership, from the ground-truth training data to the model's predictions.
  • Co-occurrence-based metrics1, 4, which measure the change in the co-occurrence ratio between a group and a single attribute, from the ground-truth training data to the model's predictions.

Existing metrics1, 4 measure bias amplification with respect to single annotated attributes (e.g., `computer`). However, large-scale visual datasets often have multiple annotated attributes per image. For example, in the COCO dataset5, 78.8% of training set images are associated with more than one attribute (i.e., object).

More importantly, considering multiple attributes can reveal additional nuances not present when considering only single attributes. In the imSitu dataset6, individually the verb `unloading` and location `indoors` are skewed `male`. However, when considering {`unloading`, `indoors`} in conjunction, the dataset is actually skewed `female`. Significantly, men tend to be pictured unloading packages `outdoors` whereas women are pictured `unloading` laundry or dishes `indoors`. Even when men are pictured `indoors`, they are `unloading` boxes or equipment as opposed to laundry or dishes. Models can similarly leverage correlations between a group and either single or multiple attributes simultaneously.

Bias scores (i.e., gender ratios) of the verbs `pouring` and `unloading` as well as location `indoors` in imSitu. While imSitu is skewed `male` for the single attributes, the multi-attributes (e.g., {`pouring`, `indoors`}) are skewed `female`. Note that we have replaced imSitu images with Adobe Stock images for privacy reasons.

Key takeaways

We propose two multi-attribute bias amplification metrics that evaluate bias arising from both single and multiple attributes.

  • We are the first to study multi-attribute bias amplification, highlighting that models can leverage correlations between a demographic group and multiple attributes simultaneously, suggesting that current single-attribute metrics underreport the amount of bias being amplified.
  • When comparing the performance of multi-label classifiers trained on the COCO, imSitu, and CelebA7 datasets, we empirically demonstrate that, on average, bias amplification from multiple attributes is greater than that from single attributes.
  • While prior works have demonstrated that models can learn to exploit different spurious correlations if one is mitigated8, 9, we are the first to demonstrate that bias mitigation methods1, 2, 10, 11 for single attribute bias can inadvertently increase multi-attribute bias.

Multi-attribute bias amplification metrics

We denote by $\mathcal{G}=\{g_1,\dots,g_t\}$ and $\mathcal{A}=\{a_1,\dots,a_n\}$ a set of $t$ group membership labels and a set of $n$ attributes, respectively. Let $\mathcal{M}=\{m_1,\dots,m_{\ell}\}$ denote a set of $\ell$ sets containing all possible combinations of attributes, where each $m_i$ is a set of attributes with $\lvert m_i \rvert \in \{1,\dots,n\}$. Note that $m\in\mathcal{M}$ if and only if $\text{co-occur}(m, g) \geq 1$ in both the ground-truth training set and test set.
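As a concrete sketch, the set of attribute combinations can be enumerated from per-image annotations by counting how often every combination co-occurs with each group in both splits. All names here are illustrative, not from the paper's code, and the data layout (a list of (group, attribute-set) pairs) is an assumption:

```python
from collections import Counter
from itertools import combinations


def build_attribute_sets(train, test):
    """Enumerate attribute sets m that co-occur with some group label at
    least once in BOTH the training and test ground truths.

    `train` and `test` are lists of (group, attrs) pairs, where `attrs`
    is a frozenset of attribute labels for one image. In practice the
    number of attributes per image is small, so enumerating all subsets
    stays tractable.
    """
    def cooccurrences(data):
        counts = Counter()
        for group, attrs in data:
            # count every non-empty subset of this image's attributes
            for r in range(1, len(attrs) + 1):
                for combo in combinations(sorted(attrs), r):
                    counts[(frozenset(combo), group)] += 1
        return counts

    c_train = cooccurrences(train)
    c_test = cooccurrences(test)
    # keep m if some group co-occurs with it at least once in both splits
    return {m for (m, g) in c_train if c_test[(m, g)] >= 1}
```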

We extend Zhao et al.'s1 bias score to a multi-attribute setting such that the bias score of a set of attributes $m\in\mathcal{M}$ with respect to group $g\in\mathcal{G}$ is defined as
$$\text{bias}_{\text{train}}(m, g) = \frac{\text{co-occur}(m, g)}{\sum_{g^\prime \in \mathcal{G}} \text{co-occur}(m, g^\prime)},$$
where $\text{co-occur}(m, g)$ denotes the number of times $m$ and $g$ co-occur in the training set.
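A minimal sketch of this bias score, assuming co-occurrence counts are stored in a dict keyed by (attribute set, group); the names are illustrative:

```python
def bias_score(cooccur, m, g, groups):
    """bias(m, g): fraction of m's co-occurrences that are with group g.

    `cooccur` maps (frozenset_of_attributes, group) -> count. The
    denominator sums m's co-occurrences over all groups, mirroring the
    formula in the text.
    """
    denom = sum(cooccur.get((m, g_prime), 0) for g_prime in groups)
    return cooccur.get((m, g), 0) / denom
```

For example, if `{unloading}` co-occurs three times with `male` and once with `female`, the bias score with respect to `male` is 0.75.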

Undirected multi-attribute bias amplification metric

Our proposed undirected multi-attribute bias amplification metric extends Zhao et al.'s1 single-attribute metric to measure bias amplification from multiple attributes:
$$\text{Multi}_{\text{MALS}} = \big(X,\ \text{variance}(\Delta_{gm})\big),$$
where
$$X = \frac{1}{\lvert \mathcal{M} \rvert}\sum_{g\in \mathcal{G}} \sum_{m\in \mathcal{M}} \left\lvert\Delta_{gm}\right\rvert$$
and
$$\Delta_{gm} = \mathbf{1}\left[ \text{bias}_{\text{train}}(m, g) > \lvert \mathcal{G}\rvert^{-1} \right] \left( \text{bias}_{\text{test}}(m, g) - \text{bias}_{\text{train}}(m, g)\right).$$
Here $\mathbf{1}[\cdot]$ denotes an indicator function, and $\text{bias}_{\text{test}}(m, g)$ denotes the bias score computed from the model's test set predictions of $m$ and $g$.

$\text{Multi}_{\text{MALS}}$ measures both the mean and variance of the change in bias score from the training set ground truths to the test set predictions. By definition, $\text{Multi}_{\text{MALS}}$ only captures group membership labels that are positively correlated with a set of attributes, due to the constraint that $\text{bias}_{\text{train}}(m,g) > \lvert \mathcal{G}\rvert^{-1}$.
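The metric can be sketched as follows, assuming bias scores have already been computed into dicts keyed by (m, g); the function and variable names are illustrative:

```python
from statistics import pvariance


def multi_mals(bias_train, bias_test, groups, attr_sets):
    """Sketch of the undirected metric: (mean |delta|, variance of delta).

    `bias_train` / `bias_test` map (m, g) -> bias score; here they are
    plain dicts, though in practice they would be computed from
    co-occurrence counts.
    """
    deltas = []
    for g in groups:
        for m in attr_sets:
            d = 0.0
            # indicator: only pairs positively correlated in training count
            if bias_train[(m, g)] > 1.0 / len(groups):
                d = bias_test[(m, g)] - bias_train[(m, g)]
            deltas.append(d)
    mean = sum(abs(d) for d in deltas) / len(attr_sets)
    return mean, pvariance(deltas)
```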

Directional multi-attribute bias amplification metric

Let $\hat{m}$ and $\hat{g}$ denote a model's predictions for an attribute set $m$ and group membership $g$, respectively. Our proposed directional multi-attribute bias amplification metric extends Wang and Russakovsky's4 single-attribute metric to measure bias amplification from multiple attributes:
$$\text{Multi}_{\rightarrow} = \big(X,\ \text{variance}(\Delta_{gm})\big),$$
where
$$X = \frac{1}{\lvert \mathcal{G} \rvert \lvert \mathcal{M} \rvert} \sum_{g\in \mathcal{G}}\sum_{m\in \mathcal{M}} y_{gm}\left\lvert\Delta_{gm}\right\rvert+(1-y_{gm})\left\lvert-\Delta_{gm}\right\rvert,$$
$$y_{gm}=\mathbf{1}\left[P_{\text{train}}(g=1, m=1) > P_{\text{train}}(g=1)\, P_{\text{train}}(m=1)\right],$$
and
$$\Delta_{gm} = \begin{cases} P_{\text{test}}(\hat{m}=1 \mid g=1) - P_{\text{train}}(m=1 \mid g=1) & \text{if measuring } G\rightarrow M,\\ P_{\text{test}}(\hat{g}=1 \mid m=1) - P_{\text{train}}(g=1 \mid m=1) & \text{if measuring } M\rightarrow G. \end{cases}$$
Unlike $\text{Multi}_{\text{MALS}}$, $\text{Multi}_{\rightarrow}$ captures both positive and negative correlations: it iterates over all $g\in\mathcal{G}$ regardless of whether $\text{bias}_{\text{train}}(m, g) > \lvert \mathcal{G}\rvert^{-1}$. Moreover, $\text{Multi}_{\rightarrow}$ takes into account the base rates for group membership and disentangles bias amplification arising from the group influencing the attribute prediction ($\text{Multi}_{G \rightarrow M}$) from bias amplification arising from the attributes influencing the group prediction ($\text{Multi}_{M \rightarrow G}$).
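A sketch of the $G \rightarrow M$ direction, assuming the probability tables have been precomputed into plain dicts; all names and the dict layout are illustrative assumptions:

```python
from statistics import pvariance


def multi_directional_g_to_m(p_g, p_m, p_gm, cond_train, cond_test,
                             groups, attr_sets):
    """Sketch of Multi_-> in the G->M direction.

    Probability tables (all plain dicts):
      p_g[g]              = P_train(g=1)
      p_m[m]              = P_train(m=1)
      p_gm[(g, m)]        = P_train(g=1, m=1)
      cond_train[(g, m)]  = P_train(m=1 | g=1)
      cond_test[(g, m)]   = P_test(m_hat=1 | g=1)
    """
    terms, deltas = [], []
    for g in groups:
        for m in attr_sets:
            # y_gm: is (g, m) positively correlated in the training set?
            y = 1 if p_gm[(g, m)] > p_g[g] * p_m[m] else 0
            d = cond_test[(g, m)] - cond_train[(g, m)]
            deltas.append(d)
            # mirrors the y-weighted form of X in the text
            terms.append(y * abs(d) + (1 - y) * abs(-d))
    mean = sum(terms) / (len(groups) * len(attr_sets))
    return mean, pvariance(deltas)
```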

Comparison with existing metrics

Our proposed metrics have three advantages over existing single-attribute co-occurrence-based metrics.

(Advantage 1) Our metrics account for co-occurrences with multiple attributes

If a model, for example, learns that the combination of $a_1$ and $a_2$, i.e., $m=\{a_1, a_2\}\in\mathcal{M}$, is correlated with $g\in\mathcal{G}$, it can exploit this correlation, potentially leading to bias amplification. By iterating over all $m \in \mathcal{M}$, our proposed metrics account for amplification from both single and multiple attributes, capturing sets of attributes exhibiting amplification that existing metrics do not account for.

(Advantage 2) Negative and positive values do not cancel each other out

Existing metrics calculate bias amplification by aggregating the difference in bias scores over each attribute individually. Suppose a dataset has two annotated attributes $a_1$ and $a_2$. It is possible that $\Delta_{ga_1} \approx -\Delta_{ga_2}$. Since our metrics use absolute values, we ensure that positive and negative bias amplifications per attribute do not cancel each other out.
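A toy numeric example of this cancellation effect, with hypothetical delta values:

```python
# Hypothetical per-attribute changes in bias score: the model over-predicts
# (g, a1) by 0.2 and under-predicts (g, a2) by 0.2.
deltas = [0.2, -0.2]

# Signed aggregation (existing metrics): the two errors cancel and the
# model appears to amplify no bias at all.
signed_mean = sum(deltas) / len(deltas)

# Absolute aggregation (our metrics): both amplifications are counted.
absolute_mean = sum(abs(d) for d in deltas) / len(deltas)

print(signed_mean, absolute_mean)  # 0.0 vs 0.2
```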

(Advantage 3) Our metrics are more interpretable

There is a lack of intuition as to the "ideal" bias amplification value. One interpretation is that smaller values are more desirable. This becomes less clear when values are negative, as occurs in several bias mitigation works10, 12. Negative bias amplification indicates that bias in the predictions is in the opposite direction from that in the training set. However, this is not always ideal. First, there often exists a trade-off between performance and smaller bias amplification values. Second, high-magnitude negative bias amplification may lead to erasure of certain groups. For example, in imSitu, $\text{bias}_{\text{train}}(\{\text{typing}\}, \text{female}) = 0.52$. Negative bias amplification signifies that the model underpredicts $(\{\text{typing}\}, \text{female})$, which could reinforce negative gender stereotypes1.

Instead, we may want to minimize the distance between the bias amplification value and zero. This interpretation offers the advantage that large negative values are also undesirable. However, a potential dilemma occurs when interpreting two values with the same magnitude but opposite signs, which is a value-laden decision that depends on the system's context. Additionally, under this alternative interpretation, Advantage 2 becomes more pressing: when positive and negative amplifications cancel, we interpret models as less biased than they are in practice.

Our proposed metrics are easy to interpret. Since we use absolute differences, the ideal value is unambiguously zero. Further, reporting variance provides intuition as to whether amplification is uniform across all attribute-group pairs or if particular pairs are more amplified.


  1. J. Zhao, T. Wang, M. Yatskar, V. Ordonez, and K.-W. Chang, “Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints,” in EMNLP, 2017.
  2. T. Wang, J. Zhao, M. Yatskar, K.-W. Chang, and V. Ordonez, “Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations,” in ICCV, 2019.
  3. Y. Hirota, Y. Nakashima, and N. Garcia, “Quantifying Societal Bias Amplification in Image Captioning,” in CVPR, 2022.
  4. A. Wang and O. Russakovsky, “Directional Bias Amplification,” in ICML, 2021.
  5. T.-Y. Lin et al., “Microsoft COCO: Common Objects in Context,” in ECCV, 2014.
  6. M. Yatskar, L. Zettlemoyer, and A. Farhadi, “Situation recognition: Visual semantic role labeling for image understanding,” in CVPR, 2016.
  7. Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep Learning Face Attributes in the Wild,” in ICCV, 2015.
  8. Z. Li et al., “A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others,” in CVPR, 2023.
  9. Z. Li, A. Hoogs, and C. Xu, “Discover and Mitigate Unknown Biases with Debiasing Alternate Networks,” in ECCV, 2022.
  10. Z. Wang et al., “Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation,” in CVPR, 2020.
  11. S. Agarwal, S. Muku, S. Anand, and C. Arora, “Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias,” in WACV, 2022.
  12. V. V. Ramaswamy, S. S. Y. Kim, and O. Russakovsky, “Fair Attribute Classification through Latent Space De-biasing,” in CVPR, 2021.


This work was funded by Sony Research Inc. We thank William Thong and Julienne LaChance for their helpful comments and suggestions.