Characteristics that make up human identity have become increasingly embedded in technological systems. Human characteristics like age, gender, race, and sexuality are being folded into the categorical structures of automated systems, such as algorithmic computer vision methods. However, these characteristics are often complex, nuanced, and fluid, and they are linked to social and historical patterns of bias and discrimination. Reducing them to simple, discrete categories creates tensions that can clash with human values and identities, with serious consequences for already marginalized populations.
To mitigate these risks, we are researching how to develop algorithms that are sensitive to the nuanced identities held and expressed by the people they classify. Our aim is to inform design approaches that are empowering and safe for all users.
Anthony Pinter, Katie Gach, Aaron Jiang, Morgan Klaus Scheuerman, Jed Brubaker
@article{scheuermanDataWorkers2025,
author = {Scheuerman, Morgan Klaus and Woodruff, Allison and Brubaker, Jed R.},
title = {How Data Workers Shape Datasets: The Role of Positionality in Data Collection and Annotation for Computer Vision},
year = {2025},
journal = {Proceedings of the ACM on Human-Computer Interaction},
volume = {9},
number = {CSCW},
articleno = {300},
doi = {10.1145/3757481},
url = {https://doi.org/10.1145/3757481},
keywords = {data work, positionality, computer vision, machine learning, annotation},
tags = {positionalML, identity-and-algorithms}
}
Data workers play a key role in the big data industry. Clients hire data workers to collect and annotate data with human identity concepts, like demographic categories or clothing items. Through interviews and ethnographic observations of data workers, we show how worker positionality influences decisions during data work. We propose positional (il)legibility as an approach to data work that embraces the reality of positionality in classification practices.
@article{scheuermanTransphobiaLLM2025,
author = {Scheuerman, Morgan Klaus and Weathington, Katy and Petterson, Adrian and Doyle, Dylan Thomas and Das, Dipto and DeVito, Michael Ann and Brubaker, Jed R.},
title = {Transphobia Is in the Eye of the Prompter: Trans-Centered Perspectives on Large Language Models},
year = {2025},
journal = {ACM Transactions on Computer-Human Interaction},
volume = {32},
number = {5},
articleno = {52},
numpages = {42},
doi = {10.1145/3743676},
url = {https://doi.org/10.1145/3743676},
keywords = {LLMs, transgender, bias, transphobia, generative AI},
tags = {transGPT, identity-and-algorithms}
}
Large language models (LLMs) are rapidly being integrated into products and services, often in chatbots. LLM-powered chatbots are expected to respond to any number of topics, including topics central to gender identity. In light of rising anti-trans discourse, we examined how two popular LLMs responded to real-world English-language questions about trans identity taken from Quora. We employed reflexive analysis that centered our situated knowledges of the trans community. We found that LLMs return pro-trans responses, even when presented with highly transphobic user prompts. While we also found highly transphobic LLM responses, anti-trans sentiment in LLMs was often subtle, requiring a deep positional understanding from diverse trans stakeholders to interpret.
@incollection{pinterConservateur2025,
author = {Pinter, Anthony T. and Margaret, Annie and Brubaker, Jed R.},
title = {Conservateur of a Former Self: Algorithmic Curation, Identity Exhibition, and Digital Well-Being},
year = {2025},
booktitle = {Oxford Intersections: Social Media in Society and Culture},
publisher = {Oxford University Press},
editor = {Khan, Laeeq},
doi = {10.1093/9780198945253.003.0113},
url = {https://doi.org/10.1093/9780198945253.003.0113},
keywords = {online identity, algorithms, self-presentation, life transitions, relationship dissolution, death},
tags = {identity-and-algorithms, breakups}
}
This research explores how people do identity in online spaces, building on Goffman’s dramaturgical approach and Hogan’s exhibitional approach. We introduce the conservateur role to describe how people make choices about their identity collection and what data the algorithmic curator can access to create identity exhibitions.
@inproceedings{scheuerman2024,
author = {Scheuerman, Morgan Klaus and Brubaker, Jed R.},
title = {Products of Positionality: How Tech Workers Shape Identity Concepts in Computer Vision},
year = {2024},
isbn = {9798400703300},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3613904.3641890},
doi = {10.1145/3613904.3641890},
booktitle = {Proceedings of the CHI Conference on Human Factors in Computing Systems},
articleno = {762},
numpages = {18},
keywords = {Tech work, computer vision, identity, machine learning, positionality, work studies},
location = {Honolulu, HI, USA},
series = {CHI '24},
tags = {identity-and-algorithms}
}
There has been a great deal of scholarly attention on issues of identity-related bias in machine learning. Much of this attention has focused on data and data workers, the workers who complete annotation tasks. Yet tech workers—like engineers, data scientists, and researchers—introduce their own “biases” when defining “identity” concepts. More specifically, they instill their own positionalities: the ways they understand and are shaped by the world around them. Through interviews with industry tech workers who focus on computer vision, we show how workers embed their own positional perspectives into products and how positional gaps can lead to unforeseen and undesirable outcomes. We discuss how worker positionality is mutually shaped by the contexts in which workers are embedded. We provide implications for researchers and practitioners to engage with the positionalities of tech workers, as well as those outside of development who influence tech workers.
@inproceedings{das2024,
author = {Das, Dipto and Guha, Shion and Brubaker, Jed R. and Semaan, Bryan},
title = {The ``Colonial Impulse'' of Natural Language Processing: An Audit of Bengali Sentiment Analysis Tools and Their Identity-based Biases},
year = {2024},
isbn = {9798400703300},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3613904.3642669},
doi = {10.1145/3613904.3642669},
booktitle = {Proceedings of the CHI Conference on Human Factors in Computing Systems},
articleno = {769},
numpages = {18},
keywords = {Algorithmic audit, Bias, Colonial, Identity, Sentiment analysis tools},
location = {Honolulu, HI, USA},
series = {CHI '24},
tags = {identity-and-algorithms}
}
While colonization has sociohistorically impacted people’s identities across various dimensions, colonial values and biases continue to be perpetuated by sociotechnical systems. One category of sociotechnical systems, sentiment analysis tools, can also perpetuate colonial values and biases, yet little attention has been paid to how such tools may be complicit in perpetuating coloniality, even though they are often used to guide various practices (e.g., content moderation). In this paper, we explore potential bias in sentiment analysis tools in the context of Bengali communities, who have experienced and continue to experience the impacts of colonialism. Drawing on the identity categories most impacted by colonialism among local Bengali communities, we focused our analytic attention on gender, religion, and nationality. We conducted an algorithmic audit of all sentiment analysis tools for Bengali available on the Python Package Index (PyPI) and GitHub. Despite similar semantic content and structure, our analyses showed that, in addition to inconsistencies in output across tools, Bengali sentiment analysis tools exhibit bias between identity categories and respond differently to different ways of expressing identity. Connecting our findings with the colonially shaped sociocultural structures of Bengali communities, we discuss the implications of downstream bias in sentiment analysis tools.
@article{Scheuerman2023a,
author = {Scheuerman, Morgan Klaus and Weathington, Katy and Mugunthan, Tarun and Denton, Emily and Fiesler, Casey},
title = {From Human to Data to Dataset: Mapping the Traceability of Human Subjects in Computer Vision Datasets},
year = {2023},
issue_date = {April 2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {7},
number = {CSCW1},
url = {https://doi.org/10.1145/3579488},
doi = {10.1145/3579488},
journal = {Proc. ACM Hum.-Comput. Interact.},
month = apr,
articleno = {55},
numpages = {33},
keywords = {machine learning, datasets, data subjects, data ethics, computer vision},
tags = {identity-and-algorithms}
}
Computer vision is a "data hungry" field. Researchers and practitioners who work on human-centric computer vision, like facial recognition, emphasize the necessity of vast amounts of data for more robust and accurate models. Humans are seen as a data resource which can be converted into datasets. The necessity of data has led to a proliferation of gathering data from easily available sources, including "public" data from the web. Yet the use of public data has significant ethical implications for the human subjects in datasets. We bridge academic conversations on the ethics of using publicly obtained data with concerns about privacy and agency associated with computer vision applications. Specifically, we examine how practices of dataset construction from public data (not only from websites, but also from public settings and public records) make it extremely difficult for human subjects to trace their images as they are collected, converted into datasets, distributed for use, and, in some cases, retracted. We discuss two interconnected barriers current data practices present to providing an ethics of traceability for human subjects: awareness and control. We conclude with key intervention points for enabling traceability for data subjects. We also offer suggestions for an improved ethics of traceability to enable both awareness and control for individual subjects in dataset curation practices.
@inproceedings{Katzman2023,
author = {Katzman, Jared and Wang, Angelina and Scheuerman, Morgan Klaus and Blodgett, Su Lin and Laird, Kristen and Wallach, Hanna and Barocas, Solon},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
title = {{Taxonomizing and Measuring Representational Harms: A Look at Image Tagging}},
year = {2023},
doi = {10.1609/aaai.v37i12.26670},
url = {https://doi.org/10.1609/aaai.v37i12.26670},
tags = {identity-and-algorithms}
}
@article{Scheuerman2021-bigdata-autoessentalization,
title = {Auto-essentialization: Gender in automated facial analysis as extended colonial project},
author = {Scheuerman, Morgan Klaus and Pape, Madeleine and Hanna, Alex},
journal = {Big Data \& Society},
volume = {8},
number = {2},
pages = {20539517211053712},
year = {2021},
publisher = {SAGE Publications},
doi = {10.1177/20539517211053712},
url = {https://doi.org/10.1177/20539517211053712},
tags = {identity-and-algorithms}
}
@article{Scheuerman2020-cscw-databaseidentity,
title = {How {We’ve} {Taught} {Algorithms} to {See} {Identity}: {Constructing} {Race} and {Gender} in {Image} {Databases} for {Facial} {Analysis}},
author = {Scheuerman, Morgan Klaus and Wade, Kandrea and Lustig, Caitlin and Brubaker, Jed R.},
doi = {10.1145/3392866},
journal = {Proc. ACM Hum.-Comput. Interact.},
number = {CSCW1},
pages = {Article 58},
volume = {4},
year = {2020},
tags = {identity-and-algorithms, marginalization-and-safety, positionalML},
note = {Best Paper Award}
}
@article{Scheuerman2019-cscw-gender,
title = {How {Computers} {See} {Gender}: {An} {Evaluation} of {Gender} {Classification} in {Commercial} {Facial} {Analysis} and {Image} {Labeling} {Services}},
author = {Scheuerman, Morgan Klaus and Paul, Jacob M and Brubaker, Jed R.},
doi = {10.1145/3359246},
journal = {Proc. ACM Hum.-Comput. Interact.},
number = {CSCW},
pages = {Article 144},
volume = {3},
year = {2019},
tags = {marginalization-and-safety, identity-and-algorithms}
}
@article{Pinter2019-upsetcon,
title = {"Am {I} {Never} {Going} to {Be} {Free} of {All} {This} {Crap}?": {Upsetting} {Encounters} {With} {Algorithmically} {Curated} {Content} {About} {Ex-Partners}},
volume = {3},
number = {CSCW},
journal = {Proc. ACM Hum.-Comput. Interact.},
author = {Pinter, Anthony T. and Jiang, Jialun "Aaron" and Gach, Katie Z. and Sidwell, Melanie M. and Dykes, James E. and Brubaker, Jed R.},
year = {2019},
pages = {Article 70},
doi = {10.1145/3359172},
tags = {identity-and-algorithms, breakups}
}
@inproceedings{Scheuerman2018a,
title = {Gender is not a {Boolean}: {Towards} {Designing} {Algorithms} to {Understand} {Complex} {Human} {Identities}},
author = {Scheuerman, Morgan Klaus and Brubaker, Jed R.},
booktitle = {Participation+Algorithms Workshop at CSCW 2018},
year = {2018},
tags = {identity-and-algorithms}
}
Algorithmic methods are increasingly used to identify and categorize human characteristics. A range of human identities, such as gender, race, and sexual orientation, are becoming interwoven with systems. We discuss the case of automatic gender recognition technologies that algorithmically assign binary gender categories. Based on our previous work with transgender participants, we discuss the ways current gender recognition systems misrepresent complex gender identities and undermine safety. We describe plans to build on this by conducting participatory design workshops with designers and potential users to develop improved methods for conceptualizing gender identity in algorithms.
@inproceedings{Hamidi2018,
author = {Hamidi, Foad and Scheuerman, Morgan Klaus and Branham, Stacy M.},
title = {Gender Recognition or Gender Reductionism?: The Social Implications of Embedded Gender Recognition Systems},
booktitle = {Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems},
series = {CHI '18},
year = {2018},
isbn = {978-1-4503-5620-6},
location = {Montreal QC, Canada},
pages = {8:1--8:13},
articleno = {8},
numpages = {13},
url = {http://doi.acm.org/10.1145/3173574.3173582},
doi = {10.1145/3173574.3173582},
acmid = {3173582},
publisher = {ACM},
address = {New York, NY, USA},
tags = {identity-and-algorithms}
}
Automatic Gender Recognition (AGR) refers to various computational methods that aim to identify an individual’s gender by extracting and analyzing features from images, video, and/or audio. Applications of AGR are increasingly being explored in domains such as security, marketing, and social robotics. However, little is known about stakeholders’ perceptions and attitudes towards AGR and how this technology might disproportionately affect vulnerable communities. To begin to address these gaps, we interviewed 13 transgender individuals, including three transgender technology designers, about their perceptions and attitudes towards AGR. We found that transgender individuals have overwhelmingly negative attitudes towards AGR and fundamentally question whether it can accurately recognize such a subjective aspect of their identity. They raised concerns about privacy and potential harms that can result from being incorrectly gendered, or misgendered, by technology. We present a series of recommendations on how to accommodate gender diversity when designing new digital systems.
@article{pinterWorkingErasingYou2024,
title = {I'm Working on Erasing You, Just Don't Have the Proper Tools: Supporting Online Identity Management After the End of Romantic Relationships},
author = {Pinter, Anthony T. and Brubaker, Jed R.},
year = {2024},
journal = {Proceedings of the ACM on Human-Computer Interaction},
volume = {8},
pages = {66:1--66:32},
doi = {10.1145/3637343},
number = {CSCW1},
keywords = {digital identity, empirical work, journal, life transitions, relationship dissolution, social media},
tags = {identity-and-algorithms, breakups}
}
After a break-up, people are left with data representative of their lost relationship: pictures, posts, and connections that exist because of that relationship. As part of breaking up and moving on, people often make decisions about managing that data. Prior work has identified two broad curatorial philosophies people adopt in data management: archivists and revisionists. However, what drives individuals to one approach remains unknown, making it difficult to design sociotechnical systems that support these decisions. Through focus group interviews with couples still together, we present a decision-making framework for data management. We outline factors that can influence an individual's decision to act as an archivist or a revisionist in the wake of a break-up. From our data and framework, we identify six implications for design to improve user experiences in the wake of a break-up and offer concrete design suggestions for social media platforms.
@article{baumerAlgorithmicSubjectivities2024,
title = {Algorithmic Subjectivities},
author = {Baumer, Eric P. S. and Taylor, Alex S. and Brubaker, Jed R. and McGee, Micki},
year = {2024},
journal = {ACM Transactions on Computer-Human Interaction},
doi = {10.1145/3660344},
keywords = {algorithms, journal, reflective HCI, subjectivity},
tags = {post-userism, identity-and-algorithms}
}
This paper considers how subjectivities are enlivened in algorithmic systems. We first review related literature to clarify how we see "subjectivities" as emerging through a tangled web of processes and actors. We then offer two case studies exemplifying the emergence of algorithmic subjectivities: one involving computational topic modeling of blogs written by parents with children on the autism spectrum, and one involving algorithmic moderation of social media content. Drawing on these case studies, we then articulate a series of qualities that characterizes algorithmic subjectivities. We also compare and contrast these qualities with a number of related concepts from prior literature to articulate how algorithmic subjectivities constitutes a novel theoretical contribution, as well as how it offers a focal lens for future empirical investigation and for design. In short, this paper points out how certain worlds are being made and/or being made possible via algorithmic systems, and it asks HCI to consider what other worlds might be possible.
How We’ve Taught Algorithms to See Identity
Break-ups Suck. They Could Suck Less.
Even after blocking an ex on Facebook, the platform promotes painful reminders
How social media makes breakups that much worse
The Problem With Putting Social Media in Charge of Our Memories