llm-ification

Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review

Welcome to the GitHub repository for the paper “Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review” 📄 [Preprint].

🌀 Abstract

Large language models (LLMs) have been positioned to revolutionize HCI by reshaping not only the interfaces, design patterns, and sociotechnical systems that we study, but also the research practices we use. To date, however, there has been little understanding of LLMs’ uptake in HCI. We address this gap via a systematic literature review of 153 CHI papers from 2020-24 that engage with LLMs. We taxonomize: (1) domains where LLMs are applied; (2) roles of LLMs in HCI projects; (3) contribution types; and (4) acknowledged limitations and risks. We find LLM work in 10 diverse domains, primarily via empirical and artifact contributions. Authors use LLMs in five distinct roles, including as research tools or simulated users. Still, authors often raise validity and reproducibility concerns, and overwhelmingly study closed models. We outline opportunities to improve HCI research with and on LLMs, and provide guiding questions for researchers to consider the validity and appropriateness of LLM-related work.

🏺 Taxonomy

📋 Application Domains, LLM Roles, Limitations & Risks 📋

This refers to Table 1 in the paper, which presents our main taxonomy: domains where LLM applications are developed, roles of LLMs in HCI projects, and acknowledged risks and limitations. Note that we did not include contribution types in this table. A paper can have multiple (sub-)codes.

🎭 LLM Roles throughout common HCI research stages 🎭

A flowchart showing the roles that LLMs can play throughout common HCI research stages. The stages are Research Questions, Systems, Users, Analyses, and Publication & Deployment. Three colored arrows trace typical study paths. The green arrow connects all five stages from left to right and denotes system-building studies. The orange arrow connects Research Questions, Users, Analyses, and Publication & Deployment and denotes user studies that do not build systems. The purple arrow connects Research Questions, Analyses, and Publication & Deployment and denotes data science studies.
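The flowchart described above can also be written down as a small data structure. The following is a minimal sketch, not part of the paper or this repository: the names `STAGES`, `STUDY_PATHS`, and `skipped_stages` are hypothetical and chosen only for illustration. It encodes the five stages and the three arrows, then reports which stages each study type bypasses.

```python
# Illustrative encoding of the flowchart; all identifiers here are hypothetical.
STAGES = [
    "Research Questions",
    "Systems",
    "Users",
    "Analyses",
    "Publication & Deployment",
]

# Each study type follows one colored arrow through a subset of the stages.
STUDY_PATHS = {
    # Green arrow: passes through all five stages.
    "system-building study": list(STAGES),
    # Orange arrow: no Systems stage.
    "user study without system building": [
        "Research Questions",
        "Users",
        "Analyses",
        "Publication & Deployment",
    ],
    # Purple arrow: neither Systems nor Users.
    "data science study": [
        "Research Questions",
        "Analyses",
        "Publication & Deployment",
    ],
}


def skipped_stages(study_type: str) -> list[str]:
    """Return the stages, in flowchart order, that a study type does not pass through."""
    path = set(STUDY_PATHS[study_type])
    return [stage for stage in STAGES if stage not in path]


if __name__ == "__main__":
    for study_type in STUDY_PATHS:
        print(f"{study_type} skips: {skipped_stages(study_type) or 'nothing'}")
```

Keeping the stages in one ordered list and the arrows as subsets of it makes it easy to check, for any study type, where an LLM role could or could not appear.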

🌐 Paper Collections

Our analysis will evolve as we collect and analyze papers from CHI and other publication venues. If we missed your paper, please feel free to submit a pull request, open an issue, or ✉️ email us! We’d love to include your work, and together we can make this collection more comprehensive.

Title Year
Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents 2020
Mirror Ritual: An Affective Interface for Emotional Self-Reflection 2020
The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers 2021
Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd Ideation 2021
Stylette: Styling the Web with Natural Language 2022
Pretty Princess vs. Successful Leader: Gender Roles in Greeting Card Messages 2022
Will AI Console Me when I Lose my Pet? Understanding Perceptions of AI-Mediated Email Writing 2022
TapType: Ten-finger text entry on everyday surfaces via Bayesian inference 2022
TypeAnywhere: A QWERTY-Based Text Entry Solution for Ubiquitous Computing 2022
Discovering the Syntax and Strategies of Natural Language Programming with Generative Language Models 2022
TaleBrush: Sketching Stories with Generative Pretrained Language Models 2022
CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities 2022
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts 2022
Co-Writing with Opinionated Language Models Affects Users’ Views 2023
Evaluating Large Language Models in Generating Synthetic HCI Research Data: a Case Study 2023
Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts 2023
Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication 2023
Synthetic Lies: Understanding AI-Generated Misinformation and Evaluating Algorithmic and Human Solutions 2023
Social Dynamics of AI Support in Creative Writing 2023
Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience 2023
Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals 2023
Enabling Conversational Interaction with Mobile UI using Large Language Models 2023
Personalized Quest and Dialogue Generation in Role-Playing Games: A Knowledge Graph- and Language Model-based Approach 2023
PopBlends: Strategies for Conceptual Blending with Large Language Models 2023
“The less I type, the better”: How AI Language Models can Enhance or Impede Communication for AAC Users 2023
A Mixed-Methods Approach to Understanding User Trust after Voice Assistant Failures 2023
Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting 2023
Do You Mind? User Perceptions of Machine Consciousness 2023
Moral Framing of Mental Health Discourse and Its Relationship to Stigma: A Comparison of Social Media and News 2023
AngleKindling: Supporting Journalistic Angle Ideation with Large Language Models 2023
“What It Wants Me To Say”: Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models 2023
Co-Writing Screenplays and Theatre Scripts with Language Models: Evaluation by Industry Professionals 2023
Embodying the Algorithm: Exploring Relationships with Large Language Models Through Artistic Performance 2023
Harnessing Biomedical Literature to Calibrate Clinicians’ Trust in AI Decision Support Systems 2023
Understanding the Benefits and Challenges of Deploying Conversational AI Leveraging Large Language Models for Public Health Intervention 2023
RELIC: Investigating Large Language Model Responses using Self-Consistency 2024
Generating Automatic Feedback on UI Mockups with Large Language Models 2024
MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Intervention 2024
Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation 2024
Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models 2024
HILL: A Hallucination Identifier for Large Language Models 2024
A Canary in the AI Coal Mine: American Jews May Be Disproportionately Harmed by Intellectual Property Dispossession in Large Language Model Training 2024
Designing Scaffolding Strategies for Conversational Agents in Dialog Task of Neurocognitive Disorders Screening 2024
SimUser: Generating Usability Feedback by Simulating Various Users Interacting with Mobile Applications 2024
Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models 2024
Jigsaw: Supporting Designers to Prototype Multimodal Applications by Chaining AI Foundation Models 2024
CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs 2024
How Do Analysts Understand and Verify AI-Assisted Data Analyses? 2024
From Paper to Card: Transforming Design Implications with Generative AI 2024
Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions 2024
Facilitating Self-Guided Mental Health Interventions Through Human-Language Model Interaction: A Case Study of Cognitive Restructuring 2024
VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through Large Language Models 2024
Unlock Life with a Chat(GPT): Integrating Conversational AI with Large Language Models into Everyday Lives of Autistic Individuals 2024
Enhancing UX Evaluation Through Collaboration with Conversational AI Assistants: Effects of Proactive Dialogue and Timing 2024
ABScribe: Rapid Exploration & Organization of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models 2024
Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the “general” audience 2024
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping 2024
Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support 2024
ContextCam: Bridging Context Awareness with Creative Human-AI Image Co-Creation 2024
Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation 2024
Deus Ex Machina and Personas from Large Language Models: Investigating the Composition of AI-Generated Persona Descriptions 2024
ReactGenie: A Development Framework for Complex Multimodal Interactions Using Large Language Models 2024
See Widely, Think Wisely: Toward Designing a Generative Multi-agent System to Burst Filter Bubbles 2024
Ivie: Lightweight Anchored Explanations of Just-Generated Code 2024
Empowering Calibrated (Dis-)Trust in Conversational Agents: A User Study on the Persuasive Power of Limitation Disclaimers vs. Authoritative Style 2024
Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling 2024
How AI Processing Delays Foster Creativity: Exploring Research Question Co-Creation with an LLM-based Agent 2024
OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs 2024
AXNav: Replaying Accessibility Tests from Natural Language 2024
Human-LLM Collaborative Annotation Through Effective Verification of LLM Labels 2024
Reducing the Search Space on demand helps Older Adults find Mobile UI Features quickly, on par with Younger Adults 2024
An Empathy-Based Sandbox Approach to Bridge the Privacy Gap among Attitudes, Goals, Knowledge, and Behaviors 2024
C2Ideas: Supporting Creative Interior Color Design Ideation with a Large Language Model 2024
Think Fast, Think Slow, Think Critical: Designing an Automated Propaganda Detection Tool 2024
Understanding the Impact of Long-Term Memory on Self-Disclosure with Large Language Model-Driven Chatbots for Public Health Intervention 2024
Evaluating Large Language Models on Academic Literature Understanding and Review: An Empirical Study among Early-stage Scholars 2024
Scientific and Fantastical: Creating Immersive, Culturally Relevant Learning Experiences with Augmented Reality and Large Language Models 2024
Marco: Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models 2024
Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language 2024
“As an AI language model, I cannot”: Investigating LLM Denials of User Requests 2024
Teachers, Parents, and Students’ perspectives on Integrating Generative AI into Elementary Literacy Education 2024
The HaLLMark Effect: Supporting Provenance and Transparent Use of Large Language Models in Writing with Interactive Visualization 2024
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM 2024
CloChat: Understanding How People Customize, Interact, and Experience Personas in Large Language Models 2024
Understanding the Role of Large Language Models in Personalizing and Scaffolding Strategies to Combat Academic Procrastination 2024
ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 2024
MindTalker: Navigating the Complexities of AI-Enhanced Social Engagement for People with Early-Stage Dementia 2024
A Piece of Theatre: Investigating How Teachers Design LLM Chatbots to Assist Adolescent Cyberbullying Education 2024
MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients’ Journaling 2024
Art or Artifice? Large Language Models and the False Promise of Creativity 2024
PlantoGraphy: Incorporating Iterative Design Process into Generative Artificial Intelligence for Landscape Rendering 2024
Eternagram: Probing Player Attitudes Towards Climate Change Using a ChatGPT-driven Text-based Adventure 2024
The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction 2024
Open Sesame? Open Salami! Personalizing Vocabulary Assessment-Intervention for Children via Pervasive Profiling and Bespoke Storybook Generation 2024
Silver-Tongued and Sundry: Exploring Intersectional Pronouns with ChatGPT 2024
Learning Agent-based Modeling with LLM Companions: Experiences of Novices and Experts Using ChatGPT & NetLogo Chat 2024
DirectGPT: A Direct Manipulation Interface to Interact with Large Language Models 2024
Human-Algorithmic Interaction Using a Large Language Model-Augmented Artificial Intelligence Clinical Decision Support System 2024
Automatic Macro Mining from Interaction Traces at Scale 2024
Towards Designing a Question-Answering Chatbot for Online News: Understanding Questions and Perspectives 2024
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing 2024
MUD: Towards a Large-Scale and Noise-Filtered UI Dataset for Modern Style UI Modeling 2024
VIVID: Human-AI Collaborative Authoring of Vicarious Dialogues from Lecture Videos 2024
PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers 2024
“If the Machine Is As Good As Me, Then What Use Am I?” - How the Use of ChatGPT Changes Young Professionals’ Perception of Productivity and Accomplishment 2024
CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models 2024
PANDALens: Towards AI-Assisted In-Context Writing on OHMD During Travels 2024
LLMR: Real-time Prompting of Interactive Worlds using Large Language Models 2024
Surgment: Segmentation-enabled Semantic Search and Creation of Visual Question and Feedback to Support Video-Based Surgery Learning 2024
Writer-Defined AI Personas for On-Demand Feedback Generation 2024
ChaCha: Leveraging Large Language Models to Prompt Children to Share Their Emotions about Personal Events 2024
Understanding Underground Incentivized Review Services 2024
Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology 2024
How Beginning Programmers and Code LLMs (Mis)read Each Other 2024
Selenite: Scaffolding Online Sensemaking with Comprehensive Overviews Elicited from Large Language Models 2024
Rambler: Supporting Writing With Speech via LLM-Assisted Gist Manipulation 2024
Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education 2024
The Promise and Peril of ChatGPT in Higher Education: Opportunities, Challenges, and Design Implications 2024
Narrating Fitness: Leveraging Large Language Models for Reflective Fitness Tracker Data Interpretation 2024
Human I/O: Towards a Unified Approach to Detecting Situational Impairments 2024
VAL: Interactive Task Learning with GPT Dialog Parsing 2024
From Text to Self: Users’ Perception of AIMC Tools on Interpersonal Communication and Self 2024
More than Model Documentation: Uncovering Teachers’ Bespoke Information Needs for Informed Classroom Integration of ChatGPT 2024
Putting Things into Context: Generative AI-Enabled Context Personalization for Vocabulary Learning Improves Learning Motivation 2024
EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria 2024
AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors 2024
If in a Crowdsourced Data Annotation Pipeline, a GPT-4 2024
“It’s the only thing I can trust”: Envisioning Large Language Model Use by Autistic Workers for Communication Assistance 2024
Designing Accessible Obfuscation Support for Blind Individuals’ Visual Privacy Management 2024
Testing, Socializing, Exploring: Characterizing Middle Schoolers’ Approaches to and Conceptions of ChatGPT 2024
DiaryMate: Understanding User Perceptions and Experience in Human-AI Collaboration for Personal Journaling 2024
Under the (neighbor)hood: Hyperlocal Surveillance on Nextdoor 2024
Rehearsal: Simulating Conflict to Teach Conflict Resolution 2024
Debate Chatbots to Facilitate Critical Thinking on YouTube: Social Identity and Conversational Style Make A Difference 2024
ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation 2024
The Role of Inclusion, Control, and Ownership in Workplace AI-Mediated Communication 2024
Co-Designing QuickPic: Automated Topic-Specific Communication Boards from Photographs for AAC-Based Language Instruction 2024
Integrating Expertise in LLMs: Crafting a Customized Nutrition Assistant with Refined Template Instructions 2024
Supporting Sensemaking of Large Language Model Outputs at Scale 2024
Bridging the Gulf of Envisioning: Cognitive Challenges in Prompt Based Interactions with LLMs 2024
The Value, Benefits, and Concerns of Generative AI-Powered Assistance in Writing 2024
Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models 2024
Intelligent Support Engages Writers Through Relevant Cognitive Processes 2024
Generative Echo Chamber? Effect of LLM-Powered Search Systems on Diverse Information Seeking 2024
CoPrompt: Supporting Prompt Sharing and Referring in Collaborative Natural Language Programming 2024
EmoEden: Applying Generative Artificial Intelligence to Emotional Learning for Children with High-Function Autism 2024
BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies 2024
AI-Driven Mediation Strategies for Audience Depolarisation in Online Debates 2024
AI-Augmented Brainwriting: Investigating the use of LLMs in group ideation 2024
“It’s a Fair Game”, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents 2024
Towards AI-Driven Healthcare: Systematic Optimization, Linguistic Analysis, and Clinicians’ Evaluation of Large Language Models for Smoking Cessation Interventions 2024
Advancing Patient-Centered Shared Decision-Making with AI Systems for Older Adult Cancer Patients 2024
A Design Space for Intelligent and Interactive Writing Assistants 2024

Citation

If you find this useful in your research, please consider citing this paper:

@article{pang2025understanding,
  title={Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review},
  author={Rock Yuren Pang and Hope Schroeder and Kynnedy Simone Smith and Solon Barocas and Ziang Xiao and Emily Tseng and Danielle Bragg},
  journal={arXiv preprint arXiv:2501.12557},
  year={2025},
  url={https://arxiv.org/abs/2501.12557}
}

Acknowledgement

This work would not have been possible without the research inspiration from my advisor at UW, Katharina Reinecke. We thank the anonymous reviewers for their valuable feedback. We also thank Jenn Wortman Vaughan, Kevin Feng, Mohammed Alsobay, Sachita Nishal, Harsh Kumar, Shivani Kapania, Katelyn Mei, Enhao Zhang, Sandy Kaplan, and many more friends and mentors at the University of Washington and Microsoft Research for their research inspiration, fun conversations, and helpful suggestions.

Contact

I’m looking forward to understanding this line of work beyond CHI. If you have feedback for our paper, or are interested in chatting or collaborating, please don’t hesitate to contact: Rock Yuren Pang <ypang2@cs.washington.edu>