llm-ification

Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review

Welcome to the GitHub repository for the paper “Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review” 📄 [Preprint].

🌀 Abstract

Large language models (LLMs) have been positioned to revolutionize HCI by reshaping not only the interfaces, design patterns, and sociotechnical systems that we study, but also the research practices we use. To date, however, there has been little understanding of LLMs’ uptake in HCI. We address this gap via a systematic literature review of 153 CHI papers from 2020-24 that engage with LLMs. We taxonomize: (1) domains where LLMs are applied; (2) roles of LLMs in HCI projects; (3) contribution types; and (4) acknowledged limitations and risks. We find LLM work in 10 diverse domains, primarily via empirical and artifact contributions. Authors use LLMs in five distinct roles, including as research tools or simulated users. Still, authors often raise validity and reproducibility concerns, and overwhelmingly study closed models. We outline opportunities to improve HCI research with and on LLMs, and provide guiding questions for researchers to consider the validity and appropriateness of LLM-related work.

🏺 Taxonomy

📋 Application Domains, LLM Roles, Limitations & Risks 📋

This corresponds to Table 1 in the paper, which presents our main taxonomy: domains where LLM applications are developed, roles of LLMs in HCI projects, and acknowledged risks and limitations. Contribution types are not included in this table. A paper can have multiple (sub-)codes.

🎭 LLM Roles throughout common HCI research stages 🎭

🌐 Paper Collections

Our analysis will evolve as we collect and analyze papers from CHI and other publication venues. If we missed your paper, please feel free to submit a pull request, open an issue, or ✉️ email us! We’d love to include your work, and together we can make this collection more comprehensive.

Title	Year
Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents	2020
Mirror Ritual: An Affective Interface for Emotional Self-Reflection	2020
The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers	2021
Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd Ideation	2021
Stylette: Styling the Web with Natural Language	2022
Pretty Princess vs. Successful Leader: Gender Roles in Greeting Card Messages	2022
Will AI Console Me when I Lose my Pet? Understanding Perceptions of AI-Mediated Email Writing	2022
TapType: Ten-finger text entry on everyday surfaces via Bayesian inference	2022
TypeAnywhere: A QWERTY-Based Text Entry Solution for Ubiquitous Computing	2022
Discovering the Syntax and Strategies of Natural Language Programming with Generative Language Models	2022
TaleBrush: Sketching Stories with Generative Pretrained Language Models	2022
CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities	2022
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts	2022
Co-Writing with Opinionated Language Models Affects Users’ Views	2023
Evaluating Large Language Models in Generating Synthetic HCI Research Data: a Case Study	2023
Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts	2023
Comparing Sentence-Level Suggestions to Message-Level Suggestions in AI-Mediated Communication	2023
Synthetic Lies: Understanding AI-Generated Misinformation and Evaluating Algorithmic and Human Solutions	2023
Social Dynamics of AI Support in Creative Writing	2023
Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience	2023
Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals	2023
Enabling Conversational Interaction with Mobile UI using Large Language Models	2023
Personalized Quest and Dialogue Generation in Role-Playing Games: A Knowledge Graph- and Language Model-based Approach	2023
PopBlends: Strategies for Conceptual Blending with Large Language Models	2023
“The less I type, the better”: How AI Language Models can Enhance or Impede Communication for AAC Users	2023
A Mixed-Methods Approach to Understanding User Trust after Voice Assistant Failures	2023
Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting	2023
Do You Mind? User Perceptions of Machine Consciousness	2023
Moral Framing of Mental Health Discourse and Its Relationship to Stigma: A Comparison of Social Media and News	2023
AngleKindling: Supporting Journalistic Angle Ideation with Large Language Models	2023
“What It Wants Me To Say”: Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models	2023
Co-Writing Screenplays and Theatre Scripts with Language Models: Evaluation by Industry Professionals	2023
Embodying the Algorithm: Exploring Relationships with Large Language Models Through Artistic Performance	2023
Harnessing Biomedical Literature to Calibrate Clinicians’ Trust in AI Decision Support Systems	2023
Understanding the Benefits and Challenges of Deploying Conversational AI Leveraging Large Language Models for Public Health Intervention	2023
RELIC: Investigating Large Language Model Responses using Self-Consistency	2024
Generating Automatic Feedback on UI Mockups with Large Language Models	2024
MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Intervention	2024
Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation	2024
Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models	2024
HILL: A Hallucination Identifier for Large Language Models	2024
A Canary in the AI Coal Mine: American Jews May Be Disproportionately Harmed by Intellectual Property Dispossession in Large Language Model Training	2024
Designing Scaffolding Strategies for Conversational Agents in Dialog Task of Neurocognitive Disorders Screening	2024
SimUser: Generating Usability Feedback by Simulating Various Users Interacting with Mobile Applications	2024
Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models	2024
Jigsaw: Supporting Designers to Prototype Multimodal Applications by Chaining AI Foundation Models	2024
CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs	2024
How Do Analysts Understand and Verify AI-Assisted Data Analyses?	2024
From Paper to Card: Transforming Design Implications with Generative AI	2024
Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions	2024
Facilitating Self-Guided Mental Health Interventions Through Human-Language Model Interaction: A Case Study of Cognitive Restructuring	2024
VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through Large Language Models	2024
Unlock Life with a Chat(GPT): Integrating Conversational AI with Large Language Models into Everyday Lives of Autistic Individuals	2024
Enhancing UX Evaluation Through Collaboration with Conversational AI Assistants: Effects of Proactive Dialogue and Timing	2024
ABScribe: Rapid Exploration & Organization of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models	2024
Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the “general” audience	2024
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping	2024
Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support	2024
ContextCam: Bridging Context Awareness with Creative Human-AI Image Co-Creation	2024
Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation	2024
Deus Ex Machina and Personas from Large Language Models: Investigating the Composition of AI-Generated Persona Descriptions	2024
ReactGenie: A Development Framework for Complex Multimodal Interactions Using Large Language Models	2024
See Widely, Think Wisely: Toward Designing a Generative Multi-agent System to Burst Filter Bubbles	2024
Ivie: Lightweight Anchored Explanations of Just-Generated Code	2024
Empowering Calibrated (Dis-)Trust in Conversational Agents: A User Study on the Persuasive Power of Limitation Disclaimers vs. Authoritative Style	2024
Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling	2024
How AI Processing Delays Foster Creativity: Exploring Research Question Co-Creation with an LLM-based Agent	2024
OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs	2024
AXNav: Replaying Accessibility Tests from Natural Language	2024
Human-LLM Collaborative Annotation Through Effective Verification of LLM Labels	2024
Reducing the Search Space on demand helps Older Adults find Mobile UI Features quickly, on par with Younger Adults	2024
An Empathy-Based Sandbox Approach to Bridge the Privacy Gap among Attitudes, Goals, Knowledge, and Behaviors	2024
C2Ideas: Supporting Creative Interior Color Design Ideation with a Large Language Model	2024
Think Fast, Think Slow, Think Critical: Designing an Automated Propaganda Detection Tool	2024
Understanding the Impact of Long-Term Memory on Self-Disclosure with Large Language Model-Driven Chatbots for Public Health Intervention	2024
Evaluating Large Language Models on Academic Literature Understanding and Review: An Empirical Study among Early-stage Scholars	2024
Scientific and Fantastical: Creating Immersive, Culturally Relevant Learning Experiences with Augmented Reality and Large Language Models	2024
Marco: Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models	2024
Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language	2024
“As an AI language model, I cannot”: Investigating LLM Denials of User Requests	2024
Teachers, Parents, and Students’ perspectives on Integrating Generative AI into Elementary Literacy Education	2024
The HaLLMark Effect: Supporting Provenance and Transparent Use of Large Language Models in Writing with Interactive Visualization	2024
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM	2024
CloChat: Understanding How People Customize, Interact, and Experience Personas in Large Language Models	2024
Understanding the Role of Large Language Models in Personalizing and Scaffolding Strategies to Combat Academic Procrastination	2024
ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12	2024
MindTalker: Navigating the Complexities of AI-Enhanced Social Engagement for People with Early-Stage Dementia	2024
A Piece of Theatre: Investigating How Teachers Design LLM Chatbots to Assist Adolescent Cyberbullying Education	2024
MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients’ Journaling	2024
Art or Artifice? Large Language Models and the False Promise of Creativity	2024
PlantoGraphy: Incorporating Iterative Design Process into Generative Artificial Intelligence for Landscape Rendering	2024
Eternagram: Probing Player Attitudes Towards Climate Change Using a ChatGPT-driven Text-based Adventure	2024
The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction	2024
Open Sesame? Open Salami! Personalizing Vocabulary Assessment-Intervention for Children via Pervasive Profiling and Bespoke Storybook Generation	2024
Silver-Tongued and Sundry: Exploring Intersectional Pronouns with ChatGPT	2024
Learning Agent-based Modeling with LLM Companions: Experiences of Novices and Experts Using ChatGPT & NetLogo Chat	2024
DirectGPT: A Direct Manipulation Interface to Interact with Large Language Models	2024
Human-Algorithmic Interaction Using a Large Language Model-Augmented Artificial Intelligence Clinical Decision Support System	2024
Automatic Macro Mining from Interaction Traces at Scale	2024
Towards Designing a Question-Answering Chatbot for Online News: Understanding Questions and Perspectives	2024
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing	2024
MUD: Towards a Large-Scale and Noise-Filtered UI Dataset for Modern Style UI Modeling	2024
VIVID: Human-AI Collaborative Authoring of Vicarious Dialogues from Lecture Videos	2024
PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers	2024
“If the Machine Is As Good As Me, Then What Use Am I?” - How the Use of ChatGPT Changes Young Professionals’ Perception of Productivity and Accomplishment	2024
CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models	2024
PANDALens: Towards AI-Assisted In-Context Writing on OHMD During Travels	2024
LLMR: Real-time Prompting of Interactive Worlds using Large Language Models	2024
Surgment: Segmentation-enabled Semantic Search and Creation of Visual Question and Feedback to Support Video-Based Surgery Learning	2024
Writer-Defined AI Personas for On-Demand Feedback Generation	2024
ChaCha: Leveraging Large Language Models to Prompt Children to Share Their Emotions about Personal Events	2024
Understanding Underground Incentivized Review Services	2024
Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology	2024
How Beginning Programmers and Code LLMs (Mis)read Each Other	2024
Selenite: Scaffolding Online Sensemaking with Comprehensive Overviews Elicited from Large Language Models	2024
Rambler: Supporting Writing With Speech via LLM-Assisted Gist Manipulation	2024
Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education	2024
The Promise and Peril of ChatGPT in Higher Education: Opportunities, Challenges, and Design Implications	2024
Narrating Fitness: Leveraging Large Language Models for Reflective Fitness Tracker Data Interpretation	2024
Human I/O: Towards a Unified Approach to Detecting Situational Impairments	2024
VAL: Interactive Task Learning with GPT Dialog Parsing	2024
From Text to Self: Users’ Perception of AIMC Tools on Interpersonal Communication and Self	2024
More than Model Documentation: Uncovering Teachers’ Bespoke Information Needs for Informed Classroom Integration of ChatGPT	2024
Putting Things into Context: Generative AI-Enabled Context Personalization for Vocabulary Learning Improves Learning Motivation	2024
EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria	2024
AQuA: Automated Question-Answering in Software Tutorial Videos with Visual Anchors	2024
If in a Crowdsourced Data Annotation Pipeline, a GPT-4	2024
“It’s the only thing I can trust”: Envisioning Large Language Model Use by Autistic Workers for Communication Assistance	2024
Designing Accessible Obfuscation Support for Blind Individuals’ Visual Privacy Management	2024
Testing, Socializing, Exploring: Characterizing Middle Schoolers’ Approaches to and Conceptions of ChatGPT	2024
DiaryMate: Understanding User Perceptions and Experience in Human-AI Collaboration for Personal Journaling	2024
Under the (neighbor)hood: Hyperlocal Surveillance on Nextdoor	2024
Rehearsal: Simulating Conflict to Teach Conflict Resolution	2024
Debate Chatbots to Facilitate Critical Thinking on YouTube: Social Identity and Conversational Style Make A Difference	2024
ClassMeta: Designing Interactive Virtual Classmate to Promote VR Classroom Participation	2024
The Role of Inclusion, Control, and Ownership in Workplace AI-Mediated Communication	2024
Co-Designing QuickPic: Automated Topic-Specific Communication Boards from Photographs for AAC-Based Language Instruction	2024
Integrating Expertise in LLMs: Crafting a Customized Nutrition Assistant with Refined Template Instructions	2024
Supporting Sensemaking of Large Language Model Outputs at Scale	2024
Bridging the Gulf of Envisioning: Cognitive Challenges in Prompt Based Interactions with LLMs	2024
The Value, Benefits, and Concerns of Generative AI-Powered Assistance in Writing	2024
Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models	2024
Intelligent Support Engages Writers Through Relevant Cognitive Processes	2024
Generative Echo Chamber? Effect of LLM-Powered Search Systems on Diverse Information Seeking	2024
CoPrompt: Supporting Prompt Sharing and Referring in Collaborative Natural Language Programming	2024
EmoEden: Applying Generative Artificial Intelligence to Emotional Learning for Children with High-Function Autism	2024
BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies	2024
AI-Driven Mediation Strategies for Audience Depolarisation in Online Debates	2024
AI-Augmented Brainwriting: Investigating the use of LLMs in group ideation	2024
“It’s a Fair Game”, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents	2024
Towards AI-Driven Healthcare: Systematic Optimization, Linguistic Analysis, and Clinicians’ Evaluation of Large Language Models for Smoking Cessation Interventions	2024
Advancing Patient-Centered Shared Decision-Making with AI Systems for Older Adult Cancer Patients	2024
A Design Space for Intelligent and Interactive Writing Assistants	2024

Citation

If you find this useful in your research, please consider citing this paper:

@article{pang2025understanding,
  title={Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review},
  author={Rock Yuren Pang and Hope Schroeder and Kynnedy Simone Smith and Solon Barocas and Ziang Xiao and Emily Tseng and Danielle Bragg},
  journal={arXiv preprint arXiv:2501.12557},
  year={2025},
  url={https://arxiv.org/abs/2501.12557}
}

Acknowledgement

This work wouldn’t have been possible without the research inspiration from my advisor at UW, Katharina Reinecke. We thank the anonymous reviewers for their valuable feedback. We also thank Jenn Wortman Vaughan, Kevin Feng, Mohammed Alsobay, Sachita Nishal, Harsh Kumar, Shivani Kapania, Katelyn Mei, Enhao Zhang, Sandy Kaplan, and many more friends and mentors at the University of Washington and Microsoft Research for their research inspiration, fun conversations, and helpful suggestions.

Contact

I’m looking forward to understanding this line of work beyond CHI. If you have feedback for our paper, or are interested in chatting or collaborating, please don’t hesitate to contact: Rock Yuren Pang <ypang2@cs.washington.edu>