The event will take place at Howard University. See Directions for more details.

Schedule

Oral Presentations

  1. 14:15 - 14:25: SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
  2. 14:30 - 14:40: Intergroup Bias Affects Expressions of Social Judgment and Empathy in the r/AmITheAsshole Reddit Community
  3. 14:45 - 14:55: Function Words as Statistical Cues for Language Learning
  4. 15:00 - 15:10: Should We be Pedantic About Reasoning Errors in Machine Translation?

Poster Sessions

Session 1

  1. Syntactic Information Content in Word Duration and Pause
  2. muRST: an Analysis of Cross-lingual Variation in Discourse Structure
  3. Language Models Don’t Know What You Want: Evaluating Personalization in Deep Research Needs Real Users
  4. What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features
  5. HRI-GRPO: GRPO with Human Reference Injection for Long-Form Story Generation
  6. GUMBridge: a New, Genre Diverse Resource for Bridging Anaphora
  7. Acoustic Decolonization and Linguistic Erasure: Estimating Acoustic Model Bias in the Forced Alignment of Nigerian English
  8. Perception and social meanings of nuclear configurations in Nigerian English
  9. A Benchmark for Temporal Norm Understanding
  10. NeuroNote: Multi-Agent Grounded Reasoning for Reading-Order Aware Scientific Figure Explanation
  11. Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations
  12. Fine-grained Readability Controlled Summarization of Scientific Documents via Control Vectors
  13. Same Verdict, Different Reasons: LLM-as-a-Judge and Clinician Disagreement on Medical Chatbot Completeness
  14. Reference-Adjusted Summarization Evaluation
  15. The Persuasion Index: An Interpretable Framework for Quantifying Rhetorical Strategies Across Domains
  16. Fine-Tuning DistilBERT, DeBERTa and ModernBERT for Valence–Arousal Prediction and Change Estimation
  17. AI use in American newspapers is widespread, uneven, and rarely disclosed
  18. AutoFiction: Measuring AI ability to execute long-horizon writing tasks
  19. Can you map it to English? The Role of cross-lingual alignment in multilingual LLMs
  20. Beyond Hallucination: Temporal Knowledge Asymmetry as a Distinct Failure Mode in Large Language Models for Non-Western Knowledge Domains
  21. Artificial Intolerance: How Stigmatizing Language Skews Medical Decision-Making in LLMs
  22. Aware but Agreeable: LLMs Recognise Distress but Still Respond Unsafely in Delusional Conversations
  23. Punctuation Restoration for Linguistic Structure Learning
  24. Reheat Nachos for Dinner? Evaluating AI Support for Cross-Cultural Communication of Neologisms
  25. Credibility as a Societal Construct: Uncovering Cross-National Patterns of Information Perception in Public Discourse
  26. GeopoliScope: Tracing Geopolitical Stance Formation in Language Model Hidden Representations

Session 2

  1. What Do We Mean by “Pilot Study”: A Meta-Review of Pilot Study Reporting at CHI
  2. Take Out Your Calculators: Estimating the Real Difficulty of Question Items with LLM Student Simulations
  3. DiscoTrace: Representing and Comparing Answering Strategies of Humans and LLMs in Information-Seeking Question Answering
  4. Generating “It is, I think, a mistake” and other linguistic examples with LLMs
  5. Sycophancy Undermines Epistemic Vigilance in Cooperative Vision-Language Tasks
  6. Metaphors of AI in News Media: A Multimodal Exploration
  7. Structured Uncertainty Guided Clarification for LLM Agents
  8. Do Evaluation Metrics Detect Errors in Classical Chinese to English Translations?
  9. What AI Cannot Write: Substance Gaps in AI Non-Fiction Essays
  10. Tactical Decision Making, Towards a Large Language Model
  11. Systematic Review: A Framework for Understanding Microaggressions in Human-Centered AI
  12. Integrated Spoofing-Robust Automatic Speaker Verification via a Three-Class Formulation and LLR
  13. Syntactic Augmentation for In-Context Coptic Translation
  14. Seeing What Models Miss: Automatic Adversarial VQA Generation
  15. Evaluating Multi-Modal Large Language Models across Text and Audio Modalities for Accessible Disaster Assistance
  16. Towards Understanding Pre-training: Large Language Models Need Diverse Views To Learn Complex Knowledge
  17. The Cost of Clarity: Do AI-Generated Partisan Scaffolds Trade Willingness to Talk for Willingness to Act?
  18. Predicting 90-day Cardiovascular Disease Readmission in the ICU
  19. Ibom NLP: A Step Toward Inclusive Natural Language Processing for Nigeria’s Minority Languages
  20. Reversing Model Collapse in Large Language Models via Task Vector Negation
  21. Anchoring Depends on Confidence and Post-Training in Language Models
  22. Beyond “Summaries” and “Action Items”: Exploring AI Support for Sensemaking and Ideation from Past Research Meetings
  23. Studying the differences in communication strategies by academic faculty member ranks on an online academic forum
  24. Analyzing Emotional Expression in Online Mental Health Forums Across Ethnicity and Gender
  25. How do different Input Distributions Shape Constructional Acquisition in LMs? A Case Study of NPNs
  26. Apollo: A Modular Multi-Agent Ecosystem for Sports Analytics

Keynote Speakers

Kianté Brantley

Dr. Kianté Brantley

Assistant Professor of Computer Science
Harvard University


Dr. Kianté Brantley is an Assistant Professor of Computer Science at the Kempner Institute and the School of Engineering and Applied Sciences (SEAS) at Harvard University. His research focuses on problems at the intersection of machine learning and interactive decision-making, with the goal of improving the decision-making capabilities of foundation models. This involves studying, building, and improving techniques in reinforcement learning, imitation learning, and natural language processing.


Talk Title:
Regression as Policy Optimization: Advantages In, Policies Out

Abstract:
Reinforcement learning (RL) is an essential method for training large language models (LLMs), enabling better alignment with human preferences and enhanced reasoning capabilities. Nevertheless, RL post-training remains computationally demanding due to repeated rollouts, high-variance credit assignment, and the complexities of distributed systems that can introduce policy lag. A promising direction is to view KL-regularized policy optimization as a KL-prox (mirror-descent–style) step and solve it with a simple least-squares regression loss. This regression-based perspective addresses key RL challenges and enables more efficient post-training procedures.
In this talk, I will introduce a unified perspective encompassing three complementary approaches. A⋆-PO minimizes reliance on online sampling by utilizing offline value computation and optimal-advantage regression. OAPL supports scalable, fully off-policy regression, even when using stale rollouts and lagged policies in distributed settings. RDA2C offers a regularized dual-averaging approach that employs cumulative gradient information to stabilize updates and reduce variance, rather than relying on local, per-round mirror-descent steps. Although RDA2C has been primarily assessed on standard RL benchmarks rather than comprehensive LLM post-training, its focus on variance reduction and data reuse aligns closely with the stability challenges encountered in large-scale LLM alignment. Collectively, these methods provide an efficient toolkit for RL-based LLM post-training and present opportunities for further research and scalable deployment.
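To make the regression-based view concrete, the following is a minimal sketch of the standard KL-prox step; the notation is ours, and the exact objectives in A⋆-PO, OAPL, and RDA2C differ in their details. The KL-regularized policy optimization problem

  \max_{\pi} \; \mathbb{E}_{a \sim \pi(\cdot\mid s)}\big[A(s,a)\big] \;-\; \beta\,\mathrm{KL}\big(\pi(\cdot\mid s)\,\|\,\pi_{\mathrm{ref}}(\cdot\mid s)\big)

has the closed-form solution

  \pi^{*}(a \mid s) \;=\; \frac{1}{Z(s)}\,\pi_{\mathrm{ref}}(a \mid s)\,\exp\!\big(A(s,a)/\beta\big),

so taking logs gives \beta \log \frac{\pi^{*}(a \mid s)}{\pi_{\mathrm{ref}}(a \mid s)} = A(s,a) - \beta \log Z(s). A parametric policy \pi_{\theta} can therefore be fit by least-squares regression of its log-ratio onto the advantage,

  \min_{\theta} \; \mathbb{E}\Big[\big(\beta \log \tfrac{\pi_{\theta}(a \mid s)}{\pi_{\mathrm{ref}}(a \mid s)} \,+\, \beta \log Z(s) \,-\, A(s,a)\big)^{2}\Big],

which replaces high-variance policy-gradient updates with a supervised objective; the log-partition term \beta \log Z(s) can be estimated offline or absorbed into a per-prompt baseline.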


Mohit Iyyer

Dr. Mohit Iyyer

Associate Professor of Computer Science
University of Maryland, College Park


Dr. Mohit Iyyer is an associate professor in computer science at the University of Maryland, College Park and a member of CLIP. Previously, he was an associate professor at UMass CS, a Young Investigator at AI2, and a PhD student at UMD CS. His research focuses on natural language processing and large language models. He is currently excited about: (a) long-form text generation, (b) long-context language understanding, (c) AI agents for collaborative writing, and (d) AI-generated text detection.


Talk Title:
Detecting and characterizing AI use in collaborative writing

Abstract:
Many people now use AI to “help” write things. For instance, our recent study finds that roughly 9% of recently published American newspaper articles are produced either partially or entirely by LLMs. However, most of this AI use is undisclosed to readers, raising concerns about transparency and public trust. In this talk, I discuss the broad spectrum of human-AI writing, from minor style and grammar edits to the generation of large blocks of content, and consider what might be acceptable or unacceptable to a reader in the absence of disclosure.
I'll then discuss EditLens, an automatic method that performs post-hoc AI detection: given a piece of text, can we tell to what extent AI was used to assist in its writing? Next, I'll turn to the frontier of AI writing: how good can fully AI-generated writing actually get? To explore this, I'll introduce autofiction.ai, our new web platform where readers can freely read, rate, and discuss full-length novels produced by frontier AI agents under transparent disclosure. Finally, I'll describe StoryScope, which conceptually moves AI detection beyond surface-level stylistic cues (which future models may not exhibit) by analyzing discourse-level narrative features such as character agency, temporal complexity, and moral ambiguity.


Gloria Washington

Dr. Gloria Washington

Associate Professor of Computer Science
Howard University


Dr. Gloria Washington is currently an Associate Professor of Computer Science at Howard University in Washington, DC. She was recently awarded the EBONY Magazine Power 100 Award as a 2025 STEM Trailblazer. She runs the Affective Biometrics Lab with her bright students. Her most notable project is Project Elevate Black Voices (https://www.elevateblackvoiceshu.com). She is an empathetic technology researcher who focuses on the intersection of human-centered computing, affective computing, and biometrics. She likes to say her research seeks to give a voice to everyone who has felt silenced, asking questions like: how can technology foster positive human emotions while reducing barriers to entering the field, and how can technology build lasting social impact by requiring people to feel empathy rather than look away?


Talk Title:
The Origins of Our Discontent with Voice Assistant Technology

Abstract:
Despite their rapid diffusion into homes, appliances, and workplaces, voice assistant technologies continue to inspire widespread frustration, mistrust, and uneven adoption. Often framed as intuitive, neutral, and universally accessible interfaces to computation, these tools fail for many speakers of marginalized dialects and languages. This talk introduces the concept of Dialect Caste to explain the structural origins of our discontent with voice assistant technology.
Dialect Caste describes the hierarchical ordering of speech varieties within automated speech recognition (ASR) systems, in which dominant, standardized forms of language are treated as normative while other ways of speaking are systematically misrecognized or devalued. Drawing on sociolinguistics, critical race studies, and human–computer interaction, the talk situates contemporary voice assistants within longer histories of linguistic stratification tied to race, class, colonialism, and power.
It examines how core ASR techniques (statistical modeling, data-driven acoustic training, and lexicon design) have traditionally failed speakers of African American English and African creole-based languages. These failures are not accidental: they stem from training data that underrepresents the distinctive morphosyntax, phonology, and prosody of AAE and African creole-based languages. The talk highlights Project Elevate Black Voices, led by Howard University, as a critical intervention that challenges these norms by centering Black speech communities in the design, collection, and evaluation of voice technologies. Rather than treating dialectal variation as noise to be corrected, the project reframes linguistic diversity as a design requirement and a site of epistemic authority. Finally, by analyzing everyday breakdowns in voice interaction through the lens of Dialect Caste, the talk argues that technical misrecognition produces social harm, reinforcing longstanding patterns of linguistic discrimination while shifting the burden of adaptation onto marginalized speakers. The talk concludes by calling for dialect-aware ASR paradigms and participatory governance models that treat voice technologies not merely as engineering artifacts, but as institutions that mediate whose voices are heard, valued, and understood.