publications

2026

  1. Measuring and Mitigating the Distributional Gap Between Real and Simulated User Behaviors
    Shuhaib Mehri, Philippe Laban, Sumuk Shashidhar, Marwa Abdulhai, Sergey Levine, Michel Galley, and Dilek Hakkani-Tür
    Preprint, 2026
  2. PSI-Bench: Towards Clinically Grounded and Interpretable Evaluation of Depression Patient Simulators
    Nguyen Khoi Hoang, Shuhaib Mehri, Tse-An Hsu, Yi-Jyun Sun, Quynh Xuan Nguyen Truong, Khoa D Doan, and Dilek Hakkani-Tür
    2026
  3. User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
    Yuren Hao, Shuhaib Mehri, ChengXiang Zhai, and Dilek Hakkani-Tür
    2026
  4. Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration
    Priyanka Kargupta, Shuhaib Mehri, Dilek Hakkani-Tür, and Jiawei Han
    2026
  5. MultiSessionCollab: Learning User Preferences with Memory to Improve Long-Term Collaboration
    Shuhaib Mehri, Priyanka Kargupta, Tal August, and Dilek Hakkani-Tür
    Preprint, 2026

2025

  1. Goal Alignment in LLM-Based User Simulators for Conversational AI
    Shuhaib Mehri, Xiaocheng Yang, Takyoung Kim, Gokhan Tur, Shikib Mehri, and Dilek Hakkani-Tür
    TACL 2025 (Oral at ACL 2026), 2025
  2. Must Read: A Systematic Survey of Computational Persuasion
    Nimet Beyza Bozdag, Shuhaib Mehri, Xiaocheng Yang, Hyeonjeong Ha, Zirui Cheng, Esin Durmus, Jiaxuan You, Heng Ji, Gokhan Tur, and Dilek Hakkani-Tür
    Preprint, 2025
  3. AURA: A Diagnostic Framework for Tracking User Satisfaction of Interactive Planning Agents
    Takyoung Kim*, Janvijay Singh*, Shuhaib Mehri*, Emre Can Acikgoz, Sagnik Mukherjee, Nimet Beyza Bozdag, Sumuk Shashidhar, Gokhan Tur, and Dilek Hakkani-Tür
    NeurIPS 2025 MTI-LLM Workshop, 2025
  4. Persuade Me If You Can: A Framework for Evaluating Persuasion Effectiveness and Susceptibility Among Large Language Models
    Nimet Beyza Bozdag, Shuhaib Mehri, Gokhan Tur, and Dilek Hakkani-Tür
    NeurIPS 2025 MTI-LLM Workshop, 2025
  5. Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis
    Shuhaib Mehri, Xiusi Chen, Heng Ji, and Dilek Hakkani-Tür
    EACL 2026 (Oral), 2025

2024

  1. Discourse Relation Recognition with Language Models Under Different Data Availability
    Shuhaib Mehri, Chuyuan Li, and Giuseppe Carenini
    In Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences (CODI 2025), 2024

2023

  1. Automatic Evaluation of Generative Models with Instruction Tuning
    Shuhaib Mehri and Vered Shwartz
    In Proceedings of the Third Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), 2023