Overview

Recent advances in large language models (LLMs) have achieved superhuman performance across various domains and tasks, unlocking their potential in real-world applications. However, robustness and trustworthiness issues hinder their reliable deployment. Robustness issues refer to inconsistent model behavior under equivalent conditions, such as sensitivity to minor prompt variations. Trustworthiness issues include hallucinations, where LLMs produce factually incorrect or input-conflicting outputs, and fairness issues, such as biases toward certain races, genders, value systems, or languages. Addressing these issues is crucial for building reliable applications with these powerful yet vulnerable models.

We will conclude the tutorial by outlining future research directions in this area.

Speakers

Cheng-Kuang Wu
Appier
Zhi Rui Tam
NTU
Kuan-Hao Huang
Texas A&M University

Outline

Introduction

Definitions, importance, and taxonomy of robustness and trustworthiness issues

Adversarial attacks and jailbreaking

Intentional disruptions of model behavior (Zou et al., 2023; Chao et al., 2025).

Prompt variations

Effects of semantically equivalent prompt changes on model performance (Sclar et al., 2024).
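To make the issue concrete, the sketch below enumerates prompts that differ only in surface formatting, in the spirit of measuring performance spread across equivalent formats (Sclar et al., 2024); the specific field names and separators are illustrative assumptions, not the paper's exact search space.

```python
from itertools import product

# Illustrative (assumed) format dimensions: field names and separators.
SEPARATORS = [": ", " - ", ":\n"]
FIELD_NAMES = [("Question", "Answer"), ("Q", "A")]

def format_variants(question: str):
    """Yield semantically equivalent prompts that differ only in formatting."""
    for (q_name, a_name), sep in product(FIELD_NAMES, SEPARATORS):
        yield f"{q_name}{sep}{question}\n{a_name}{sep}"

variants = list(format_variants("What is 2 + 2?"))
# A robust model should perform similarly on all variants; the spread of
# accuracy across them quantifies the model's format sensitivity.
```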

Position bias

Bias toward information based on input position (Wang et al., 2025); impact on tasks like pairwise response evaluation (Zheng et al., 2023), multi-choice questions (Zheng et al., 2024), and retrieval-augmented generation (RAG) (Liu et al., 2024).
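A common mitigation for position bias in pairwise response evaluation is to query the judge twice with the response order swapped and accept only consistent verdicts (Zheng et al., 2023). The sketch below assumes a hypothetical `judge` callable returning "first" or "second"; it is an illustration of the swap-and-check idea, not any system's actual API.

```python
def debiased_pairwise_judgment(judge, prompt, resp_a, resp_b):
    """Query the judge twice with swapped order to control for position bias.

    `judge(prompt, first, second)` is a hypothetical callable that returns
    "first" or "second", indicating which response it prefers.
    """
    v1 = judge(prompt, resp_a, resp_b)  # A shown first
    v2 = judge(prompt, resp_b, resp_a)  # B shown first
    if v1 == "first" and v2 == "second":
        return "A"   # consistent preference for A
    if v1 == "second" and v2 == "first":
        return "B"   # consistent preference for B
    return "tie"     # inconsistent verdicts suggest position bias

# A judge that always prefers whichever response appears first is
# maximally position-biased, so its verdicts collapse to a tie.
always_first = lambda prompt, a, b: "first"
print(debiased_pairwise_judgment(always_first, "Q?", "ans A", "ans B"))  # → tie
```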

Hallucinations

Cases where an LLM’s generation is factually incorrect, contradicts its own earlier output, or conflicts with the provided input context (Huang et al., 2025).

Fairness and social biases

Biases related to ethnicity, gender, value systems, or languages.

Reasoning models

Specific issues introduced by LLMs trained to generate reasoning tokens before providing the final answer (Kumar et al., 2025).

Multimodal models

Issues introduced by models that process multimodal inputs (Tong et al., 2024).

Conclusion and open challenges

Summary of best practices for developing robust and trustworthy LLMs, open research questions, and future directions.