Model Breakdowns
Adversarial Explanation Attacks: How LLM Framing Preserves User Trust in Incorrect Outputs
Describes "adversarial explanation attacks": how the framing of an LLM's explanations keeps users trusting incorrect outputs. Reports a 205-participant study and offers practical mitigations for builders.