🤖 How to trick an AI reviewer without hidden prompts
Researchers have discovered a vulnerability called Adversarial Repackaging. It allows for the artificial inflation of scientific paper scores by using only changes in text framing and structure, without altering the scientific essence. In tests, the attack was successful in 75.1% of cases.
🌍 AI reviewing systems could become objects of optimization not through scientific quality, but through the manipulation of model interpretative biases. This creates a risk of large-scale "gaming" of the system.
👤 If scientific communities move en masse to AI verification, the real value of discoveries could be diluted by presentations "trained" for AI that look convincing to the algorithm.
Source 1: https://arxiv.org/abs/2606.13044
