Blog

Adversarial pop-ups trick AI agents into clicking malicious links

November 10, 2024

Robot browsing the web — Image created with DALL-E 3

This article is part of our coverage of the latest in AI research.

Anthropic recently released a feature that enables Claude, its large language model (LLM), to automatically accomplish tasks by controlling the user’s computer, opening apps, and browsing the web. Visual agents that understand graphical user interfaces and perform actions are becoming frontiers of competition in the AI arms race. Google is reportedly working on its own version of a computer-controlling AI agent.

However, a new study shows that these agents are vulnerable to adversarial attacks that exploit their lack of real-world knowledge. Researchers from Georgia Tech, the University of Hong Kong, and Stanford University have demonstrated that carefully designed pop-up windows can easily trick visual agents.

These agents use vision-language models (VLMs) to interpret graphical user interfaces (GUI) like web pages or screenshots. Given a user request, the agent parses the visual information, locates the relevant elements on the page, and takes actions like clicking buttons or filling forms. This makes it possible to automate complex web tasks.

However, the web is full of deceptive practices designed to trick users into clicking malicious links or downloading malware. These include pop-up ads, fake download buttons, and countdown timers for fake deals. Experienced human users have learned to recognize these deceptive patterns and mostly ignore them.

The researchers decided to test if visual agents can recognize malicious pop-ups that are easily recognized by humans. They considered a threat model where attackers manipulate the observations of the agent, such as screenshots and accessibility tree data. The goal of the attacker is to trick the agent into clicking the malicious pop-up.

The researchers explored four different attack elements:

Attention Hook: A few words in the pop-up designed to grab the agent’s attention

Instruction: The desired behavior that the attacker wants the agent to follow

Information Banner: Information that either explains or misleads the agent about the purpose of the pop-up

ALT Descriptor: Textual information embedded in the accessibility tree to describe the content of the pop-up to screen readers.

They carefully designed and positioned adversarial pop-ups on web pages and tested their effects on several frontier VLMs, including different variants of GPT-4, Gemini, and Claude.

The results of the experiments show that all tested models were highly susceptible to the adversarial pop-ups, with attack success rates (ASR) exceeding 80% on some tests.

Even with basic defense techniques, such as explicitly instructing the agent to ignore pop-ups or adding advertisement notices, the agents still clicked on the malicious links. The researchers found that adding an instruction to ignore pop-ups only reduced the attack success rate by a maximum of 25%.

Surprisingly, the researchers found that detailed information about the web page, such as the user’s original query, the position of the pop-up, and the specific architecture of the agent’s framework, were not essential for carrying out successful attacks. A simple pop-up with generic instructions was enough to trick most agents into clicking the malicious link.

“Even though these pop-ups look very suspicious (by design) to human users, agents cannot distinguish the difference between pop-ups and typical digital content, even with extra advertisement notices,” the researchers write. “We believe more detailed and specific defense strategies are necessary to mitigate risks systematically.”

This study highlights the risks of the territory we’re entering as we unleash AI agents in environments that are designed for humans. The different reasoning mode of LLMs makes them susceptible to attacks that would not fool average human users. This further underscores the need to think about new design principles for applications and websites that will be accessed by AI agents.

Making data work for you: Challenges, innovations, and lessons learned

Under the hood: The Innovations powering DeepSeek’s AI breakthrough

What to know about Meta’s Llama 4 model family

Understanding OpenAI’s pivot to releasing open source models

What to know about Google Gemini 2.5 Pro

How looking differently at data can save your machine learning project

Building a solid data foundation for generative AI applications

These mistakes can ruin your machine learning project

How to choose metrics for your machine learning projects

How to start with machine learning in your organization

How Google’s Agent2Agent can boost AI productivity through inter-agent communication

Demystifying vibe coding: Hype, reality, and why you still need to…

What is Model Context Protocol (MCP)?

What to know about Claude 3.7 Sonnet, Anthropic’s new frontier language…

Everything you need to know about Grok-3

The cash flow conundrum: How technology is reshaping small business finance

What to know about the security of open-source machine learning models

Understanding the impact of open-source language models

What we learned from the deep learning revolution

AI21 Labs’ mission to make large language models get their facts…

Adversarial pop-ups trick AI agents into clicking malicious links

Like this:

Leave a ReplyCancel reply

Like this:

Leave a ReplyCancel reply

Discover more from TechTalks