When I started working with computers way back in 1995, one of the first lessons I learnt was to keep things simple because the more complicated or more layers you have in your system the more ways there are for things to go wrong and more attack surfaces are available for a bad actor to target. This was called the KISS (Keep It Simple Stupid) principle. With the current systems adding more and more complexity it feels like people have stopped following that advice. Especially with LLM/AI getting added there is a layer of complexity that is like a black box because we can’t know enough about the model being used, such as what data was used to train it, what biases are included (knowingly or unknowingly) into the model etc.
Where cars used to be simple mechanical devices they are now instead computers on wheels that are getting more and more complicated. As per IEEE, a typical car may use 100 million lines of code and this is without AI/Self Driving systems coming into the picture.
We now have AI systems running on Cars that use models to drive cars, decide when to stop and what rules to follow. To explore the risk, researchers at the University of California, Santa Cruz, and Johns Hopkins tested the AI systems and the large vision language models (LVLMs) underpinning them and found that they would reliably follow instructions if displayed on signs held up in their camera’s view. This research adds to the growing list of evidence that AI decision-making can easily be tampered with, which is a major concern because a lot of decisions are slowly being outsourced to these “AI” systems some of which can have serious consequences.
The researchers have published their findings in a paper where they introduce CHAI (Command Hijacking against embodied AI), a physical environment indirect prompt injection attack that exploits the multimodal language interpretation abilities of AI models.
Abstract: Embodied Artificial Intelligence (AI) promises to handle edge cases in robotic vehicle systems where data is scarce by using common-sense reasoning grounded in perception and action to generalize beyond training distributions and adapt to novel real-world situations. These capabilities, however, also create new security risks. In this paper, we introduce CHAI (Command Hijacking against embodied AI), a new class of prompt-based attacks that exploit the multimodal language interpretation abilities of Large Visual-Language Models (LVLMs). CHAI embeds deceptive natural language instructions, such as misleading signs, in visual input, systematically searches the token space, builds a dictionary of prompts, and guides an attacker model to generate Visual Attack Prompts. We evaluate CHAI on four LVLM agents; drone emergency landing, autonomous driving, and aerial object tracking, and on a real robotic vehicle. Our experiments show that CHAI consistently outperforms state-of-the-art attacks. By exploiting the semantic and multimodal reasoning strengths of next-generation embodied AI systems, CHAI underscores the urgent need for defenses that extend beyond traditional adversarial robustness.
Potential consequences include self-driving cars proceeding through crosswalks without regard to humans crossing it, taking passengers to a different destination (potentially allowing bad actors to kidnap people), getting the car into an accident by forcing it to ignore traffic rules/oncoming traffic.
Source: schneier.com: Prompt Injection Via Road Signs
– Suramya


