Why Your Interface Might Still Be Broken for AI Agents (Even with Computer Vision)

Imagine your most important user can now see your interface like a human, but still gets completely stuck on tasks that seem perfectly obvious.

This is the reality of modern AI agents interacting with our interfaces today.

The landscape changed dramatically in late 2024. AI agents no longer just read code structure. They can actually see your interface, take screenshots, and navigate visually just like humans do. Anthropic's Claude Computer Use, Microsoft's Copilot Vision, and Google's Project Mariner all demonstrate this new reality. Yet most enterprise interfaces still create massive roadblocks for these visually capable agents.

AI agents now use a hybrid approach that combines computer vision with traditional automation methods. They can see visual layouts and understand design patterns, but they also need structured data access and predictable interaction patterns. This creates a more complex design challenge than purely visual or purely code-based approaches.
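To make that hybrid loop concrete, here is a minimal sketch of how an agent might observe a page through both channels at once. It uses Playwright, and the function name and overall shape are my illustration rather than how any particular agent framework works; note that Playwright's accessibility snapshot API is deprecated in recent versions but still serves to show the idea.

```ts
import { chromium } from "playwright";

// Illustrative only: capture both an image for the vision model and the
// accessibility tree for structured access, as the hybrid approach implies.
async function observePage(url: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url);

  const screenshot = await page.screenshot();          // visual channel
  const axTree = await page.accessibility.snapshot();  // structured channel

  await browser.close();
  return { screenshot, axTree };
}
```

An agent that can cross-check what it sees against what the accessibility tree reports is far harder to confuse than one relying on either channel alone.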

While agents can now recognize a submit button visually, they still struggle with inconsistent labeling, ambiguous states, and unpredictable workflows. What has changed is how design failures manifest. Instead of being completely blind to visual cues, agents might see the button but misunderstand its context or purpose within a complex workflow.
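A hypothetical before-and-after makes the failure mode visible (a TSX sketch; the markup and handler are invented for illustration):

```tsx
import React from "react";

const save = () => { /* hypothetical handler: persist the draft */ };

// Ambiguous: a vision model sees only a floppy-disk icon, and a structural
// agent sees an unnamed, role-less <div>. Neither can be sure what it does.
export const AmbiguousSave = () => (
  <div className="btn" onClick={save}>💾</div>
);

// Clear: a native <button> whose visible text doubles as its accessible
// name, so visual and programmatic readings point to the same action.
export const ClearSave = () => (
  <button type="button" onClick={save}>Save draft</button>
);
```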

Interface automation agents dominated commercial AI deployments in 2024. Companies like Replit, Asana, and Canva are already using these capabilities for multi-step tasks that require dozens or hundreds of actions.

But success rates on complex workflows still hover around 14% for AI agents, compared to 78% for humans, revealing massive opportunities for improvement.

This creates direct business implications that most organizations haven't recognized yet. Integration costs drop significantly when interfaces support both visual and programmatic automation. Operational efficiency improves when routine tasks can be automated reliably across different interaction modalities. Partnership opportunities expand when other organizations can integrate through multiple pathways.

The strategic opportunity for design leaders has actually expanded. Most organizations understand that agents can "see" interfaces now, but they don't understand that visual capability alone doesn't solve automation challenges. Agents need interfaces that work excellently across visual recognition, structured data access, and contextual understanding simultaneously.

This positions design leaders who understand the full spectrum of agent capabilities ahead of those focusing only on visual accessibility or only on technical structure. When you can articulate how interface design decisions affect automation success rates across different agent interaction methods, you're speaking business strategy with technical depth.

The accessibility connection has become even stronger. Agents that work through visual screenshots benefit from clear visual hierarchy and consistent design patterns. Agents that need structured data access require semantic markup and logical information architecture. Agents that use natural language processing need clear, contextual labeling. All three approaches benefit from the same foundational design principles.
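As a sketch of those shared principles in practice (a hypothetical TSX component, not anything from the products mentioned above), one well-built control can serve all three access methods at once:

```tsx
import React from "react";

// One control, three agent access paths:
// - Visible, state-aware text        -> computer-vision agents
// - Native <button> role + aria-busy -> structured/accessibility-tree agents
// - A label that names the action in plain language -> NLP-driven agents
export function SubmitOrderButton({ pending }: { pending: boolean }) {
  return (
    <button type="submit" aria-busy={pending} disabled={pending}>
      {pending ? "Submitting order…" : "Submit order"}
    </button>
  );
}
```

The same markup that helps a screen reader announce the button's state is what lets a structured agent read it reliably.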

Building organizational awareness now requires understanding that agents are sophisticated users with multiple ways of perceiving interfaces, not simple automation scripts. Product teams need agent scenarios that account for visual, structural, and contextual interaction patterns. Engineering teams need to understand that design choices affect success rates across different automation approaches.

The biggest challenge isn't technical implementation anymore. It's designing interfaces that leverage the full capabilities of modern agents while remaining excellent for humans.

This means creating visual hierarchies that computer vision can parse accurately, information architectures that support multiple access methods, and interaction patterns that work reliably across different agent capabilities.

Current agents can handle complex, multi-step workflows when interfaces support their hybrid interaction approach, and their performance is improving in step-function jumps as capabilities evolve. The organizations that understand how to design for this new generation of agents will have significant advantages in an increasingly automated business ecosystem.

Start by auditing your key workflows through a modern agent lens. Ask yourself: could current AI agents with visual, structural, and contextual understanding reliably complete these tasks? The gap between current agent capabilities and interface design is where the biggest competitive opportunities exist.
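One way to start that audit is mechanical. The sketch below is my own illustration using Playwright; a real audit would also resolve <label> associations and use a dedicated checker like axe-core. It flags interactive elements that expose no accessible name, which are invisible to structured access and ambiguous to visual agents:

```ts
import { chromium } from "playwright";

// Rough heuristic: checks aria-label, aria-labelledby, and visible text,
// but not every legitimate way an element can get its accessible name.
async function auditAccessibleNames(url: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url);

  const unnamed = await page.$$eval(
    "button, a[href], input, select, [role=button]",
    (els) =>
      els
        .filter(
          (el) =>
            !el.getAttribute("aria-label") &&
            !el.getAttribute("aria-labelledby") &&
            !(el.textContent ?? "").trim()
        )
        .map((el) => el.outerHTML.slice(0, 120))
  );

  console.log(`${unnamed.length} interactive elements lack an accessible name`);
  unnamed.forEach((html) => console.log("  " + html));

  await browser.close();
}

auditAccessibleNames("https://example.com"); // swap in one of your key workflows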

What have you noticed about how modern AI agents interact with your products? Are you designing for their full range of capabilities or just one interaction method?

Comment from Sana Kachwalla (Senior Product Experience Designer at Dynatrace):

AI agents are the new “users with disabilities”. The same principles (clear hierarchy, labeling, predictability) that help screen readers and human users with impairments now also help automation agents.
