From Balloon Help to AI Agents: How GPT Is Rewriting the Rules of Software Interaction
Remember the old "Help" menu on your Mac? Back in 1991, Apple's System 7 introduced Balloon Help, a simple yet revolutionary feature that let you hover over any button or menu and instantly get a bite-sized explanation. It was the first taste of contextual assistance, a guiding hand embedded right inside the software. Fast forward to today, and we're witnessing a leap that makes Balloon Help look quaint: AI agents powered by large language models (LLMs) are not just guiding us. They're about to run our apps, automate our workflows, and transform the very way we interact with technology.
The New Era: GPT in Photoshop and Premiere Pro
The future isn't just coming; it's nearly here. Adobe has demonstrated generative AI features for both Photoshop and Premiere Pro, powered by models like GPT and Firefly. In Photoshop, you can describe the image you want and watch as the AI creates and edits documents for you, all through natural conversation. Over in Premiere Pro, editors are using generative tools to instantly extend video clips, remove or add objects, and even translate captions into multiple languages, all with a few prompts. These aren't just incremental upgrades; they're a paradigm shift in creative work.
Beyond Plugins: LLMs as Universal App Agents
What's truly groundbreaking isn't just smarter features inside individual apps. It's that LLMs are poised to become universal agents, capable of interacting with any software—legacy or cloud, desktop or web—by controlling the screen, keyboard, and mouse just like a human user. Imagine asking your AI to "summarize this PDF," "create a pivot table in Excel," or "batch-edit these images," and watching as it navigates menus, clicks buttons, and types commands across multiple apps, all without custom scripting.
Why is this possible now?
Vision-Language Understanding: LLMs can "see" screenshots, interpret UI elements, and reason about what action to take next.
Pre-training on Manuals & Forums: They've absorbed decades of user guides, support threads, and interface patterns, so they know not just what buttons do, but how people actually use software.
APIs for Screen Control: Platforms like Anthropic's Claude (with its computer-use capability) and Microsoft Copilot Studio now let AI agents perform mouse clicks, text entry, and window navigation across applications.
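The three ingredients above combine into a simple perceive-decide-act loop. The sketch below is purely illustrative: `capture_screen` and `vision_model` are hypothetical stand-ins for a real screenshot API and a vision-language model, stubbed here so the control flow itself is runnable.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # "click", "type", or "done"
    target: str = ""     # UI element description, e.g. "File menu"
    text: str = ""       # text to enter for "type" actions

def capture_screen() -> str:
    """Stand-in for a real screenshot call (a platform API in practice)."""
    return "<screenshot bytes>"

def vision_model(screenshot: str, goal: str, history: list) -> Action:
    """Stand-in for a vision-language model that looks at the screen
    and proposes the next UI action toward the goal (scripted here)."""
    script = [
        Action("click", target="File menu"),
        Action("click", target="Export As..."),
        Action("type", target="Filename field", text="report.pdf"),
        Action("done"),
    ]
    return script[len(history)]

def run_agent(goal: str, max_steps: int = 10) -> list:
    """Perceive-decide-act loop: screenshot, ask the model, execute."""
    history = []
    for _ in range(max_steps):
        action = vision_model(capture_screen(), goal, history)
        if action.kind == "done":
            break
        history.append(action)   # a real agent would click/type here
    return history

steps = run_agent("export the document as PDF")
```

The key design point is that the model, not a hand-written script, decides each step from what it sees, which is what lets the same loop drive any application.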
The End of Apps? Rethinking the Operating System
This shift is so profound that it may spell the end of the traditional "app" as we know it. Instead of launching separate programs, you'll interact with a single AI-powered OS that deploys agents to handle tasks on your behalf. Need to book travel, analyze data, or design a presentation? Just ask: the AI will orchestrate the workflow, seamlessly hopping between tools and services as needed.
Multimodal Interaction and Accessibility
While text and vision capabilities are impressive, the real breakthrough may be in multimodal interfaces that combine voice, text, gestures, and visual recognition. These AI agents aren't just powerful; they make complex software approachable. Users can execute intricate actions through plain conversation, and those who struggle with conventional interfaces can lean on AI-generated descriptions and assistance. This democratization of software access means complex creative and productivity tools are becoming available to a much wider audience, potentially unleashing talent that was previously held back by convoluted and unfriendly interface designs.
Personalization and the End of Learning Curves
Unlike static help systems or generalized tutorials, today's AI agents observe how you work. They identify your habits, remember your preferences, and adapt their assistance accordingly. For complex software like Premiere Pro, this means the days of intimidating learning curves may be numbered. Beginners get the guidance they need without wading through manuals, while power users receive increasingly sophisticated automation tailored to their unique workflows. The software effectively grows with you, becoming more valuable over time.
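A toy version of this kind of adaptation, assuming nothing about any real product: track how often the user invokes each command and surface the favorites as suggestions, so the interface gradually reshapes itself around actual habits.

```python
from collections import Counter

class PreferenceTracker:
    """Records command invocations and suggests the user's favorites."""

    def __init__(self):
        self.counts = Counter()

    def record(self, command: str):
        self.counts[command] += 1

    def suggestions(self, n: int = 3):
        # Most frequently used commands first.
        return [cmd for cmd, _ in self.counts.most_common(n)]

tracker = PreferenceTracker()
for cmd in ["crop", "export", "crop", "levels", "crop", "export"]:
    tracker.record(cmd)
```

A real agent would fold in far richer signals (sequences of actions, project context, time of day), but the principle is the same: assistance keyed to observed behavior rather than a static manual.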
What's In It for Users and Enterprises?
Seamless Onboarding: Any software can be automated or explained instantly, without custom connectors or training.
Reduced Maintenance: Agents adapt visually to UI changes, minimizing breakage when interfaces update.
Enhanced Discoverability: Users can unlock hidden features with natural language, not by digging through menus.
Unified Automation: One LLM agent can bridge legacy tools, modern web apps, and everything in between.
Cognitive Load Reduction: By handling the mechanics of software operation, AI agents free users to focus on their creative or analytical goals rather than memorizing complex command sequences or menu hierarchies.
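The "reduced maintenance" point above rests on locating UI elements by meaning rather than by fixed coordinates. A minimal sketch of the idea, using simple string similarity as a stand-in for what a vision model actually does: match the requested element against whatever labels are currently on screen, so a renamed or moved button still resolves.

```python
from difflib import SequenceMatcher

def find_element(request: str, visible_labels: list) -> str:
    """Pick the on-screen label closest to the requested element,
    so minor UI renames don't break the automation."""
    def score(label: str) -> float:
        return SequenceMatcher(None, request.lower(), label.lower()).ratio()
    return max(visible_labels, key=score)

# After a hypothetical update, "Save As..." became "Save a Copy...":
labels_v2 = ["File", "Edit", "Save a Copy...", "Print..."]
```

Here `find_element("Save As...", labels_v2)` still lands on the renamed menu item, where a script hard-coded to click at pixel (120, 45) would have silently broken.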
Enterprise Adoption: Beyond Technical Challenges
For enterprises, adopting AI agents involves more than technical integration. Organizations face significant challenges in aligning these technologies with existing workflows, ensuring compliance with industry regulations, and establishing governance frameworks. The most successful implementations will likely be those that address knowledge transfer between AI systems and human teams, create clear accountability structures, and develop metrics for measuring the true productivity impact. Forward-thinking companies are already establishing "AI orchestration" roles to manage these transformations.
Economic Impact and Workforce Evolution
As AI agents take over routine software tasks, job roles are evolving in response. We'll start seeing the emergence of "AI orchestrators"—professionals who specialize in directing these agents and optimizing their performance. This represents a shift from technical software proficiency to strategic AI direction, potentially a more creative and higher-value role. Far from replacing human workers, these agents are creating new categories of work focused on collaboration with AI systems.
The Human-AI Collaborative Model
The most effective AI agent implementations follow a collaborative model where humans and AI consistently play to each other's strengths. AI handles repetitive tasks, pattern recognition, and information retrieval, while humans provide creative direction, ethical judgment, and contextual understanding. This partnership model is proving more powerful than either humans or AI working independently, suggesting that the future belongs not to AI alone, but to those who master the art of human-AI collaboration.
The Challenges Ahead
Security: Granting AI agents screen and keyboard access raises the stakes for privacy and data protection. Strict consent and audit trails are a must.
Performance: Running vision-language models locally vs. in the cloud involves trade-offs in speed, cost, and control.
Error Recovery: Agents must recognize when actions fail and recover gracefully, keeping users in the loop.
Open Standards: For AI agents to truly become universal, open standards like the Model Context Protocol (MCP) are essential. These standards will prevent vendor lock-in and foster a diverse ecosystem of specialized agents that can work together seamlessly.
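Error recovery in particular lends itself to a simple structure: act, verify the result from a fresh observation, retry a bounded number of times, and escalate to the user instead of failing silently. A stubbed sketch (the "flaky click" below simulates an action that misses on the first try):

```python
def attempt_with_recovery(action, verify, max_retries: int = 2) -> dict:
    """Execute an action, check that it worked, retry on failure,
    and hand control back to the user rather than failing silently."""
    for attempt in range(max_retries + 1):
        action()
        if verify():            # e.g. re-screenshot and inspect the UI state
            return {"status": "ok", "attempts": attempt + 1}
    return {"status": "needs_user", "attempts": max_retries + 1}

# Simulated flaky click: the first attempt misses, the second succeeds.
state = {"clicks": 0, "dialog_open": False}

def click_export():
    state["clicks"] += 1
    if state["clicks"] >= 2:
        state["dialog_open"] = True

def export_dialog_visible() -> bool:
    return state["dialog_open"]

result = attempt_with_recovery(click_export, export_dialog_visible)
```

Verifying from a fresh observation, rather than trusting that the click landed, is what keeps the user in the loop: the agent reports either success or a concrete point where it needs help.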
Looking Forward: The Universal Agent Interface
As LLMs continue to ingest manuals, forums, and real-world workflows, they'll soon be able to fetch and apply procedural knowledge on demand, surfacing tutorials, compliance checks, and automations across any OS or app.
The bottom line: We're entering an era where software is no longer a collection of isolated tools, but a unified, agentic ecosystem—one where AI doesn't just help, but does.