Houston: We have a Path-to-Production Problem in AI
Reading time: ~12-14 minutes
Fifteen years. 500+ global organizations. Countless AI transformations. As an architect who has built and shipped numerous AI products, I've witnessed a frustratingly consistent pattern, the "AI Lab-to-Production Problem": brilliant proofs of concept followed by... nothing.
The challenges are systemic: no data, bad data, missing ground truth, infrastructure bottlenecks, regulatory roadblocks, model drift, poor UX, adoption challenges, trust deficits, no Day 2+ operations setup, misaligned expectations; the list is long. These aren't technical hiccups; they're organizational and strategic failures. Let me lay out the six key themes hindering AI deployments and offer actionable solutions.
TL;DR
In the rapidly evolving world of artificial intelligence (AI), enterprises will face six pivotal challenges forming a hierarchy of needs for successful AI rollout and implementation:
Data Foundations: Establishing high-quality, labeled, and governance-ready data ecosystems
Technology Infrastructure and Scalability: Overcoming infrastructure and performance bottlenecks central to AI deployment (training and inference time)
Operational Monitoring and Maintenance: Preventing silent system failures
Governance and Trust Oversight: Ensuring responsible, transparent AI deployment
User Experience and Workflow Alignment: Embedding AI seamlessly into workflows with an AI-native design system
Strategic Alignment and Leadership: Driving AI initiatives with clear leadership and vision
Key Takeaway: Successful AI implementation, including production rollout, isn't solely about technology; it requires a holistic, strategic approach spanning technical, organizational, and human factors. Each level of this "AI needs hierarchy" must be addressed.
Theme 1: Data Foundations
No oil in the well
"Data is the new oil." We've all heard it. But what happens when the well runs dry, or worse, is full of sludge? AI systems rely on high-quality, labeled data to train effectively and deliver reliable results. Yet, many enterprises operate in data deserts: datasets are unharvested, outdated, poorly labeled or tagged, siloed, or simply neglected. This is compounded by a lack of ground truth alignment—a shared understanding of "correct"—which undermines both training and evaluation.
A telecom provider's outdated churn data ignored seasonal trends, leading to unreliable forecasts. A healthcare client underestimated the need for data labeling, expecting the AI platform to intuitively grasp customer intent and provide intents out-of-the-box from raw interaction logs; the resulting conversational AI system frequently missed the mark and frustrated users. Another client's conversational AI hallucinated responses due to inadequate training data, driving up support tickets and customer frustration. At yet another organization, oversharing data and content created legal and reputational risks, not to mention inadvertently revealing patterns that compromised privacy and IP.
Actionable Strategies
Implement dynamic data enrichment and labeling pipelines
Establish ground truth governance
Create continuously curated data corpora and keep them current
Implement modern data labeling techniques, including feedback harvesting (a minimal sketch follows this list)
Implement guardrails and comprehensive frameworks for data security, privacy, data ownership, and IP management
Incorporate a DataOps discipline as part of a broader MLOps strategy
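To make feedback harvesting concrete, here is a minimal Python sketch. The schema and names (FeedbackRecord, harvest_labels, a thumbs-up/down signal) are hypothetical illustrations rather than any specific product's API; the point is simply that explicit user reactions to AI responses can be converted into labeled rows for a continuously curated corpus.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class FeedbackRecord:
    """One user reaction to an AI response (hypothetical schema)."""
    query: str
    response: str
    thumbs_up: bool       # explicit user signal captured in the UI
    timestamp: datetime

def harvest_labels(records, min_date=None):
    """Convert raw feedback into labeled training rows.

    Thumbs-up pairs become positive examples; thumbs-down pairs are
    kept as negatives so evaluators can learn from both. Records older
    than min_date are dropped to keep the corpus current.
    """
    rows = []
    for r in records:
        if min_date and r.timestamp < min_date:
            continue  # stale feedback: exclude from the curated corpus
        rows.append({
            "input": r.query,
            "output": r.response,
            "label": "good" if r.thumbs_up else "bad",
        })
    return rows

# Toy usage
records = [
    FeedbackRecord("reset my router", "Hold the reset button for 10 seconds.", True, datetime(2025, 1, 5)),
    FeedbackRecord("cancel my plan", "Try turning it off and on again.", False, datetime(2025, 1, 6)),
]
print(harvest_labels(records))
```

In practice the same loop would feed a labeling queue for human review rather than writing labels directly, but the harvesting step itself really is this simple.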
Future State: With these strategies implemented, a continuously updated, robustly enriched, and contextually self-verified data corpus ensures accurate, up-to-date responses for evolving business needs. Churn prediction models incorporate all relevant causal features, providing actionable insights. Legal and reputational risks from data oversharing are mitigated.
Theme 2: Technology Infrastructure and Scalability
Infrastructure That Can’t Keep Up
Even brilliant AI systems can falter when deployed at scale. LLM and RAG-based knowledge assistants, with their large context windows, demand significant computational resources to manage throughput, minimize latency, handle real-time requests, and deliver relevant outputs. Classic AI models aren't immune; they also face scaling challenges—both during training and inference. Multi-agent AI systems, where autonomous agents collaborate, require scalable infrastructure for seamless performance.
An e-commerce recommendation engine crashed under real-world traffic. Another client's generative AI knowledge assistant faced 45-second response times, leading to user abandonment. An AI-based contract analysis tool slowed dramatically when processing large document batches.
Actionable Strategies
Implement smarter token management, including caching and compression (a minimal caching sketch follows this list)
Utilize co-located cloud resources
Develop load balancing techniques and provisioned throughput
Explore LLM quantization (lowering numerical precision reduces computational overhead, enabling faster inference with minimal loss of model accuracy)
Use alternative inferencing techniques
Adopt LLMOps frameworks to optimize LLM performance
Include adversarial testing during development to identify vulnerabilities
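As a concrete example of smarter token management, here is a minimal exact-match response cache in Python. Everything here (the ResponseCache class, the fake LLM call) is an illustrative assumption, not a specific vendor feature; the idea is that repeated queries should never pay for inference twice.

```python
import hashlib

class ResponseCache:
    """Exact-match LLM response cache (illustrative sketch).

    Normalizing and hashing the prompt lets repeated queries skip
    the model entirely, cutting both latency and token spend.
    """
    def __init__(self):
        self._store = {}

    def _key(self, prompt: str) -> str:
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get_or_compute(self, prompt: str, llm_call):
        key = self._key(prompt)
        if key not in self._store:
            self._store[key] = llm_call(prompt)  # cache miss: pay for inference once
        return self._store[key]

# Example: the second, reworded-whitespace query never reaches the model
cache = ResponseCache()
fake_llm = lambda p: f"answer to: {p}"
print(cache.get_or_compute("What is our refund policy?", fake_llm))
print(cache.get_or_compute("what is our  refund policy?", fake_llm))  # cache hit
```

A production variant would add TTL-based expiry and, typically, embedding-based semantic matching so paraphrased queries also hit the cache.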
Future State: With a scalable infrastructure, AI systems deliver instant recommendations, streamline knowledge queries, and enable multi-agent collaboration, driving user satisfaction and better decisions.
NOTE: I have talked about the LLM Performance Triad here: Is Your AI Slowing Down? Here's How to Optimize Latency, Throughput, and Rate in LLMs | LinkedIn
Theme 3: Operational Monitoring and Maintenance
The Silently Failing Systems
The question isn't just "Do you know what your AI is doing at 1 AM?" anymore; it's "Can you predict what your AI will do at 1 AM?" Once in production, AI systems are prone to silent failures: bias creeping into outputs, models making flawed decisions, performance degrading, or retrieval relevance deteriorating. Without real-time monitoring or well-defined (and relevant) performance benchmarks, these issues often go unnoticed until it's too late, eroding trust and diminishing the value of AI investments. Many enterprises neglect to establish critical performance metrics like accuracy, precision, recall, F1 scores, or truthfulness, leaving their systems vulnerable to drift. As models age, outputs increasingly deviate from ground truth, exacerbating the risk of failure and reducing confidence in AI-driven decisions.
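Establishing those benchmarks can start very simply: compute baseline metrics on a held-out, ground-truth-labeled set at deployment time, persist them, and alert when live metrics deviate. A minimal sketch using scikit-learn, where the toy labels and the alerting threshold mentioned in the comment are illustrative assumptions:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy ground-truth labels vs. model predictions on a validation set
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
baseline = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision,
    "recall": recall,
    "f1": f1,
}
print(baseline)
# Persist these numbers at deployment; a monitor that recomputes them on
# fresh labeled samples and alerts on, say, a 5% drop catches silent decay.
```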
A predictive maintenance model at an auto upholstery manufacturer failed to detect equipment issues, resulting in an expensive production halt. A telecom network fault prediction system missed critical traffic pattern changes. An NLP model for customer support ticket routing began misclassifying tickets after a product update, increasing customer wait times due to a lack of automated retraining.
Actionable Strategies
Establish baseline performance evaluation metrics
Implement continuous monitoring, including alerting rules with thresholds
Build an MLOps pipeline for model and data drift detection (a minimal drift check follows this list)
Create automated, governed model retraining processes that leverage harvested feedback data
Implement robust observability, including logging and traceability
Conduct proactive red teaming to identify vulnerabilities and validate resilience
Adopt KPMG AI System Cards: continuously updated scorecards for AI systems, including performance metrics, drift alerts, and retraining schedules
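For the drift check referenced above, a two-sample Kolmogorov-Smirnov test per feature is one of the simplest workable approaches. The sketch below uses SciPy; the synthetic data and the alpha threshold are illustrative assumptions, and a real pipeline would run this per feature on a schedule and route failures to the alerting rules described earlier.

```python
import numpy as np
from scipy.stats import ks_2samp

def check_feature_drift(train_values, live_values, alpha=0.01):
    """Flag distribution drift with a two-sample KS test (sketch).

    Compares a feature's training-time distribution against a recent
    production window; a small p-value suggests the live data no
    longer looks like what the model was trained on.
    """
    stat, p_value = ks_2samp(train_values, live_values)
    return {"statistic": stat, "p_value": p_value, "drifted": p_value < alpha}

rng = np.random.default_rng(42)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training-time feature values
live = rng.normal(loc=0.6, scale=1.0, size=1_000)   # shifted production values
print(check_feature_drift(train, live))  # expect drifted=True
```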
Future State: With robust monitoring and maintenance in place, AI systems maintain consistent performance and deliver ongoing value. Unexpected equipment downtime is minimized. Network outages are predicted and prevented. Customer support tickets are routed efficiently, improving customer satisfaction and reducing operational costs.
Theme 4: Governance and Trust Oversight
Trust Isn’t Optional
AI systems can amplify bias. Generative AI systems can generate inaccurate information (hallucinations) or violate privacy regulations. These risks are amplified with agentic AI. Contributing factors include a lack of transparency or explainability, regulatory non-compliance, inadequate auditing mechanisms, a lack of observability and monitoring, and the absence of AI system evaluations. Ensuring responsible AI—and mitigating these risks—requires robust governance, continuous audits, and ethical safeguards. This is crucial for building trust.
I have witnessed financial models inadvertently discriminating against minority applicants, algorithmic trading systems running without proper safeguards and "kill switches", and generative AI producing culturally insensitive content in marketing collateral.
These examples highlight the need for proactive governance across various areas: bias, data governance, IP, agent accountability, human-AI collaboration, value alignment, auditing, security, environmental impact, and stakeholder engagement.
Actionable Strategies
Implement comprehensive fairness evaluations (a minimal selection-rate sketch follows this list)
Incorporate explainability and interpretability techniques in every AI system
Create robust observability and logging
Establish human-in-the-loop and human-on-the-loop governance
Review and adopt KPMG's Trusted AI Framework
Adopt KPMG AI System Cards, which provide a structured framework to document and evaluate AI systems, ensuring transparency, accountability, and continuous improvement
Implement governance for agents
Build RLHF pipelines to continuously refine generative outputs with structured human feedback
Include adversarial testing as part of regular audits
Set up teams to perform threat modeling, including purple teaming
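To show what a basic fairness evaluation can look like, here is a minimal selection-rate check in plain Python. The demographic parity gap is only one of many fairness metrics, and the toy predictions and group labels are assumptions for illustration; a large gap is a signal to investigate, not a verdict on its own.

```python
from collections import defaultdict

def selection_rates(predictions, groups):
    """Approval rate per demographic group (illustrative sketch)."""
    totals, approved = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        approved[group] += pred  # pred is 1 for "approve", 0 for "deny"
    return {g: approved[g] / totals[g] for g in totals}

def demographic_parity_gap(predictions, groups):
    """Largest difference in selection rate across groups.

    A gap near 0 means the model approves all groups at similar
    rates; a large gap warrants a deeper fairness review.
    """
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values()), rates

preds  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
gap, rates = demographic_parity_gap(preds, groups)
print(rates, f"gap={gap:.2f}")  # A: 0.60, B: 0.40, gap=0.20
```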
Future State: With these strategies implemented, underpinned by the principles of the KPMG Trusted AI Framework, AI systems operate fairly, transparently, and responsibly, building trust. Financial models make unbiased decisions and comply with relevant regulations. Trading systems have robust fail-safes to prevent catastrophic losses. Marketing content is ethically generated, enhancing brand reputation. Robust governance mitigates risks and ensures compliance.
Theme 5: User Experience and Workflow Alignment
AI Out of Sight, Out of Mind
Even the most sophisticated AI systems are useless if users don't trust or understand them. Many AI tools exist in isolation, failing to embed into daily workflows and leaving outputs disconnected from actual decision-making. This "outsider syndrome" leads to poor adoption and wasted investments. Poor UX design further compounds the problem, confusing users and eroding trust. General-purpose LLM interfaces, while useful for brainstorming, often fall short when it comes to solving domain-specific, in-context problems within existing workflows. Resistance to change and inadequate training further exacerbate these issues, creating a perfect storm for AI project failure.
Sales teams ignored lead scoring tools due to a lack of context, usability, and explainability. Finance teams abandoned AI recommendations due to clunky, siloed interfaces. An AI-driven inventory management system required users to switch between multiple applications, adding complexity rather than streamlining their work. An AI-powered report generation tool produced overly complex narratives, charts, and graphs that readers struggled to interpret.
Actionable Strategies
Design user-centric, AI-native experiences
Seamlessly integrate AI into existing workflows
Implement human-in-the-loop interventions
Deploy guardrails to align AI outputs with user workflows (a minimal validation sketch follows this list)
Develop comprehensive training programs
Implement maniacal user tracking and feedback capture
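As one concrete form of the guardrails item above, the sketch below validates an LLM's output against an expected schema before it ever reaches the user's workflow. The schema (action, confidence, rationale) is hypothetical; the pattern is what matters: machine-check AI output, and route failures to a retry or a human instead of the user.

```python
import json

# Hypothetical schema for an AI recommendation surfaced in a workflow
REQUIRED_FIELDS = {"action": str, "confidence": float, "rationale": str}

def validate_llm_output(raw: str):
    """Guardrail: only let well-formed, in-range outputs reach the user."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # caller can retry the model or fall back to a human
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            return None  # missing or mistyped field: reject
    if not 0.0 <= data["confidence"] <= 1.0:
        return None  # out-of-range confidence: reject
    return data

good = '{"action": "escalate", "confidence": 0.82, "rationale": "SLA breach risk"}'
bad  = 'Sure! I think you should probably escalate this one.'
print(validate_llm_output(good))  # parsed dict, safe to surface in the UI
print(validate_llm_output(bad))   # None -> retry or human review
```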
Future State: When AI is seamlessly integrated into user workflows with intuitive UX, adoption soars. Sales teams understand and trust AI-generated lead scores, leading to higher conversion rates. Finance teams confidently use AI-driven recommendations to optimize the order-to-cash process. Inventory management becomes more efficient, reducing costs and stockouts. Reports are easily understood, empowering data-driven decision-making.
NOTE: I have talked about UX for AI here: Revolutionizing User Experience: How AI is Crafting a Seamless Design Layer Across Platforms | LinkedIn
Theme 6: Strategic Alignment and Leadership
The Vision Vacuum
This is the most frequent root cause of AI project failure: a lack of strategic alignment and leadership buy-in. Even the most technically brilliant AI initiatives will falter without a clear vision, dedicated resources, and executive sponsorship. Without these foundational elements, projects become isolated experiments, failing to deliver measurable business value and ultimately fading away. This isn't a technical problem; it's a leadership problem.
A client invested heavily in AI without clear objectives, leading to siloed projects, duplicated spending, and no measurable impact. A flagship AI initiative failed due to lack of sustained executive sponsorship. At another healthcare client, leadership had unrealistic expectations about the speed and ease of AI implementation, not to mention the immediate payback they expected from a paltry MVP rollout, setting the stage for disappointment and disillusionment when initial results didn't meet their inflated projections.
Actionable Strategies
Appoint an AI Czar/Executive Sponsor: Send a clear top-down message
Develop a clear AI strategy and roadmap
Establish an AI Center of Excellence
Implement realistic value realization frameworks
Manage expectations and communicate transparently
Prioritize people, skills, and change management
Size and estimate AI efforts pragmatically
Future State: With visionary leadership and a well-defined AI strategy, organizations undergo a true transformation, becoming data-driven and AI-powered. Siloed projects are replaced by cohesive, enterprise-wide initiatives that deliver synergistic value. AI is seamlessly integrated into business processes, empowering employees, optimizing operations, and creating new opportunities for growth. A strong AI culture fosters innovation, experimentation, and continuous improvement.
Conclusion: Building Trusted, Production-Ready AI: Mission Accomplished
Building production-ready AI is complex, but the rewards are transformative. We began by acknowledging the "Houston, we have a problem" reality of AI deployments: the common pitfalls that hinder progress. By addressing these six critical themes, enterprises can move beyond proof-of-concept paralysis to operationalize reliable, scalable, and compliant AI. But success must be measurable beyond surface-level KPIs; organizations should track tangible outcomes like efficiency gains, cost savings, revenue growth, customer (and employee) experience scores, and innovation impact to showcase ROI that resonates across stakeholders. The time to act is now.