Unstructured data often contains the most distinct and proprietary content that remains untapped within enterprises. When operationalized through Agentic AI workflows and Large Language Models (LLMs), this data becomes actionable, driving tangible business value. Yet the challenge lies in managing the inherent complexity of unstructured data at scale.

Piethein Strengholt highlights how the Medallion Architecture, traditionally focused on structured data, can be repurposed into a unified, layered framework designed to handle unstructured data effectively. He discusses how:

✅ Bronze, Silver, and Gold layers can be extended to ingest, validate, and contextualize unstructured data.
✅ LLMs and RAG patterns can transform raw documents into reliable, AI-ready inputs.
✅ Governance and new roles (context engineers, value engineers) will be essential to translate unstructured data into business value.

Click the link below to learn more: https://lnkd.in/gwxcjgKR

#UnstructuredData #MedallionArchitecture #DataQuality #AIready #Datatrust #DataArchitecture
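The layered flow described in the post can be sketched in a few lines. This is a minimal illustration of Bronze/Silver/Gold stages applied to documents; the function and field names are my own invention, not taken from Strengholt's article:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    source: str
    raw_text: str
    meta: dict = field(default_factory=dict)

def bronze_ingest(files):
    # Bronze: land raw documents as-is, tagging only provenance.
    return [Doc(source=path, raw_text=text, meta={"layer": "bronze"})
            for path, text in files]

def silver_validate(docs):
    # Silver: drop empty/corrupt documents and normalize whitespace.
    out = []
    for d in docs:
        text = " ".join(d.raw_text.split())
        if text:
            out.append(Doc(d.source, text, {**d.meta, "layer": "silver"}))
    return out

def gold_contextualize(docs, chunk_size=200):
    # Gold: chunk validated text into LLM/RAG-ready passages with context metadata.
    chunks = []
    for d in docs:
        words = d.raw_text.split()
        for i in range(0, len(words), chunk_size):
            chunks.append({"source": d.source,
                           "chunk": " ".join(words[i:i + chunk_size]),
                           "layer": "gold"})
    return chunks
```

Each stage only adds trust and context; nothing is lost from the raw landing zone, which mirrors how the medallion pattern works for structured tables.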
I came across this article from Piethein Strengholt a few weeks ago, and have since brought it up in multiple conversations. It's such a simple yet powerful way to visualize the deep architectural underpinnings required for Agentic AI adoption. Discussions on this topic often get lost in abstractions, but this framing makes it accessible while still capturing the complexity behind it. Highly recommend giving it a read if you're exploring how to design data and system foundations for the Agentic AI era. #dataarchitecture #openarchitecture #lakehouse #dataobservability #dataquality #agenticworkflow
Data Engineering for AI-Native Architectures: Designing Scalable, Cost-Optimized Data Pipelines to Power GenAI, Agentic AI, and Real-Time Insights
https://ift.tt/WK3t5VB

Editor's Note: The following is an article written for and published in DZone's 2025 Trend Report, Data Engineering: Scaling Intelligence With the Modern Data Stack.

The data engineering landscape has undergone a fundamental transformation: a complete reimagining of how data flows through organizations. Traditional business intelligence (BI) pipelines were built for looking backward, answering questions like "How did we perform last quarter?" Today's AI-native architectures demand systems that can feed real-time insights to recommendation engines, provide context to large language models, and maintain the massive vector stores that power retrieval-augmented generation (RAG). https://ift.tt/Fdq0kRx Abhishek Gupta
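The vector stores that power RAG can be illustrated with a toy retriever. This sketch uses bag-of-words counts as a stand-in embedding and brute-force cosine similarity; real pipelines use learned embeddings and approximate-nearest-neighbor indexes, so treat this only as a shape of the interface:

```python
import math
from collections import Counter

def embed(text):
    # Stand-in "embedding": bag-of-words counts instead of a learned vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.items = []

    def add(self, doc_id, text):
        self.items.append((doc_id, embed(text), text))

    def top_k(self, query, k=3):
        # Brute-force scan; ANN indexes replace this at scale.
        q = embed(query)
        scored = [(cosine(q, vec), doc_id, text)
                  for doc_id, vec, text in self.items]
        return sorted(scored, reverse=True)[:k]
```

A RAG pipeline would then feed the top-k chunks into the model's context window alongside the user question.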
Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First

"Large Language Model (LLM) agents, acting on their users' behalf to manipulate and analyze data, are likely to become the dominant workload for data systems in the future… We argue that data systems need to adapt to more natively support agentic workloads. We take advantage of the characteristics of agentic speculation that we identify (scale, heterogeneity, redundancy, and steerability) to outline a number of new research opportunities for a new agent-first data systems architecture, ranging from new query interfaces, to new query processing techniques, to new agentic memory stores…"

https://lnkd.in/dwfME4yq
🚨 Data architecture has a problem.

Data Lakes → became dump yards.
Warehouses → rigid and expensive.
Data Mesh → great in theory, but demands cultural change few orgs are ready for.
Data Fabric → automates a lot, but still metadata-heavy.

👉 The result? Costly, fragmented, hard-to-govern ecosystems.

🔥 The next evolution: AI-Native Data Architecture
1️⃣ Hybrid Lakehouse → unified foundation.
2️⃣ AI-Driven Fabric → auto-discovery, lineage, quality, policy enforcement.
3️⃣ Cognitive Layer → AI agents that build data products, heal pipelines, and optimize continuously.
4️⃣ Conversational Access → natural language to SQL, semantic search, governance through chat.

💡 The shift: from manual wrangling to AI-assisted, self-optimizing platforms. Not about replacing humans; it's about freeing them to create value instead of firefighting.

🚀 The journey: Data Lake → Lakehouse → AI Fabric → Cognitive Platform.

Do you see AI as the missing piece to fix data architecture, or just another layer of complexity?

#DataArchitecture #Lakehouse #DataFabric #DataMesh #DataGovernance #AI #AIDriven #CognitiveAI #FutureOfData #DigitalTransformation
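As a rough sketch of what "Conversational Access" means at the interface level, here is a toy natural-language-to-SQL router. A real cognitive layer would use an LLM grounded in a semantic layer; the patterns, table, and column names here (sales, region, amount) are invented purely for illustration:

```python
import re

# Toy NL-to-SQL router: each pattern maps a question shape to a SQL template.
# Production systems replace this with an LLM constrained by a semantic layer.
PATTERNS = [
    (re.compile(r"total (\w+) by (\w+)", re.I),
     lambda m: f"SELECT {m.group(2)}, SUM({m.group(1)}) "
               f"FROM sales GROUP BY {m.group(2)};"),
    (re.compile(r"count of (\w+)", re.I),
     lambda m: f"SELECT COUNT(*) FROM {m.group(1)};"),
]

def nl_to_sql(question):
    # Return the first matching template's SQL, or None if nothing matches.
    for pattern, template in PATTERNS:
        m = pattern.search(question)
        if m:
            return template(m)
    return None
```

The interesting design question is the same one the post raises: the router is only as trustworthy as the governed vocabulary (tables, metrics, policies) it is allowed to emit.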
Harness the power of semantic entity resolution to transform your knowledge graph projects. Russell Jurney's new article details how language models can automate complex data processes, driving efficiency and innovation in data management.
It was an honor and a fantastic opportunity to host my friend Francesco Puppini for an online session on his Unified Star Schema (USS) approach. His perspective was a powerful reminder that fundamentals are crucial to developing advanced data processing capabilities. Here are my top takeaways:

Data Modeling is King 👑: Great data modeling remains the absolute bedrock of successful self-service BI. Without a solid, intuitive structure, even the most powerful tools will fall short.

Curbing Data Proliferation: The USS approach he invented with Bill Inmon presents a brilliant strategy for reducing data proliferation. By creating a unified, non-redundant layer, we can finally tame the chaos of countless data marts and tables.

Unlocking Agentic AI 🤖: USS is a key enabler for Agentic AI because it drastically reduces the dependency on complex, multi-join SQL to aggregate data. A simpler semantic layer means AI agents can more easily understand and query data, leading to faster, more reliable insights.

Harmony with Kimball: The Unified Star Schema isn't here to replace everything we know. It can coexist perfectly with traditional Kimball dimensional modeling, allowing for a flexible, hybrid approach to data architecture.

In the age of AI, is data modeling still relevant? After an insightful presentation, the answer is a resounding YES, now more than ever! 🧠

A huge thank you to Francesco for sharing his invaluable knowledge! The path to a truly data-driven, AI-enabled organization is paved with smart, simple, and scalable data models.

#DataModeling #UnifiedStarSchema #SelfServiceBI #BI #DataAnalytics #AI #AgenticAI #DataStrategy #DataArchitecture
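To make the "fewer multi-join queries" point concrete, here is a minimal sketch of the bridge idea as I read it (not Puppini and Inmon's reference design): fact rows are unioned into one bridge table tagged with a stage column, so downstream queries and AI agents hit a single table. Table and column names are invented for the example:

```python
import sqlite3

# In-memory demo: two fact tables collapsed into one USS-style bridge.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales   (customer_id INTEGER, amount REAL);
CREATE TABLE refunds (customer_id INTEGER, amount REAL);
INSERT INTO sales   VALUES (1, 100.0), (1, 50.0), (2, 80.0);
INSERT INTO refunds VALUES (1, 20.0);

-- The bridge: one row per fact row, tagged with its stage of origin.
CREATE TABLE bridge AS
  SELECT 'sales'   AS stage, customer_id, amount FROM sales
  UNION ALL
  SELECT 'refunds' AS stage, customer_id, amount FROM refunds;
""")

# Net revenue per customer from the bridge alone: no multi-fact joins needed.
rows = con.execute("""
  SELECT customer_id,
         SUM(CASE WHEN stage = 'sales' THEN amount ELSE -amount END) AS net
  FROM bridge
  GROUP BY customer_id
  ORDER BY customer_id
""").fetchall()
```

An AI agent querying this layer only needs to understand one table and one `stage` vocabulary instead of the join paths between every fact, which is exactly the simplification the post credits USS with.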
The Unified Star Schema belongs to everyone. Period. #unifiedstarschema #theunifiedstarschema #selfservice #LLMs #AI #datamesh #rethinkdata
Passive metadata is so passé. The future? Active metadata — the real-time, event-driven brain behind trustworthy AI and seamless data governance. Imagine your data stack not as a dusty archive, but a living, breathing organism that senses, adapts, and orchestrates every byte at the speed of business. With 65% of companies now running generative AI and regulations tightening every quarter, ignoring active metadata isn’t just risky—it’s reckless. For CEOs and data leaders: Stop documenting history and start architecting the future. In the AI era, metadata is your new secret weapon. https://lnkd.in/gUQx6xav
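A toy sketch of what "active" rather than "passive" metadata could look like: a catalog that reacts to pipeline write events as they happen instead of being documented after the fact. The event shape and the example policy are assumptions for illustration, not any specific product's API:

```python
import time

class ActiveCatalog:
    """Event-driven catalog sketch: lineage and freshness are updated on every
    write event, and subscribers can enforce policy in real time."""

    def __init__(self):
        self.entries = {}
        self.subscribers = []

    def subscribe(self, fn):
        # Register a policy/alerting callback invoked on every event.
        self.subscribers.append(fn)

    def emit(self, event):
        # Update lineage and freshness in place, then fan out to subscribers.
        entry = self.entries.setdefault(event["dataset"], {"lineage": set()})
        entry["last_written"] = event["ts"]
        entry["lineage"].update(event.get("inputs", []))
        for fn in self.subscribers:
            fn(event, entry)

alerts = []
catalog = ActiveCatalog()
# Example policy: flag any dataset written without recorded input lineage.
catalog.subscribe(lambda ev, entry: alerts.append(ev["dataset"])
                  if not entry["lineage"] else None)
catalog.emit({"dataset": "gold.orders", "ts": time.time(),
              "inputs": ["silver.orders"]})
catalog.emit({"dataset": "adhoc.export", "ts": time.time()})
```

The governance angle is the interesting part: because the check runs at event time, an ungoverned dataset is flagged the moment it appears, not at the next quarterly audit.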
🎯 Extract Business Context from LLM-Based Semantic Metadata Analysis

If you're still crafting SQL to understand field meanings, you're not alone. Many data engineers continue to spend excessive time:
→ Scanning schemas
→ Manually defining semantic models
→ Coding quality checks field by field

That was static metadata. With agentic AI, things transform:
➡️ Schemas are identified automatically
➡️ Fields are categorized with business context
➡️ Initial rules (nulls, ranges, integrity) are applied immediately
➡️ Coverage updates dynamically in your business notebook

It's more than a map. It's an intelligent, evolving context layer.

❇️ And here's why it matters: 42% of enterprises extract data from over eight sources for AI workflows. Such complexity disrupts static metadata models. To construct reliable AI, you need metadata that acts: semantic context that evolves over time.

#AgenticAI #DataManagement #DataQuality #DataObservability #AIReadyData #semanticmetadata
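The "initial rules applied immediately" step can be approximated even without an LLM: profile a sample and emit starter null-rate and range rules. This is a simplified stand-in for the agentic profiling the post describes; the field names and record shape are invented:

```python
def infer_rules(records):
    """Infer starter data-quality rules from sample records (list of dicts):
    an observed null rate per field and, for numeric fields, a min/max range."""
    fields = {k for r in records for k in r}
    rules = {}
    for f in fields:
        values = [r.get(f) for r in records]
        present = [v for v in values if v is not None]
        rule = {"max_null_rate": 1 - len(present) / len(values)}
        if present and all(isinstance(v, (int, float)) for v in present):
            rule["range"] = (min(present), max(present))
        rules[f] = rule
    return rules

def check(record, rules):
    # Flag numeric values that fall outside the observed range.
    violations = []
    for f, rule in rules.items():
        v = record.get(f)
        if v is not None and "range" in rule:
            lo, hi = rule["range"]
            if not (lo <= v <= hi):
                violations.append(f)
    return violations
```

An agentic layer would go further, using business context to turn the observed range into a sensible policy rather than a literal min/max, but the profile-then-enforce loop is the same.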
Don't be fooled by the name; the Boring Semantic Layer by Julien Hurault and Xorq Labs is one of the most exciting projects to watch! I've been toying with the idea of implementing a semantic layer on top of Kedro's Data Catalog.

When I led data engineering teams at QuantumBlack, AI by McKinsey, one of the first activities we performed at any client was connecting and exploring their data. However, ad-hoc analyses didn't directly leverage the data models we'd painstakingly built. The Boring Semantic Layer provides built-in querying and charting capabilities on top of those data models and even exposes an MCP server for AI-driven answers!

Everything is still very much a work in progress (I'm building on top of a bleeding-edge branch of Boring Semantic Layer without asking 🤫). One of the things I still don't know is whether people actually want to access (or even augment) their semantic layer from within their data pipelines or if it's mostly for BI/reporting. To what extent does a semantic layer replace feature engineering?