Enterprise Data and Analytics Architecture Overview for Electric Utility
Prajesh Bhattacharya
enSustain
Copyright enSustain
Summary
1. The Overall Architecture
[Architecture diagram; labels recovered from the slide]
Data consumers
• Production Reports
• Projects
• Data Explorers (Engineers, Data scientists)
Data stores
• Data Warehouse (ONLY data required for high-performing production reporting), in Enterprise Nomenclature
• Data Lake (ALL data), in Enterprise Nomenclature
Connector labels (each appears twice in the diagram)
• Usual EDW process and structure
• Discovery and indexing
Data Translation Layer (MDMS/EI)
Source systems
• DMS & OMS
• Customer Data
• Smart Meter (metadata, readings)
• Asset Data (location, config)
• Financial Data
• Data Historian
• SCADA
• DG metadata
• DG generation data
• HR Data
• Weather Data
• Misc. Sensor head ends
• Security Data
• Transmission Planning Data
• Maintenance Data
• Demand Response Data
• Transmission OE and Dispatch Data
• EMS
• Transmission Market Data
• IT Asset Ops Data
• IT Support Data
• Project Documents
• Marketing & Sales Data
• Catch-All Other Applications
• Email & chat logs
• Facility Data
• Fleet Data
Possible Point-To-Point Exceptions
• Purpose-oriented connection. Examples:
o Historian facilitates connection with SCADA
o EMS-SCADA connection is latency-sensitive
• Application requiring access to only one system
o DMS applications running off of DMS data
o Historian applications running off of Historian data
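The two exception cases above can be read as a simple routing rule. The sketch below is only an illustration of that reading: the function name `connection_path`, its parameters, and the returned labels are hypothetical and do not come from any product.

```python
def connection_path(systems_needed: set, latency_sensitive: bool = False) -> str:
    """Pick a connection style for one application (hypothetical rule of thumb)."""
    # Exception 1: purpose-oriented links such as a latency-sensitive EMS-SCADA connection.
    # Exception 2: the application needs data from exactly one system.
    if latency_sensitive or len(systems_needed) == 1:
        return "point-to-point"
    # Everything else goes through the Data Translation Layer (MDMS/EI).
    return "translation-layer"


dms_app = connection_path({"DMS"})
ems_link = connection_path({"EMS", "SCADA"}, latency_sensitive=True)
cross_system_report = connection_path({"Smart Meter", "Customer Data", "Weather Data"})
print(dms_app, ems_link, cross_system_report)
```

A real decision would also weigh security, ownership, and support, none of which this sketch models.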
2. Implementation
The Approach for Implementation
Main Challenges
• Siloed data
o Solution Part 1: Standard Data Model
o Solution Part 2: Access to unified data
• Lack of analytics ideas
o Solution: Close partnership between IT and business
• Lack of budget
o Solution Part 1: Tax each new project
o Solution Part 2: Take baby steps
Necessary Condition for Success
• At the beginning, implement the new mechanism ONLY to serve the new requirements
• Keep the existing connections working and unaffected
• Eventually, the business will deem some of the existing connections unnecessary
• The rest of the existing connections can be converted as part of application maintenance/overhaul/upgrade, but not in the beginning phase of the initiative
Do NOT touch the existing, working systems first
• Do not try to implement all the necessary new components at once.
• Good quality on a small scope is better than mediocre quality on a large scope.
• It might require more overhead, but it is often worth it.
Scope the smallest possible piece and do it well
Possible Steps for Implementation
Step 1:
o A new data connectivity requirement comes in
o Identify the source system
o Define the enterprise nomenclature for the source system, aligned with the industry standard
o Load MDMS/EI with the dictionary
o Configure EI to act as the data virtualization layer for the source system
o Release for production use with an appropriate support mechanism
Milestone: One project is now using this new mechanism for one source system
Next steps:
• Repeat Step 1 for every new data connectivity request
• As more source systems are brought into scope, resolve discrepancies, if any arise
• The virtualization layer might experience performance issues as data load increases
• Research and plan the Data Lake
• For every new data source implemented for virtualization, implement the corresponding ETL for the Data Lake
• Open the Data Lake to users who prefer getting their data from the Data Lake (delayed but faster) over virtualization
• Implement Data Lake analytics (say, ML based on Spark) for a single use case
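The first pass through these steps (define the nomenclature, load the dictionary, translate on access) can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the source field names (`MTR_ID`, `KWH_DEL`, `READ_TS`), the enterprise names, and the `translate` helper are hypothetical and are not taken from any MDMS/EI product or industry standard.

```python
# Hypothetical dictionary for one source system: source field -> enterprise name.
# In practice the enterprise names would follow whichever industry standard is chosen.
SMART_METER_DICTIONARY = {
    "MTR_ID": "meter_id",
    "KWH_DEL": "energy_delivered_kwh",
    "READ_TS": "reading_timestamp",
}


def translate(record: dict, dictionary: dict) -> dict:
    """Rename source fields to enterprise names; reject fields with no mapping."""
    unmapped = sorted(set(record) - set(dictionary))
    if unmapped:
        # Failing loudly exposes nomenclature gaps that must be resolved
        # as more source systems are brought into scope.
        raise KeyError(f"No enterprise name defined for: {unmapped}")
    return {dictionary[field]: value for field, value in record.items()}


source_row = {"MTR_ID": "A-1001", "KWH_DEL": 12.5, "READ_TS": "2017-01-01T00:15:00"}
enterprise_row = translate(source_row, SMART_METER_DICTIONARY)
print(enterprise_row)
```

Keeping the dictionary as plain data, separate from the translation logic, is what would let the same mapping later drive the corresponding ETL into the Data Lake.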
3. MDMS, EI, Data Virtualization
Skip This Section
Most utilities already use these systems and are familiar with them. Hence, we will not discuss them.
4. Data Lake
Why the Data Lake?
If the MDMS/EI layer virtualizes the data, then access to standardized data across the enterprise is already established. What additional value does the Data Lake bring?
• Some of the systems of record (SOR) might not be able to handle that many data requests
• Access to some of the SOR systems might not be practical
• Implementing data quality checks on virtualized data is hard (at the least, it would slow down queries)
• Data travel over the network is larger in a virtualized environment than in a Data Lake designed and used in a specific way
• Bottom line: go for a Data Lake only if it is foreseen to be needed
Data Lake – not the immediate need, but the eventual destination
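The data-quality point can be made concrete: in a Data Lake the check can run once, at load time, rather than on every virtualized query. The sketch below assumes a hypothetical record layout and a deliberately simple rule; `is_valid`, `load_into_lake`, and the in-memory lists stand in for real rules and storage.

```python
def is_valid(reading: dict) -> bool:
    """Hypothetical quality rule: meter id present and energy value non-negative."""
    return bool(reading.get("meter_id")) and reading.get("energy_delivered_kwh", -1) >= 0


def load_into_lake(readings: list) -> tuple:
    """Run the quality check once during ETL; return (accepted, rejected)."""
    accepted = [r for r in readings if is_valid(r)]
    rejected = [r for r in readings if not is_valid(r)]
    return accepted, rejected


batch = [
    {"meter_id": "A-1001", "energy_delivered_kwh": 12.5},
    {"meter_id": "A-1002", "energy_delivered_kwh": -3.0},  # fails: negative energy
    {"meter_id": "", "energy_delivered_kwh": 4.2},         # fails: missing meter id
]
lake, quarantine = load_into_lake(batch)
print(len(lake), "accepted;", len(quarantine), "rejected")
```

Queries against `lake` then pay no per-query validation cost, which is the trade-off against the delay that ETL introduces.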
Data Lake: Market Offering Landscape
Data Lake: Feature Landscape
• Coming up …
What Does ETL Look Like for the Data Lake?
• Coming up …
5. Analytics
The Analytics Tool Landscape
Taxonomy 1: Analytics Tools
• Production
o Data write-back
o Read-only
• Project (semi-production)
o Data write-back
o Read-only
• Ad-hoc
o Data write-back
o Read-only
Taxonomy 2: Analytics Tools
• Managed (server based)
• Unmanaged (desktop based)
Taxonomy 3: Analytics Tools
• Coding heavy
• Configuration heavy
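The three taxonomies are independent axes, so a tool inventory can tag each tool along all of them. The following is a minimal sketch of that idea; the class names, the field names, and the two example tools are hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    PRODUCTION = "production"
    PROJECT = "project (semi-production)"
    AD_HOC = "ad-hoc"


class DataAccess(Enum):
    WRITE_BACK = "data write-back"
    READ_ONLY = "read-only"


class Hosting(Enum):
    MANAGED = "managed (server based)"
    UNMANAGED = "unmanaged (desktop based)"


class Style(Enum):
    CODING_HEAVY = "coding heavy"
    CONFIGURATION_HEAVY = "configuration heavy"


@dataclass(frozen=True)
class AnalyticsTool:
    name: str
    usage: Usage
    data_access: DataAccess
    hosting: Hosting
    style: Style


# Hypothetical inventory entries; the tool names are placeholders.
inventory = [
    AnalyticsTool("outage-dashboard", Usage.PRODUCTION, DataAccess.READ_ONLY,
                  Hosting.MANAGED, Style.CONFIGURATION_HEAVY),
    AnalyticsTool("load-forecast-notebook", Usage.AD_HOC, DataAccess.READ_ONLY,
                  Hosting.UNMANAGED, Style.CODING_HEAVY),
]

# Example question: which tools run on desktops and so fall outside server management?
unmanaged = [tool.name for tool in inventory if tool.hosting is Hosting.UNMANAGED]
print(unmanaged)
```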
Analytics Opportunities …
• Coming up …
6. Appendix
References
• http://ceur-ws.org/Vol-1497/PoEM2015_ShortPaper4.pdf
• http://smartgrid.epri.com/doc/Utility%20Enterprise%20Architecture%20Best%20Practices%20-%20webcast.pdf
• http://www.navigantresearch.com/wordpress/wp-content/uploads/2011/10/SGEA-11-Brochure.pdf
• http://www.gridwiseac.org/pdfs/forum_papers/114_127_paper_final.pdf
• http://www.iec.ch/smartgrid/standards/
• https://www.boozallen.com/content/dam/boozallen/documents/Data_Lake.pdf
• Data Warehousing in the Age of Big Data, Krish Krishnan
• https://es.slideshare.net/hortonworks/hortonworks-and-waterline-data-webinar
• http://www.ibmbigdatahub.com/blog/charting-data-lake-rethinking-data-models-data-lakes
• https://www.slideshare.net/fabien_gandon/ontologies-in-computer-science-and-on-the-web
• https://upside.tdwi.org/articles/2016/03/23/data-lake-become-swamp-1.aspx
• Many other sources
• Indigenous experiments
• Real-world experience
