Paper Presentation
on
Challenges of Big Data to Big Data Mining
with their Processing Framework
Kamlesh Kumar Pandey
Dept. of Computer Science & Applications
Dr. Hari Singh Gour Vishwavidyalaya, Sagar, M.P
E-mail: kamleshamk@gmail.com
International Conference on Communication Systems and Network Technologies 2018
Content
• Big Data
• Big Data Mining
• Data challenges
• Process challenges
• Management Challenges
• Big Data Mining Processing Framework
BIG DATA
• Diebold et al. (2000) were among the first authors to discuss the term
Big Data in a research paper. They define Big Data as any data set
larger than a gigabyte.
• Doug Laney (2001) was the first person to give a proper definition of
Big Data. He identified three characteristics, Volume, Variety, and
Velocity, which are known as the 3 V’s of Big Data Management.
These 3 V’s describe the basic framework of Big Data.
• Gartner (2012): “Big data is high-volume, high-velocity and
high-variety information assets that demand cost-effective, innovative
forms of information processing for enhanced insight and decision
making.”
BIG DATA V’s
• At present, seven V’s are used to describe Big Data. The first three, Volume,
Variety, and Velocity, are its main characteristics. The other four,
Variability, Value, Veracity, and Visualization, depend on the
organization.
BIG DATA MINING
• Big Data Mining means fetching requested information, uncovering
hidden relationships or patterns, or extracting needed information
or knowledge from a dataset. The dataset has to meet the three V’s
of Big Data and has higher complexity than traditional data.
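To make "uncovering hidden relationships or patterns" concrete, here is a minimal sketch that counts which item pairs occur together in a set of transactions. It is an illustration only, not the method of this presentation: the function name `frequent_pairs`, the `min_support` threshold, and the toy baskets are all assumptions, and a real Big Data setting would need a distributed implementation.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy data, invented for this sketch.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

result = frequent_pairs(baskets, min_support=2)
print(result)
```

Pairs involving "eggs" appear only once and are filtered out, while the three pairs among bread, butter, and milk each appear twice and are kept.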
CHALLENGES OF BIG DATA MINING
• Data challenges
• Process challenges
• Management challenges
• Data challenges arise from the basic characteristics of Big Data, such as
volume, variety, velocity, and veracity. These challenges differ from those
of traditional data.
• Process challenges arise from the techniques used for data mining, data
processing, or analysis, that is, the algorithms used to mine, analyse,
integrate, transform, and preprocess the data.
• Management challenges cover data-management concerns such as
privacy, security, governance, and other aspects.
DATA CHALLENGES
• Roberto V. Zicari et al. (2014) and Uthayasankar Sivarajah et al.
(2017) group data challenges into seven categories:
• Volume
• Variety
• Velocity
• Variability
• Value
• Veracity
• Visualization
PROCESS CHALLENGES
• Kaisler et al. (2013) and Uthayasankar Sivarajah et al. (2017) identify
data-processing challenges that can be classified into five steps of
data mining:
• Data acquisition and warehousing
• Data cleaning
• Data analysis and Mining
• Data integration and aggregation
• Data querying and indexing
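The five steps above can be sketched as a tiny in-memory pipeline. This is only an illustration under stated assumptions: the function names, the two toy sources, and the order in which the steps are chained are all hypothetical, and the cited authors do not prescribe this code.

```python
from collections import defaultdict
from statistics import mean


def acquire():
    """Acquisition: stand-in for reading from two separate sources."""
    source_a = [{"user": "u1", "amount": "10.5"}, {"user": "u2", "amount": ""}]
    source_b = [{"user": "u1", "amount": "4.5"}, {"user": "u3", "amount": "7"}]
    return source_a, source_b


def clean(records):
    """Cleaning: drop records with a missing amount and convert to float."""
    return [
        {"user": r["user"], "amount": float(r["amount"])}
        for r in records
        if r["amount"]
    ]


def integrate(*sources):
    """Integration and aggregation: merge sources and total per user."""
    totals = defaultdict(float)
    for records in sources:
        for r in records:
            totals[r["user"]] += r["amount"]
    return dict(totals)


def build_index(totals):
    """Querying and indexing: keep users sorted by total for ordered lookups."""
    return sorted(totals.items(), key=lambda kv: kv[1])


def analyse(totals):
    """Analysis and mining: a trivial summary statistic as a placeholder."""
    return mean(totals.values())


raw_a, raw_b = acquire()
totals = integrate(clean(raw_a), clean(raw_b))
index = build_index(totals)
average = analyse(totals)
print(totals, index, average)
```

Each function is a placeholder for what would be a full subsystem at Big Data scale, for example a warehouse for acquisition or a distributed index for querying.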
MANAGEMENT CHALLENGES
• Uthayasankar Sivarajah et al. (2017) discuss various management
challenges: ensuring that data are used correctly, that data are
accessed only by authorized persons and are inaccessible without
permission (which maintains privacy), that data are well secured
against external and internal attacks, and that transformed and
derived data are handled properly.
• Privacy
• Security
• Data and information sharing
• Cost/operational expenditures
• Data ownership
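Two of the management challenges above, privacy and authorized access, can be illustrated with a short sketch. Everything here is an assumption made for the example: the role names, the key, and the functions `pseudonymise` and `read_record` are hypothetical, and a keyed hash is only one of several possible privacy techniques.

```python
import hashlib
import hmac

# Hypothetical key; a real deployment would load it from a secrets store.
SECRET_KEY = b"example-key"

# Hypothetical role names allowed to read the data.
AUTHORISED_ROLES = {"analyst", "admin"}


def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a keyed hash: records stay linkable, not readable."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def read_record(record: dict, role: str) -> dict:
    """Return a pseudonymised copy of the record, or refuse unauthorised roles."""
    if role not in AUTHORISED_ROLES:
        raise PermissionError(f"role {role!r} may not access this data")
    view = dict(record)
    view["user"] = pseudonymise(record["user"])
    return view


view = read_record({"user": "alice", "amount": 3}, role="analyst")
print(view["amount"], len(view["user"]))
```

The same input always maps to the same pseudonym, so records can still be joined for mining without exposing the original identifier.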
BIG DATA MINING PROCESSING FRAMEWORK
• Wu Xindong et al. (2014) present the HACE theorem (Big Data comes
from Heterogeneous, Autonomous sources and has Complex and Evolving
relationships) and a big data processing model that views the mining
process from the perspective of its challenges. This processing model
covers data-driven and management-driven challenges.
References
• Fan Wei and Bifet Albert (2012): “Mining Big Data: Current Status, and Forecast to the Future”, ACM SIGKDD Explorations Newsletter, V-14, I-2, pp1-5.
• K.U. Jaseena and David M. (2014): “Issue Challenges and Solution: Big Data Mining”, Published in the Proc. Of SMTP-2014, Published By AIRCC Publishing Corporation, held in Chennai, India on 27-28
Dec 2014, pp 131-140.
• Landset Sara, Khoshgoftaar Taghi M, Richter Aaron N. and Hasanin Tawfiq (2015): “A survey of open source tools for machine learning with big data in the Hadoop ecosystem”, Journal of Big Data, 2:24.
• Sivarajah Uthayasankar and Mustafa Kamal Muhammad (2017): “Critical analysis of Big Data challenges and analytical methods”, Journal of Business Research (Elsevier), V-70, PP 263-286.
• Najafabadi Maryam M, Villanustre Flavio, Khoshgoftaar Taghi M, Seliya Naeem, Wald Randall and Muharemagic Edin (2015): “Deep learning applications and challenges in big data analytics”, Journal
of Big Data, 2:1.
• Bifet Albert, (2013), “Mining Big data in Real-time”, Informatica, V-37, I-1, PP 15-20.
• Che Dunren, Safran Mejdl and Peng Zhiyong (2013): “From Big Data to Big Data Mining: Challenges, Issues, and Opportunities”, Published in the Proc. Of International Conference on Database Systems
for Advanced Applications Organized & Published by Springer held in Wuhan, China in April 2013, PP 1 to 15.
• Gandomi Amir and Haider Murtaza(2015): “Beyond the hype: Big data concepts, methods, and analytics”, International Journal of Information Management, Published By Springer, V-35, PP 137 to
144.
• Pandey Kamlesh (2018): “Mining on Relationship in Big Data era Using Apriori Algorithm”, Published in the Proc. Of National Conference on Data Analytics, Machine Learning and Security to be held on
15-16 February 2018 by Department of CSIT, GGV, Bilaspur, C.G, India, ISBN: 978-93-5291-457-9.
• Fayyad Usama and Piatetsky-Shapiro Gregory (1996): “From Data Mining to Knowledge Discovery in Databases” Artificial Intelligence Magazine, V-17, I-3, PP-37-54.
• Pandey Kamlesh(2014): “An Analytical and Comparative Study of Various Data Preprocessing Method in Data Mining” International Journal of Emerging Technology and Advanced Engineering (ISSN
2250-2459), V-4, I-10, PP 174 to 180.
• Zicari, R. (2014): “Big Data: Challenges and Opportunities”, Chapman and Hall/CRC, pp. 103–128.
• Kaisler Stephen, Armour Frank and Espinosa J. Alberto (2013), “Big Data: Issues and Challenges Moving Forward”, Published in the Proc. Of 46th Hawaii International Conference on System Sciences
Published by IEEE held in Wailea, Maui, HI, the USA at 7-10 Jan. 2013.
• Chen Min, Mao Shiwen, Zhang Yin, and Leung Victor C.M. (2014): “Big Data Related Technologies, Challenges, and Future Prospects”, Springer Briefs in Computer Science, ISSN 2191-5776 (electronic).
• Wu Xindong, Zhu Xingquan, Wu Gong-Qing, and Ding Wei (2014), “Data Mining with Big Data”, IEEE Transactions on Knowledge & Data Engineering, VOL. 26, NO. 1, pp 97-107.
• R. Kamala, MaryGladence L. (2015), “An optimal approach for social data analysis in Big Data”, Published in the Proc. of ICCPEIC Published by IEEE held in 22-23 April 2015 at Chennai, India, pp 205-
208.