Mining the sky
Data analysis in astronomy
Data mining techniques are rapidly gaining acceptance in a variety of scientific disciplines.
The large amounts of data collected in astronomical surveys require semi-automated techniques for analysis.
The focus here is on extracting useful information from a single survey.

Data mining is a multi-disciplinary field, borrowing and enhancing ideas from diverse areas such as signal and image processing, image understanding, statistics, mathematical optimization, computer vision, and pattern recognition.
Mining scientific data sets is an area rich in mathematical problems.

Use of data mining techniques in astronomy
Data mining is the process of uncovering patterns, anomalies, and statistically significant structures in data.
Neural networks are used to discriminate between stars and galaxies.
The SKICAT project uses decision trees for star/galaxy classification in the DPOSS survey.
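To make the decision-tree idea concrete, here is a minimal sketch of the smallest possible tree, a single split (a "decision stump"), separating stars from galaxies. It is not SKICAT's method or feature set: the single feature (a hypothetical "concentration index"), the training values, and the function names are all assumptions made up for illustration.

```python
# Toy training set: (concentration index, label).
# The feature and every value here are invented for illustration only.
TRAIN = [
    (0.92, "star"), (0.88, "star"), (0.95, "star"), (0.81, "star"),
    (0.55, "galaxy"), (0.62, "galaxy"), (0.47, "galaxy"), (0.70, "galaxy"),
]


def classify(value, threshold):
    """Apply the one-node tree: compact objects are called stars."""
    return "star" if value >= threshold else "galaxy"


def fit_stump(samples):
    """Pick the threshold on one feature that misclassifies the fewest samples."""
    values = sorted(v for v, _ in samples)
    # Candidate thresholds: midpoints between consecutive sorted feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(samples) + 1
    for t in candidates:
        errors = sum(1 for v, label in samples if classify(v, t) != label)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


threshold, errors = fit_stump(TRAIN)
print(f"split at {threshold:.3f} with {errors} training errors")
print(classify(0.90, threshold), classify(0.50, threshold))
```

A full decision tree repeats this split search recursively on each side of the threshold and over several features; the stump only shows the core step.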
Astro-informatics
Problems in astronomy increasingly require the use of machine learning and data mining techniques:
Detection of spurious objects
Record image
Object classification and clustering
Compression
Source separation

Mining a single astronomical survey
A survey is defined by the wavelength of the light used, the depth of the images, and the angular resolution of the images.
Data is available in two forms: images and a catalog.
The original data obtained from the telescope are images; after some processing, a catalog is produced that holds information about every object in the images.
For mining a survey, the catalog matters more than the images.
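The step from images to a catalog can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline of any particular survey: the tiny pixel grid, the fixed threshold, and the catalog fields (`npix`, `flux`, `row`, `col`) are hypothetical, and real pipelines also handle background estimation, blending, and calibration.

```python
from collections import deque

# A tiny made-up image: pixel values are brightness, 0 is empty sky.
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 6, 0, 0, 0],
    [0, 4, 5, 0, 0, 0],
    [0, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 0, 0],
]


def build_catalog(image, threshold):
    """Group bright 4-connected pixels into sources and summarize each one."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    catalog = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill collects all pixels of this source.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            flux = sum(image[y][x] for y, x in pixels)
            catalog.append({
                "id": len(catalog) + 1,
                "npix": len(pixels),
                "flux": flux,
                # Flux-weighted centroid in pixel coordinates.
                "row": sum(y * image[y][x] for y, x in pixels) / flux,
                "col": sum(x * image[y][x] for y, x in pixels) / flux,
            })
    return catalog


catalog = build_catalog(IMAGE, threshold=1)
for entry in catalog:
    print(entry)
```

Each catalog row is a compact, tabular description of one object, which is why mining usually operates on the catalog rather than on the raw pixels.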
Issues in astronomy
Compression (e.g., galaxy images, spectra)
Classification (e.g., stars, galaxies, or gamma-ray bursts)
Reconstruction (e.g., blurred galaxy images, mass distribution from weak gravitational lensing)
Feature extraction (e.g., signature features of stars, galaxies, and quasars)
Parameter estimation (e.g., star parameter measurement, photometric redshift prediction, cosmological parameters)
Model selection (e.g., are there 0, 1, 2, ... planets around the star, or is a cosmological model with non-zero neutrino mass more favorable?)

Science requirements for data mining
Cross-identification: the classical problem of associating the source list of one database with the source list of another.
Cross-correlation: the search for correlations, tendencies, and trends between physical parameters in multi-dimensional data.
Nearest-neighbor identification: a general application of clustering algorithms in multi-dimensional parameter space, usually within a single database.
Systematic data exploration: the application of a broad range of event-based and relationship-based queries to a database in the hope of discovering new objects or a new class of objects.
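Cross-identification can be sketched as a nearest-neighbor search on the sky: for each source in one list, find the closest source in the other list and accept it only if it lies within a matching radius. The sketch below is a brute-force illustration under assumptions; the catalogs, source names, coordinates, and the 2-arcsecond radius are invented, and production tools use spatial indexing rather than an all-pairs loop.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp guards against rounding pushing sqrt(h) slightly above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_identify(cat_a, cat_b, radius_arcsec):
    """Map each source in cat_a to its nearest cat_b source within the radius."""
    radius_deg = radius_arcsec / 3600.0
    matches = {}
    for name_a, (ra_a, dec_a) in cat_a.items():
        best_name, best_sep = None, radius_deg
        for name_b, (ra_b, dec_b) in cat_b.items():
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_name, best_sep = name_b, sep
        matches[name_a] = best_name  # None means no counterpart was found
    return matches


# Hypothetical source lists: name -> (right ascension, declination) in degrees.
CAT_A = {"A1": (10.0000, 20.0000), "A2": (150.5000, -30.2500), "A3": (200.0, 5.0)}
CAT_B = {"B1": (10.0002, 20.0001), "B2": (150.5100, -30.2500), "B3": (310.0, 45.0)}

matches = cross_identify(CAT_A, CAT_B, radius_arcsec=2.0)
print(matches)
```

The matching radius is the key design choice: too small and real counterparts are missed because of positional errors, too large and unrelated sources are associated.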

KDD
KDD (knowledge discovery in databases) is the automatic extraction of non-obvious, hidden knowledge from large volumes of data.
Data mining (DM) is the core of knowledge discovery.
The KDD process involves:
Data mining object

Evolution
Primary tasks of data mining:
Classification (finding descriptions of several predefined classes and assigning a data item to one of them)
Regression (mapping a data item to a real-valued quantity)
Clustering (identifying a finite set of clusters or categories in the data)
Deviation and change detection (discovering the most significant changes in the data)
Dependency modeling (finding a model that describes significant dependencies between the variables)
Summarization (finding a compact description of the data)
Machine learning and data mining will continue to prove useful with astronomical databases.
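
As a sketch of the deviation-detection task in the list above, the following flags measurements that depart sharply from the rest of a sample, using a robust z-score based on the median and the median absolute deviation (MAD). The technique is one common choice swapped in for illustration, not something the slides prescribe; the magnitudes, the function name, and the cutoff of 3.5 are assumptions.

```python
import statistics


def flag_deviations(values, cutoff=3.5):
    """Return indices of values whose robust z-score exceeds the cutoff."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # All values are (nearly) identical; nothing can be ranked as deviant.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a Gaussian z-score.
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - median) / mad) > cutoff]


# Made-up repeated brightness measurements (magnitudes) of one object.
MAGNITUDES = [14.02, 13.98, 14.01, 14.00, 13.99, 12.50, 14.03, 13.97]

outliers = flag_deviations(MAGNITUDES)
print(outliers)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value distorts the latter two and can hide itself.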