Introduction to XLMiner™ Data Utilities
XLMiner and Microsoft Office are registered trademarks of their respective owners.
Brief description of the features of XLMiner: Data Utilities
XLMiner provides the user with a host of data utilities:
Sample from Worksheet/Database
    Simple random sample
    Stratified sampling
Missing Data Handling
Bin Continuous Data
Transform Categorical Data
http://dataminingtools.net
Sample data from Worksheet
When huge amounts of data are involved, statisticians prefer to take a sample that represents the entire database. However, such a representative sample is very difficult to obtain. The entire dataset we want information about is called the population. A sample is the part of the population that we actually examine to draw conclusions. A good sample should be a true representation of the data: as far as possible, the cases chosen for the sample should be like the cases that are not chosen. A poor sample design can produce misleading conclusions. Various methods and techniques have been developed to ensure a true sample, and XLMiner provides sampling facilities.
Sample data from Worksheet
In XLMiner, sampling can be done in two ways:
Simple random sampling: A random sample of x records is chosen from the data such that every record in the data has an equal chance of being chosen.
Stratified sampling: The data is divided into strata of similar items. Each stratum is then sampled using the simple random approach, and the results are combined to give the final sample.
Sample data from Worksheet - Simple Random Sampling
Select the variables to be present in the sample.
Here, “Simple Random sampling” is selected.
We can specify the seed value (the value used to initialize the random selection), or the wizard will supply one by default.
Set the size of the sampled set.
If sampling with replacement is selected, duplicate copies of records may appear in the sample.
Sample data from Worksheet - Simple Random Sampling output
Sample data from Worksheet - Simple Random Sampling output with replacement
Duplicate copies of records exist in the sample.
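The effect of the seed and of the replacement option can be illustrated outside XLMiner. The following is a minimal sketch in Python, not XLMiner's own implementation; the function name `simple_random_sample`, the seed value 12345, and the row-number data are assumptions made for illustration.

```python
import random


def simple_random_sample(records, size, seed=12345, with_replacement=False):
    """Draw `size` records so that every record is equally likely on each draw.

    The seed value is arbitrary (an assumption for this sketch); reusing the
    same seed reproduces the same sample.
    """
    rng = random.Random(seed)
    if with_replacement:
        # Each draw is independent, so a record may be picked more than once.
        return [rng.choice(records) for _ in range(size)]
    # Without replacement, each record appears at most once.
    return rng.sample(records, size)


# Hypothetical worksheet: 100 records identified by row number.
records = list(range(1, 101))

without = simple_random_sample(records, 10)
with_repl = simple_random_sample(records, 150, with_replacement=True)

print(without)
print(len(set(with_repl)), "distinct records among", len(with_repl), "draws")
```

Drawing 150 records from 100 is only possible with replacement, which is why duplicates are guaranteed in the second sample.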
Sample data from Worksheet - Stratified Sample (proportionate)
Sample data from Worksheet - Stratified Sample (proportionate, output)
As selected, the percentage of records in each stratum of the sample set is the same as in the input set.
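Proportionate stratified sampling can be sketched as follows: group the records by stratum, then draw the same fraction from each group with simple random sampling. This is an illustrative Python sketch under assumed names (`proportionate_stratified_sample`, `stratum_of`, `fraction`) and made-up data, not a description of XLMiner's internals.

```python
import random
from collections import defaultdict


def proportionate_stratified_sample(records, stratum_of, fraction, seed=12345):
    """Sample the same fraction of every stratum (seed value is arbitrary)."""
    rng = random.Random(seed)

    # Group the records into strata.
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    # Apply simple random sampling without replacement inside each stratum.
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: strata A, B and C hold 60%, 30% and 10% of the records.
records = (
    [("A", i) for i in range(60)]
    + [("B", i) for i in range(30)]
    + [("C", i) for i in range(10)]
)

sample = proportionate_stratified_sample(records, lambda r: r[0], fraction=0.2)
counts = {key: sum(1 for r in sample if r[0] == key) for key in "ABC"}
print(counts)  # {'A': 12, 'B': 6, 'C': 2}
```

The sample keeps the 60/30/10 split of the input, which is the property the output slide describes.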
Sample data from Worksheet - Stratified Sample (specify number)
Sample data from Worksheet - Stratified Sample (specify number)
All strata have the same size, as specified by the user (here, 10 records each).
Sample data from Worksheet - Stratified Sample (size of smallest stratum)
Sample data from Worksheet - Stratified Sample (size of smallest stratum, output)
All strata have a size equal to that of the smallest stratum.
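The two equal-size options (a user-specified number per stratum, or the size of the smallest stratum) differ only in how the per-stratum count is chosen. A minimal Python sketch, with the function name, the `per_stratum` parameter, and the data all assumed for illustration; raising an error when the requested number exceeds the smallest stratum is a choice made here, and XLMiner may behave differently in that case.

```python
import random
from collections import defaultdict


def equal_size_stratified_sample(records, stratum_of, per_stratum=None, seed=12345):
    """Draw the same number of records from every stratum.

    If `per_stratum` is None, the size of the smallest stratum is used.
    """
    rng = random.Random(seed)  # arbitrary seed, for repeatability

    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    smallest = min(len(members) for members in strata.values())
    if per_stratum is None:
        per_stratum = smallest
    if per_stratum > smallest:
        # Assumption of this sketch: refuse rather than oversample.
        raise ValueError("per_stratum exceeds the size of the smallest stratum")

    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], per_stratum))
    return sample


# Hypothetical data: strata of 60, 30 and 15 records.
records = (
    [("A", i) for i in range(60)]
    + [("B", i) for i in range(30)]
    + [("C", i) for i in range(15)]
)

fixed = equal_size_stratified_sample(records, lambda r: r[0], per_stratum=10)
smallest = equal_size_stratified_sample(records, lambda r: r[0])

fixed_counts = {key: sum(1 for r in fixed if r[0] == key) for key in "ABC"}
smallest_counts = {key: sum(1 for r in smallest if r[0] == key) for key in "ABC"}
print(fixed_counts)     # {'A': 10, 'B': 10, 'C': 10}
print(smallest_counts)  # {'A': 15, 'B': 15, 'C': 15}
```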
Missing Data Handling
This utility lets the user process the data before any mining method is applied to it: it detects missing values and handles them the way the user chooses. XLMiner considers a cell to be missing data if it is empty or contains an invalid formula. It can also be told to treat a cell as missing if the cell contains a value specified by the user.
The user specifies how XLMiner should correct missing values, and a treatment can be assigned to each variable. Records with missing data can be deleted entirely, or the missing values can be replaced. Replacement options include the mean, the median, the mode, or a value specified by the user. The available options depend on the type of the variable.
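The treatments described above can be sketched in a few lines of Python. This is an illustration only: the function names, the `treatment` keywords, the use of `None` or an empty string for an empty cell, and the `-999` missing-value code are assumptions, not XLMiner's interface.

```python
from statistics import mean, median, mode


def is_missing(value, missing_code=None):
    """A cell is missing if it is empty or equals the user-specified code."""
    if value is None or value == "":
        return True
    return missing_code is not None and value == missing_code


def handle_missing(rows, column, treatment, missing_code=None, user_value=None):
    """Apply one treatment to one column; `rows` is a list of dicts.

    treatment: "delete", "mean", "median", "mode" or "value".
    """
    if treatment == "delete":
        # Drop every record whose value in this column is missing.
        return [dict(r) for r in rows if not is_missing(r[column], missing_code)]

    present = [r[column] for r in rows if not is_missing(r[column], missing_code)]
    if treatment == "value":
        fill = user_value
    else:
        fill = {"mean": mean, "median": median, "mode": mode}[treatment](present)

    # Replace missing cells with the fill value; leave other cells untouched.
    return [
        {**r, column: fill} if is_missing(r[column], missing_code) else dict(r)
        for r in rows
    ]


# Hypothetical data: one empty cell and one cell holding the code -999.
rows = [{"age": 20}, {"age": None}, {"age": 40}, {"age": 40}, {"age": -999}]

by_median = handle_missing(rows, "age", "median", missing_code=-999)
deleted = handle_missing(rows, "age", "delete", missing_code=-999)
print([r["age"] for r in by_median])  # [20, 40, 40, 40, 40]
print(len(deleted))                   # 3
```

Mean and median only make sense for numeric columns, which matches the remark that the available options depend on the type of the variable.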
Missing Data Handling
Missing Data Handling
Data set
Select the action for handling missing data in individual columns, then click “Apply this option to selected variable”.
Missing Data Handling - Output
Changed records are highlighted.
Transform Categorical Data
Sometimes a data set contains variables that take non-numeric values, which makes it difficult to apply standard procedures. XLMiner therefore provides a tool that transforms non-numeric data into numeric data. There are two ways to transform categorical data:
Create dummies: Suppose the variable has 4 distinct values, A, B, C and D. Then 3 new columns, VAL1, VAL2 and VAL3, are created with values of either 1 or 0. If a row contains the value A, VAL1 is 1 and the rest are 0. If all three are 0, the row has the value D.
Create category scores: If the non-numeric variable holds 4 distinct values as above, each value (ordered alphabetically) is numbered from 1 to 4, and a new column is created that contains the number corresponding to each record's value.
Transform Categorical Data - Dummies
Select the variable that contains non-numeric data and needs to be transformed.
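The dummy coding scheme for a variable with values A, B, C and D can be sketched in Python as follows. The function name `create_dummies` is hypothetical; the VAL1 to VAL3 column names follow the slide, and the choice of the alphabetically last value as the all-zeros case mirrors the example in which D is represented by three zeros.

```python
def create_dummies(values):
    """Return (column_names, rows) of 0/1 indicators for a categorical column.

    For k distinct values, k - 1 indicator columns are created; the
    alphabetically last value is represented by all zeros.
    """
    categories = sorted(set(values))
    indicator_categories = categories[:-1]  # last category gets no column
    names = [f"VAL{i}" for i in range(1, len(indicator_categories) + 1)]

    rows = [
        {name: int(value == cat) for name, cat in zip(names, indicator_categories)}
        for value in values
    ]
    return names, rows


# Hypothetical column with the four values used in the slide.
names, rows = create_dummies(["A", "C", "D", "B", "A"])
print(names)    # ['VAL1', 'VAL2', 'VAL3']
print(rows[0])  # {'VAL1': 1, 'VAL2': 0, 'VAL3': 0}  (value A)
print(rows[2])  # {'VAL1': 0, 'VAL2': 0, 'VAL3': 0}  (value D)
```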
Transform Categorical Data - Category Scores
Transform Categorical Data - Category Scores (output)
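Category scores amount to numbering the distinct values in alphabetical order and writing that number into a new column. A minimal Python sketch, with the function name `category_scores` and the sample column assumed for illustration:

```python
def category_scores(values):
    """Map each distinct value to 1..k in alphabetical order.

    Returns the mapping and the new numeric column.
    """
    scores = {cat: i for i, cat in enumerate(sorted(set(values)), start=1)}
    return scores, [scores[value] for value in values]


# Hypothetical column with four distinct values.
scores, column = category_scores(["B", "D", "A", "C", "A"])
print(scores)  # {'A': 1, 'B': 2, 'C': 3, 'D': 4}
print(column)  # [2, 4, 1, 3, 1]
```

Unlike dummies, a score column introduces a numeric ordering (A < B < C < D) that later procedures may treat as meaningful.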
Thank you
For more, visit: http://dataminingtools.net

Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guided, and does not involve any additional support.
Visit us at www.dataminingtools.net