Finding Datasets and Statistics
What is Data? 
da·ta noun plural but singular or plural in construction, often 
attributive ˈdā-tə, ˈda- also ˈdä- 
1: factual information (as measurements or statistics) used as a 
basis for reasoning, discussion, or calculation 
2: information output by a sensing device or organ that includes 
both useful and irrelevant or redundant information and must be 
processed to be meaningful 
3: information in numerical form 
that can be digitally transmitted 
or processed 
Merriam-Webster (http://www.merriam-webster.com/dictionary/data)
Data can be 
• Observational: Captured in real-time, 
typically outside the lab 
– Examples: Sensor readings, survey results, 
images, audio, video 
• Experimental: Typically generated in the 
lab or under controlled conditions 
– Examples: test results 
• Simulation: Machine generated from test 
models 
– Examples: climate models, economic models 
• Derived/Compiled: Generated from 
existing datasets 
– Examples: text and data mining, compiled 
database, 3D models
Data can be 
• Text: field or laboratory notes, 
survey responses 
• Numeric: tables, counts, 
measurements 
• Audiovisual: images, sound 
recordings, video 
• Models, computer code, 
geospatial data 
• Discipline-specific: FITS in 
astronomy, CIF in chemistry 
• Instrument-specific: equipment 
outputs
Microdata 
• Data directly observed or collected from a 
specific unit of observation. 
• Contain individual cases, usually individual 
people, or in the case of Census data, 
individual households 
Examples: 
• Census: the unit of observation is typically an 
individual, a household, or a family. 
• Survey or poll: the responses of a single respondent
Aggregate Data 
Is higher-level data that have been compiled 
from smaller units of data. 
Examples: inflation rate, consumer price index, 
demographic data for city or state
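To illustrate the difference between microdata and aggregate data, here is a minimal Python sketch that compiles a few household-level records into city-level figures. The records, field names, and values are invented for the example and do not come from any real census or survey.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical microdata: one record per household (the unit of observation).
households = [
    {"city": "San Diego", "household_size": 3, "income": 72000},
    {"city": "San Diego", "household_size": 1, "income": 48000},
    {"city": "Sacramento", "household_size": 4, "income": 80000},
    {"city": "Sacramento", "household_size": 2, "income": 50000},
]


def aggregate_by_city(records):
    """Compile household-level records into one summary row per city."""
    groups = defaultdict(list)
    for record in records:
        groups[record["city"]].append(record)

    summary = {}
    for city, rows in groups.items():
        summary[city] = {
            "households": len(rows),
            "population": sum(r["household_size"] for r in rows),
            "mean_income": mean(r["income"] for r in rows),
        }
    return summary


city_summary = aggregate_by_city(households)
for city, figures in sorted(city_summary.items()):
    print(city, figures)
```

Each row of the summary describes a city rather than a household, so the individual cases can no longer be recovered from it. This is one reason researchers often look for the underlying microdata.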
Statistics 
are numerical data that have 
been organized and 
interpreted, usually 
displayed in tables.
Datasets 
• A dataset or study is 
made up of the raw data 
file and any related files, 
usually the codebook 
and setup files. 
• Most datasets require at 
least basic statistical 
analysis software (Stata, 
SPSS, R, etc.) or a spreadsheet 
program (Excel) to use.
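To show why the codebook matters, the following Python sketch reads a small raw data file in which answers are stored as numeric codes and uses a codebook to translate them into labels. The file contents, variable names, and code values are hypothetical. Real codebooks are usually separate documents, and their layout varies by repository.

```python
import csv
import io

# Hypothetical raw data file: responses are stored as numeric codes.
RAW_DATA = """respondent_id,q1_employment,q2_region
1,1,3
2,2,1
3,1,2
"""

# Hypothetical codebook: maps each variable's codes to their meanings.
CODEBOOK = {
    "q1_employment": {"1": "Employed", "2": "Unemployed"},
    "q2_region": {"1": "North", "2": "South", "3": "West"},
}


def decode(raw_text, codebook):
    """Replace coded values with the labels listed in the codebook."""
    decoded = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        labelled = {}
        for variable, value in row.items():
            labels = codebook.get(variable, {})
            # Keep the raw value when the codebook has no entry for it.
            labelled[variable] = labels.get(value, value)
        decoded.append(labelled)
    return decoded


decoded_rows = decode(RAW_DATA, CODEBOOK)
for row in decoded_rows:
    print(row)
```

Without the codebook, a value such as 3 in q2_region cannot be interpreted, which is why a dataset is distributed as a bundle of files and not as the data file alone.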
Repositories 
• A data repository is a collection of 
datasets that have been 
deposited for storage and 
findability. 
• They are often 
– discipline specific and/or 
– affiliated with a research institution 
• Examples 
– ICPSR 
– Harvard Dataverse Network 
– UC San Diego Digital Collections
To recap: 
• Data are raw ingredients 
from which statistics are 
created. 
• Statistical analysis can be 
performed on data to 
show relationships among 
the variables collected. 
• Through secondary data 
analysis, many different 
researchers can re-use the 
same data set for 
different purposes.
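As a small example of statistical analysis showing a relationship between variables, the Python sketch below computes a Pearson correlation coefficient between two variables. The variables and values are invented for illustration. In practice this kind of analysis would more likely be run in a statistical package such as Stata, SPSS, or R.

```python
from math import sqrt

# Hypothetical microdata: years of education and weekly hours worked
# for six survey respondents.
education_years = [10, 12, 12, 14, 16, 18]
weekly_hours = [30, 35, 38, 40, 42, 45]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


r = pearson_r(education_years, weekly_hours)
print(f"Pearson r = {r:.2f}")
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 indicates a weak one. A different researcher could reuse the same records to study an entirely different pair of variables.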
Finding Datasets
1. Think about who might 
collect the data. 
• Could it have been collected by a government 
agency? 
• A nonprofit or nongovernmental organization? 
• A private business or industry group? 
• Academic researchers?
2. Look for publications that use 
the kind of data you’re looking for 
and that cite the dataset. 
In other words, is the data you 
want mentioned in scholarly 
articles or government reports 
or some other source?
3. Once you know that what you want 
exists, it's time to hunt it down. 
• Is it freely available on the web? 
• Or part of a package to which the 
library already subscribes? 
• Is it something we can buy? (And is 
it within the library's budget and 
can the purchase be made quickly 
enough to fit your timeframe?) 
• Can it be requested directly from the 
researcher?

Data 2014
