2016-10-17 | UC Berkeley | Alasdair Cohen | Lecture for Public Health 250B
Related Tools & Tips
Data Work Flow
•  Workflow of data analysis
•  An organized, well-documented, step-by-step process from design to publication
•  Basic steps:
   •  Data collection/organization/cleaning
   •  Analyses
   •  Dissemination/publication
   •  Data/materials storage
•  Facilitates “easy” replication
•  Can use GitHub to track changes to code in workflow
https://www.dezyre.com/article/data-analysis-workflow-with-r-packages/259
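As a rough illustration of the step-by-step idea above, here is a minimal Python sketch in which each workflow stage is its own documented function and a single entry point runs them in order. The function names, the record layout, and the cleaning rule (drop records with a missing outcome) are all hypothetical, chosen only for illustration; a real project would replace them with its own steps.

```python
# Minimal workflow sketch: one documented function per step, run in order.
# All names and the sample data are hypothetical, for illustration only.

def clean(records: list[dict]) -> list[dict]:
    """Cleaning step: drop records whose outcome is missing (assumed rule)."""
    return [r for r in records if r.get("outcome") is not None]


def analyze(records: list[dict]) -> dict:
    """Analysis step: count records and compute the mean outcome."""
    outcomes = [r["outcome"] for r in records]
    return {"n": len(outcomes), "mean_outcome": sum(outcomes) / len(outcomes)}


def report(summary: dict) -> str:
    """Dissemination step: format the summary as a one-line result."""
    return f"n={summary['n']}, mean outcome={summary['mean_outcome']:.2f}"


def run_workflow(raw: list[dict]) -> str:
    """Run every step in a fixed order so the whole analysis can be replayed."""
    cleaned = clean(raw)
    summary = analyze(cleaned)
    return report(summary)


raw_data = [
    {"id": 1, "outcome": 2.0},
    {"id": 2, "outcome": None},  # removed by the cleaning step
    {"id": 3, "outcome": 4.0},
]
print(run_workflow(raw_data))  # prints "n=2, mean outcome=3.00"
```

Keeping the steps in one script like this means a colleague can rerun the full analysis from raw data with a single call, and changes to any step show up clearly in version control.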
Data & Code Organization/Storage: Recommendations
•  Use annotated text files for your code (or similar for other programs)
   •  For yourself, colleagues, and replication
   •  For example
      •  Do Files in Stata
      •  R Markdown
•  GitHub (www.github.com)
   •  Transparently report & share your code
   •  Use for collaboration & version control
   •  Can link to OSF
•  Other tips
   •  Use coding loops (vs copy-paste)
   •  Use functions/variables for constants (in case you need to change them later)
   •  Ideally, once you finish the analysis, have a colleague rerun it using different software
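The two coding tips above (loops instead of copy-paste, and named constants instead of repeated literal values) can be sketched in a few lines of Python. This is only an illustration: the variable names, the age cutoff, and the sample rows are hypothetical and not taken from the lecture.

```python
import statistics

# Constants are defined once, so a later change happens in a single place.
AGE_CUTOFF = 18                          # hypothetical inclusion threshold
VARIABLES = ["height_cm", "weight_kg"]   # hypothetical variables to summarize

# Hypothetical sample data, for illustration only.
data = [
    {"age": 17, "height_cm": 160.0, "weight_kg": 55.0},
    {"age": 25, "height_cm": 170.0, "weight_kg": 65.0},
    {"age": 40, "height_cm": 180.0, "weight_kg": 75.0},
]


def summarize(rows: list[dict], variable: str) -> float:
    """Return the mean of one variable across the given rows."""
    return statistics.mean(row[variable] for row in rows)


# Apply the inclusion rule using the named constant, not a literal 18.
adults = [row for row in data if row["age"] >= AGE_CUTOFF]

# One loop replaces a copy-pasted block per variable.
summaries = {}
for variable in VARIABLES:
    summaries[variable] = summarize(adults, variable)

print(summaries)  # prints {'height_cm': 175.0, 'weight_kg': 70.0}
```

Adding a third variable then means extending `VARIABLES` rather than pasting another block, which reduces the chance of an edit being applied to one copy but not the others.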
Citation Management
•  Many options…
•  Pros & Cons
   •  Cost (one-time, annual?)
   •  Offline or online?
   •  Compatibility
   •  Flexibility (e.g., for systematic reviews, SRs)
•  Also, new-ish: Paperpile (useful for online collaborations, potential SR issue)
http://guides.library.upenn.edu/citationmgmt
Ethics & IRB Process
•  Committee for Protection of Human Subjects (CPHS)
•  UC Berkeley’s Institutional Review Board (IRB) [actually two of them]
“The primary mission of the IRB is to ensure the protection of the rights and welfare of all human participants in research conducted by university faculty, staff and students.”
http://cphs.berkeley.edu/about.html
Some Berkeley-related Resources