Analyzing DocGraph in Gephi
Janos G. Hajagos
Stony Brook School of Medicine
1
NYC Open Data Meetup
June 24, 2013
DocGraph
• Based on a FOIA request to CMS by Fred Trotter
• Covers Medicare providers (not only doctors)
• Calendar year (CY) 2011 dates of service
• Two providers are linked when they share 10 or more patients within a 30-day forward window
• Initial access was restricted to MedStartr funders, but as of June 2013 the data is open access
2
Geographic Visualization
http://isurfsoftware.com/blog/2012/12/13/visualizing-geographic-connections-between-us-doctors/
3
DocGraph by the numbers
• Directed graph
• Average total degree 52.8
• 940,492 providers (graph nodes/vertices)
• 49,685,810 shared edges
4
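The figures on this slide can be cross-checked with a short calculation. Dividing edges by nodes reproduces the 52.8 shown above. In a directed graph that ratio is the mean in-degree (equivalently, the mean out-degree); counting both incoming and outgoing edges per node would give twice that value. The slide does not say which convention it uses, so both are printed here.

```python
# Counts taken from the slide.
nodes = 940_492        # providers (graph nodes/vertices)
edges = 49_685_810     # directed shared-patient edges

# Each directed edge adds one out-degree and one in-degree,
# so edges / nodes is the mean out-degree (and the mean in-degree).
mean_one_way_degree = edges / nodes

# Counting both directions per node doubles the figure.
mean_in_plus_out_degree = 2 * edges / nodes

print(f"edges per node:           {mean_one_way_degree:.1f}")
print(f"in + out degree per node: {mean_in_plus_out_degree:.1f}")
```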
DocGraph Data
5
6
NPPES
• National Plan and Provider Enumeration System
• Source of the NPI (National Provider Identifier)
• Information is entered and updated by the provider
• CSV file with 314 columns
• MySQL load script generated by a Python script to normalize the database
7
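As a rough illustration of generating a MySQL load script from Python, the sketch below reads only the CSV header row and emits a CREATE TABLE plus a LOAD DATA statement. It is not the script used for the talk: the function names, the table name `nppes_flat`, the column-naming rule, and the choice to load every column as TEXT are all assumptions, and the normalization step (splitting repeated column groups into child tables) is left out.

```python
import csv
import io
import re


def column_name(header: str) -> str:
    """Turn a CSV header into a SQL identifier (hypothetical naming rule)."""
    name = re.sub(r"[^0-9a-zA-Z]+", "_", header).strip("_").lower()
    return name[:64]  # MySQL identifiers are limited to 64 characters


def make_load_script(csv_file, csv_path: str, table: str = "nppes_flat") -> str:
    """Build CREATE TABLE and LOAD DATA statements from the CSV header row."""
    headers = next(csv.reader(csv_file))
    # Assumption: load everything as TEXT; with hundreds of columns this
    # avoids hitting MySQL's row-size limit for VARCHAR columns.
    col_defs = ",\n".join(f"  `{column_name(h)}` TEXT" for h in headers)
    return (
        f"CREATE TABLE `{table}` (\n{col_defs}\n);\n"
        f"LOAD DATA LOCAL INFILE '{csv_path}'\n"
        f"INTO TABLE `{table}`\n"
        "FIELDS TERMINATED BY ',' ENCLOSED BY '\"'\n"
        "LINES TERMINATED BY '\\n'\n"
        "IGNORE 1 LINES;\n"
    )


# Tiny stand-in for the real 314-column file; the path is a placeholder.
sample = io.StringIO("NPI,Entity Type Code,Provider Business Mailing Address City Name\n")
script = make_load_script(sample, "/path/to/nppes.csv")
print(script)
```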
Selecting a sub-graph
8
Core nodes
9
Leaf nodes
10
Core-to-core edges
11
Core-to-leaf edges
12
Leaf-to-leaf edges
13
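The preceding slides name the pieces of a sub-graph without defining them. A minimal sketch, assuming that core nodes are the providers picked by the selection criteria (for example, a zip code) and that leaf nodes are providers outside the core that share an edge with at least one core node. The function name, the edge tuple layout, and the single-letter node names are hypothetical.

```python
def select_subgraph(edges, core):
    """Split edges around a set of core nodes.

    edges: iterable of (source, target, shared_patients) tuples
    core:  set of node identifiers chosen by the selection criteria
    """
    edges = list(edges)
    core = set(core)

    # Leaf nodes: non-core nodes with at least one edge to or from the core.
    leaf = set()
    for src, dst, _ in edges:
        if src in core and dst not in core:
            leaf.add(dst)
        elif dst in core and src not in core:
            leaf.add(src)

    groups = {"core-to-core": [], "core-to-leaf": [], "leaf-to-leaf": []}
    for src, dst, shared in edges:
        if src in core and dst in core:
            groups["core-to-core"].append((src, dst, shared))
        elif (src in core and dst in leaf) or (src in leaf and dst in core):
            groups["core-to-leaf"].append((src, dst, shared))
        elif src in leaf and dst in leaf:
            groups["leaf-to-leaf"].append((src, dst, shared))
        # Edges touching nodes outside core and leaf are dropped.
    return leaf, groups


# Made-up example: A and B are the core providers.
example_edges = [
    ("A", "B", 20), ("B", "A", 20),   # core-to-core
    ("A", "C", 15), ("D", "B", 12),   # core-to-leaf (either direction)
    ("C", "D", 30),                   # leaf-to-leaf
    ("C", "E", 11), ("E", "F", 40),   # outside the sub-graph
]
leaf_nodes, edge_groups = select_subgraph(example_edges, {"A", "B"})
print(sorted(leaf_nodes))
print({name: len(group) for name, group in edge_groups.items()})
```

Keeping the three edge groups separate makes it easy to decide how much of the neighborhood to export, for instance dropping leaf-to-leaf edges when the resulting graph is too dense to lay out.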
Generating GraphML
• XML-based file format for graphs
• Readable by a large number of tools
– Gephi
– Mathematica
– igraph (R)
• NetworkX, a Python library for graphs, can easily export to GraphML
14
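A minimal sketch of the NetworkX export mentioned above. The provider numbers, labels, and attribute names (`label`, `weight`) are made up for illustration; the real export may carry different attributes.

```python
import networkx as nx

# Made-up providers and shared-patient counts.
labels = {"1000000001": "Provider one", "1000000002": "Provider two"}
shared_edges = [
    ("1000000001", "1000000002", 25),
    ("1000000002", "1000000001", 14),
]

G = nx.DiGraph()  # DocGraph is a directed graph
for npi, label in labels.items():
    G.add_node(npi, label=label)
for src, dst, shared in shared_edges:
    G.add_edge(src, dst, weight=shared)

# Serialize to GraphML in memory.
graphml_text = "\n".join(nx.generate_graphml(G))
print(graphml_text)

# To write a file that Gephi can open instead:
# nx.write_graphml(G, "sample.graphml")
```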
15
16
Gephi
Subset defined from 2 Brooklyn zip codes
(11215 - Park Slope & 11212 - Brownsville)
17
Links
http://strata.oreilly.com/2012/11/docgraph-open-social-doctor-data.html (information)
https://github.com/jhajagos/DocGraph (code)
https://github.com/ftrotter/DocGraph (data)
https://groups.google.com/forum/#!forum/docgraph (mailing list)
http://bit.ly/1459NXn (sample Brooklyn GraphML file)
http://strataconf.com/rx2013/public/schedule/detail/29840 (StrataRX workshop with Fred Trotter)
18