Search and download
eurostat and plots
R tools to access open data from the Eurostat database
The eurostat package
Data in the Eurostat database is stored in tables. Each table has an
identifier (a short table_code) and a description (e.g. tsdtr420 - People
killed in road accidents).
The key eurostat functions let you find a table_code, download the
corresponding table, and replace the codes in it with readable labels.
library("eurostat")
query <- search_eurostat("road", type = "table")
query[1:3,1:2]
##                               title     code
## 1           Goods transport by road ttr00005
## 2   People killed in road accidents tsdtr420
## 3 Enterprises with broadband access tin00090
The get_eurostat() function returns tibbles in the long format. Packages
dplyr and tidyr are well suited to transform these objects. The ggplot2
package is well suited to plot these objects.
# Download road fatalities for five countries only
t1 <- get_eurostat("tsdtr420", filters =
        list(geo = c("UK", "FR", "PL", "ES", "PT")))
library("ggplot2")
# One line and point series per country
ggplot(t1, aes(x = time, y = values, color = geo,
               group = geo, shape = geo)) +
  geom_point(size = 2) +
  geom_line() + theme_bw() +
  labs(title = "Road accidents", x = "Year", y = "Victims")
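Long format means one row per combination of dimensions (here geo and time), with the observation in the values column. The sketch below uses made-up numbers and base R's reshape() rather than tidyr, only to keep it free of dependencies; it shows the same data in long and in wide form.

```r
# Hypothetical long-format data: one row per (geo, time) pair.
long <- data.frame(
  geo    = rep(c("ES", "PL"), each = 2),
  time   = rep(c(2013, 2014), times = 2),
  values = c(10, 11, 20, 21),   # invented numbers, not Eurostat data
  stringsAsFactors = FALSE
)

# Wide format: one row per year, one column of values per country.
wide <- reshape(long, idvar = "time", timevar = "geo", direction = "wide")

print(long)
print(wide)
```

With dplyr and tidyr loaded, the same reshaping is usually written with their own verbs; the base R call is a stand-in here.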
eurostat and maps
This one-pager presents the eurostat package.
Leo Lahti, Janne Huovari, Markus Kainu, Przemyslaw Biecek, 2014-2017. Package version 2.2.43. URL: https://github.com/rOpenGov/eurostat
Find the table code
The search_eurostat(pattern, ...) function scans the directory of Eurostat
tables and returns the codes and descriptions of tables that match
pattern.
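The third hit in the query output in this sheet, "Enterprises with broadband access", suggests that the pattern is matched as a plain substring of the title (b-road-band). The sketch below reproduces that behaviour in base R on a hypothetical table of contents. The toc data frame and its fourth row are made up for illustration, and this is not the package's implementation.

```r
# Hypothetical miniature table of contents. The first three rows come from
# the query output shown in this sheet; the fourth is invented.
toc <- data.frame(
  title = c("Goods transport by road",
            "People killed in road accidents",
            "Enterprises with broadband access",
            "Total fertility rate"),
  code  = c("ttr00005", "tsdtr420", "tin00090", "hypothetical01"),
  stringsAsFactors = FALSE
)

# Plain substring match on the title (assumed matching rule).
hits <- toc[grepl("road", toc$title, fixed = TRUE), ]
print(hits)
```

A more specific pattern such as "road accidents" would leave only tsdtr420 in this toy table.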
Download the table
The get_eurostat(id, time_format = "date", filters = "none", type =
"code", cache = TRUE, ...) function downloads the requested table
from the Eurostat bulk download facility or, if filters are defined, from
the Eurostat Web Services JSON API. Downloaded data is cached if
cache = TRUE. Additional arguments define how the time column is
read (time_format) and whether table dimensions are kept as codes
or converted to labels (type).
dat <- get_eurostat(id="tsdtr420", time_format="num")
head(dat)
##   unit sex geo time values
## 1   NR   T  AT 1999   1079
## 2   NR   T  BE 1999   1397
## 3   NR   T  CZ 1999   1455
## 4   NR   T  DK 1999    514
## 5   NR   T  EL 1999   2116
## 6   NR   T  ES 1999   5738
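To illustrate what the two time_format values used in this sheet mean for annual data, the sketch below converts between a numeric year and a Date. It assumes that time_format = "date" stamps each year with its first day, which is what the filter time == "2014-01-01" elsewhere in this sheet relies on. The vectors are made up.

```r
# Years as they would appear with time_format = "num".
years <- c(1999, 2000, 2014)

# The same periods as Date objects (assumption: first day of the year).
dates <- as.Date(paste0(years, "-01-01"))

# Going back from Date to a numeric year.
back <- as.numeric(format(dates, "%Y"))

# Selecting one year in either representation.
is_2014_date <- dates == as.Date("2014-01-01")
is_2014_num  <- years == 2014

print(data.frame(years, dates, is_2014_date, is_2014_num))
```

Numeric years are convenient for arithmetic and axis breaks, while dates work directly with ggplot2 date scales.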
Add labels
The label_eurostat(x, lang = "en", ...) function gets the definitions of
Eurostat codes and replaces the codes with labels in the given language
("en", "fr" or "de").
dat <- label_eurostat(dat)
head(dat)
##     unit   sex            geo time values
## 1 Number Total        Austria 1999   1079
## 2 Number Total        Belgium 1999   1397
## 3 Number Total Czech Republic 1999   1455
## 4 Number Total        Denmark 1999    514
## 5 Number Total         Greece 1999   2116
## 6 Number Total          Spain 1999   5738
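Conceptually, labelling is a dictionary lookup: each code in a column is replaced by its definition. The sketch below shows the idea in base R with a hand-written, hypothetical dictionary named geo_labels. The real function obtains the definitions from Eurostat, so this only illustrates the idea.

```r
# Hypothetical code-to-label dictionary for the geo dimension.
geo_labels <- c(AT = "Austria", BE = "Belgium", CZ = "Czech Republic")

# A small table with codes, using values from the output in this sheet.
dat_codes <- data.frame(geo = c("AT", "BE", "CZ"),
                        values = c(1079, 1397, 1455),
                        stringsAsFactors = FALSE)

# Replace each code by its label; the values column is left untouched.
dat_labels <- dat_codes
dat_labels$geo <- unname(geo_labels[dat_codes$geo])

print(dat_labels)
```

Keeping the coded table and creating the labelled one as a copy makes it easy to filter on stable codes and label only for presentation.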
library("dplyr")
# Keep a single year and compare countries in a bar chart
t2 <- t1 %>% filter(time == "2014-01-01")
ggplot(t2, aes(geo, values, fill = geo)) +
  geom_bar(stat = "identity") + theme_bw() +
  theme(legend.position = "none") +
  labs(title = "Road accidents in 2014", x = "", y = "Victims")
There are three functions for working with geospatial data from GISCO.
get_eurostat_geospatial() returns preprocessed spatial data as sp objects
or as data frames. merge_eurostat_geodata() both downloads the
geospatial data and merges it with preloaded tabular data.
cut_to_classes() is a wrapper for the cut() function and is used to
categorize data for maps with tidy labels.
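Since cut_to_classes() is described as a wrapper for cut(), the underlying idea can be sketched with cut() alone. The break points below are read off the class labels of the fertility map in this sheet. The input values other than 1.39 are made up, and the left-closed interval rule is an assumption rather than documented behaviour of the wrapper.

```r
# Class boundaries taken from the map legend ("0.9 ~< 1.3", ..., "3.1 ~< 4.5").
breaks <- c(0.9, 1.3, 1.5, 1.7, 1.9, 2.3, 3.1, 4.5)

# Build labels of the form "lower ~< upper" for each of the seven classes.
class_labels <- paste(head(breaks, -1), "~<", tail(breaks, -1))

# Hypothetical fertility rates; 1.39 is the value shown for region AT124.
rates <- c(1.39, 1.62, 2.05, 3.40)

# Assumption: intervals are closed on the left, [lower, upper).
classes <- cut(rates, breaks = breaks, labels = class_labels,
               right = FALSE, include.lowest = TRUE)

print(data.frame(rates, classes))
```

Turning a continuous variable into a factor this way lets a discrete palette such as scale_fill_brewer() be used for the map fill.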
Fetch and process data
library("eurostat")
library("dplyr")
fertility <- get_eurostat("demo_r_frate3") %>%
  filter(time == "2014-01-01") %>%
  mutate(cat = cut_to_classes(values, n = 7, decimals = 1))
mapdata <- merge_eurostat_geodata(fertility,
                                  resolution = "20")
head(select(mapdata, geo, values, cat, long, lat, order, id))
##     geo values        cat     long      lat order id
## 1 AT124   1.39 1.3 ~< 1.5 15.54245 48.90770   214 10
## 2 AT124   1.39 1.3 ~< 1.5 15.75363 48.85218   215 10
## 3 AT124   1.39 1.3 ~< 1.5 15.88763 48.78511   216 10
## 4 AT124   1.39 1.3 ~< 1.5 15.81535 48.69270   217 10
## 5 AT124   1.39 1.3 ~< 1.5 15.94094 48.67173   218 10
## 6 AT124   1.39 1.3 ~< 1.5 15.90833 48.59815   219 10
library("ggplot2")
# Draw region polygons filled by fertility class
ggplot(mapdata, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = cat), color = "grey", size = .1) +
  scale_fill_brewer(palette = "RdYlBu") +
  labs(title = "Fertility rate, by NUTS-3 regions, 2014",
       subtitle = "Avg. number of live births per woman",
       fill = "Total fertility rate(%)") + theme_light() +
  coord_map(xlim = c(-12, 44), ylim = c(35, 67))
Draw a cartogram
Objects returned by merge_eurostat_geodata() are ready to be
plotted with the ggplot2 package. The coord_map() function is useful for
setting the projection, while labs() adds annotations to the plot.
[Figure: Road accidents. Line and point plot of Victims (y) by Year (x, 2000 to 2015), one series per geo: ES, FR, PL, PT, UK.]
[Figure: Road accidents in 2014. Bar chart of Victims (y) by geo (x): ES, FR, PL, PT, UK.]
[Figure: Fertility rate, by NUTS-3 regions, 2014. Avg. number of live births per woman. Map with long (x) and lat (y); fill legend "Total fertility rate(%)" with classes 0.9 ~< 1.3, 1.3 ~< 1.5, 1.5 ~< 1.7, 1.7 ~< 1.9, 1.9 ~< 2.3, 2.3 ~< 3.1, 3.1 ~< 4.5.]
CC BY Przemyslaw Biecek
https://creativecommons.org/licenses/by/4.0/
