Data Science with R for Java Developers
@Sander_Mak
Agenda

Data Science
The R language
Gimme some Java!
90% of the world’s data was
produced in the last 2 years
- SINTEF/ScienceDaily, June 2013

We need more than
just CRUD
Stand back.

I know Data Science!
[Venn diagram of three circles: Hacking Skills, Math & Statistics, Domain Expertise. Overlaps: Machine Learning, Operations Research, "Danger! Perl ahead!". Center: Data Science.]
Data Science:
Achievement Unlocked

Today

R, RStudio
Language
Designers?

Statisticians?

The best thing about R is that it was developed by statisticians. The
worst thing about R is that... it was developed by statisticians.
- Bo Cowgill, Google
Why R, then?
De-facto standard (in statistical research)
Open Source
Interactive data exploration

“It’s a DSL posing as a general-purpose language”
Why not R, then?
Slow
Memory Bound
Try googling for R...
(Did I mention it’s a quirky language?)
‘If you are using R and you think
you’re in hell, this is a map for you.’
- The R Inferno
Apparently, statisticians aren’t designers, either...
R vs Java
R: Functional/OO/Procedural, Dynamic (eval), Interpreted
Java: OO, Static types, Compiled
R: numeric, character, Factor
Java: Integer/Double/..., String, Enum
vector
list
dataframe
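To make the type comparison concrete for Java readers, here is a rough sketch of how the R notions might be modelled in plain Java: an enum standing in for a factor, arrays for vectors, and a column-oriented class for a data frame. The names `Sex` and `Passengers` are hypothetical, and a real data frame offers far more than this.

```java
import java.util.Arrays;

final class TypeMappingSketch {
    /** Hypothetical enum standing in for an R factor with two levels. */
    enum Sex { FEMALE, MALE }

    /** Hypothetical column-oriented stand-in for a data frame: one typed array per column. */
    static final class Passengers {
        final String[] name; // R: character vector
        final double[] age;  // R: numeric vector
        final Sex[] sex;     // R: factor

        Passengers(String[] name, double[] age, Sex[] sex) {
            // As in a data frame, all columns must have the same number of rows.
            if (name.length != age.length || age.length != sex.length) {
                throw new IllegalArgumentException("columns must have equal length");
            }
            this.name = name;
            this.age = age;
            this.sex = sex;
        }

        int rows() {
            return name.length;
        }

        /** Mean of the age column, comparable to mean(df$age) in R. */
        double meanAge() {
            return Arrays.stream(age).average().orElse(Double.NaN);
        }
    }

    public static void main(String[] args) {
        Passengers p = new Passengers(
                new String[] {"Ann", "Bob"},
                new double[] {30.0, 40.0},
                new Sex[] {Sex.FEMALE, Sex.MALE});
        System.out.println(p.rows() + " rows, mean age " + p.meanAge()); // 2 rows, mean age 35.0
    }
}
```

Storing one array per column rather than one object per row mirrors how R treats a data frame as a list of equal-length vectors.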
R: 1-based indexing (1, 2, 3, 4)
Java: 0-based indexing (0, 1, 2, 3)
higher-order functions
sapply(vec, function(elm) {
elm + 1;
})

for-loops
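For comparison with the `sapply` call, here is a minimal Java sketch of the same "add one to every element" operation, first as an index-based for-loop and then with a stream (assuming Java 8 or later). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class SapplySketch {
    /** The for-loop version: indices run from 0 to length - 1. */
    static int[] addOneLoop(int[] vec) {
        int[] out = new int[vec.length];
        for (int i = 0; i < vec.length; i++) {
            out[i] = vec[i] + 1;
        }
        return out;
    }

    /** The higher-order version: map a function over every element, as sapply does. */
    static int[] addOneStream(int[] vec) {
        return IntStream.of(vec).map(elm -> elm + 1).toArray();
    }

    public static void main(String[] args) {
        int[] vec = {1, 2, 3, 4};
        System.out.println(Arrays.toString(addOneLoop(vec)));   // [2, 3, 4, 5]
        System.out.println(Arrays.toString(addOneStream(vec))); // [2, 3, 4, 5]
    }
}
```

The stream version has no index variable at all, which is also why the 1-based versus 0-based difference rarely matters in idiomatic R code.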
RStudio

Comprehensive R Archive Network (CRAN)

Central
Coding time!
Titanic Competition:
Machine Learning from Disaster
Survived?
Decision Tree
[Figure: decision tree with the tests Sex == Female, Age > 16, Age > 50 and Fare > 100, ending in T/F leaves]
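A decision tree is, in effect, a nest of if-statements over the passenger's features. The sketch below hand-codes one such tree in Java using the four tests from the slide. The branch layout and leaf labels are assumptions made for illustration; they are not the tree from the talk and were not learned from the Titanic data.

```java
final class TitanicTreeSketch {
    /**
     * Hypothetical hand-written tree. In practice the splits and leaves
     * are learned from training data rather than typed in.
     */
    static boolean survived(boolean female, double age, double fare) {
        if (female) {
            if (age > 50) {
                // Assumed leaf: older women survive only in the expensive cabins.
                return fare > 100;
            }
            return true;
        }
        // Assumed leaves for men: only the young survive.
        if (age > 16) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(survived(true, 30.0, 10.0));   // true
        System.out.println(survived(false, 40.0, 200.0)); // false
    }
}
```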
Random Forest
[Figure: the single decision tree shown beside a random forest made up of several trees, each with its own T/F leaves]
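A random forest combines many trees by letting them vote. The Java sketch below shows only that voting step, with three toy one-test "trees" that are purely hypothetical. A real random forest also trains each tree on a random sample of the rows and features, which is left out here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ForestVoteSketch {
    /** Hypothetical passenger with only the features used on the slides. */
    static final class Passenger {
        final boolean female;
        final double age;
        final double fare;

        Passenger(boolean female, double age, double fare) {
            this.female = female;
            this.age = age;
            this.fare = fare;
        }
    }

    /** Three toy trees, each reduced to a single test (assumed, not learned). */
    static List<Predicate<Passenger>> toyForest() {
        return Arrays.<Predicate<Passenger>>asList(
                p -> p.female,
                p -> p.age <= 16,
                p -> p.fare > 100);
    }

    /** Majority vote: true when more than half of the trees predict survival. */
    static boolean predict(List<Predicate<Passenger>> trees, Passenger p) {
        long yes = trees.stream().filter(tree -> tree.test(p)).count();
        return 2 * yes > trees.size();
    }

    public static void main(String[] args) {
        List<Predicate<Passenger>> forest = toyForest();
        System.out.println(predict(forest, new Passenger(true, 10.0, 10.0)));  // true (2 of 3 votes)
        System.out.println(predict(forest, new Passenger(false, 40.0, 20.0))); // false (0 of 3 votes)
    }
}
```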
Demo time!
Bridging R and Java

Integrate
Assimilate
Replace
Integrate

rJava & Java/R Interface (JRI)
Two-way native interface
- JNI: libjri
- or TCP to Rserve
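import org.rosuda.JRI.REXP;
import org.rosuda.JRI.RVector;
import org.rosuda.JRI.Rengine;

// start an embedded R engine: no R arguments, no main loop, no callbacks
Rengine re = new Rengine(new String[] {}, false, null);
// wait until engine is ready
if (!re.waitForR()) {
    throw new IllegalStateException("Can't load R engine");
}
// load the built-in cars dataset; false skips converting the result to Java
re.eval("data(cars)", false);
REXP cars = re.eval("cars");
RVector carsVector = cars.asVector();
// dissect carsVector...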
Assimilate

Reimplementation of R on JVM

Fast & lean
Parallelized
Just-another-lib
... not production ready yet...

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// create a script engine manager
ScriptEngineManager factory = new ScriptEngineManager();
// create an R engine
ScriptEngine engine = factory.getEngineByName("Renjin");
// load package from classpath
engine.eval("library(survey)");
// evaluate R code from String
engine.eval("print('Hello from R')");
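One practical detail when using the `javax.script` route: `getEngineByName` returns `null` when no engine is registered under the given name, for instance when the Renjin jar is missing from the classpath. A small guard, sketched below with a hypothetical helper called `findEngine`, turns that into an explicit check instead of a later `NullPointerException`.

```java
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

final class EngineLookupSketch {
    /** Hypothetical helper: wraps the nullable lookup result in an Optional. */
    static Optional<ScriptEngine> findEngine(String name) {
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName(name));
    }

    public static void main(String[] args) throws ScriptException {
        Optional<ScriptEngine> renjin = findEngine("Renjin");
        if (renjin.isPresent()) {
            // Only reached when a Renjin engine is actually on the classpath.
            renjin.get().eval("print('Hello from R')");
        } else {
            System.err.println("No script engine named Renjin found on the classpath");
        }
    }
}
```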
Replace

Big Data?
JVM libraries/platforms
Scalable R distributions (non-JVM):
Revolution Analytics
Oracle R Enterprise
Wrap-up
Sanitize
Explore

Model

Predict
Scale
Next steps

Install R

Read

Computing for Data Analysis
starts Jan. 6th 2014
Questions?
@Sander_Mak
branchandbound.net

Data Science with R for Java Developers