@ctjava#r+java
Combining R with Java
Ryan Cuprak
Elsa Cuprak
@ctjava
cuprak.info
Agenda
R Overview
R + Java
R + Java EE
What is R?
• Free open-source alternative to Matlab, SAS, Excel, and SPSS
• R is:
• Statistical software
• Language
• Environment
• Ecosystem
• Used by Google, Facebook, Bank of America, etc.
• 2 million users worldwide
• Download URL: http://www.r-project.org
What is R?
• R Foundation responsible for R.
• Sponsored/supported by industry.
• Licensed under GPL.
• Implementation of the S programming language
• Name derived from the first names of R's authors (Ross Ihaka and Robert Gentleman).
• First implementation ~1997
• Written in C, Fortran, and R
CRAN
• Power of R is packages!
• CRAN = Comprehensive R Archive Network
• Analogous to (Maven) Central
• 6745 packages available
• Database access
• Data manipulation
• Visualization
• Data modeling
• Reports
• Geospatial data analysis
• Time series/financial data
CRAN Popular Packages
• ggplot2 – package for creating graphs
• rgl – interactive 3D visualizations
• caret – training regression and classification models
• survival – tools for survival analysis
• mgcv – generalized additive models
• maps – polygons for plots
• ggmap – Google Maps
• xts – manipulates time series data
• quantmod – downloads financial data, plotting, charting
• tidyr – changes layout of datasets
Uses of R
Calculating Credit Risk
Reporting
Data Analysis
Data Visualization
Data Exploration
Clinical Research
Flood Forecasting
Server Failure Modeling
Why not Java?
• Java isn’t “convenient”
• Lacks specialized data structures
• Limited graphing capabilities
• Few statistical libraries available
• Statisticians don’t use Java
• No interactive tools for data exploration
• No built-in support for data import/cleanup
• Re-inventing the wheel is expensive…
R is a DSL + Stat Library
Leveraging R from Java
• Two approaches to integration:
• rJava – call Java from R
• JRI – call R from Java
• rJava includes JRI.
• Installed from CRAN: install.packages("rJava")
• Documentation & code:
• http://www.rforge.net/rJava/
• https://github.com/s-u/rJava
• R & Java worlds bridged via JNI
Getting Started with R
• Download and install:
• R:
http://www.r-project.org
• RStudio:
http://www.rstudio.com
Basics of R
• Interpreted language
• Functional
• Dynamic typing
• Lexical scoping
• R scripts stored in “.R” files
• Run R commands interactively in R or RStudio, or run scripts with Rscript.
• Language features:
• Object-oriented
• Exceptions
• Debugging
R Data Types
• Scalar
• Numeric
• Decimal
• Integer
• Character
• Logical – TRUE or FALSE
• Vectors – a sequence of numbers or characters, or higher-dimensional
arrays like matrices
• Factors – sequence assigning a category to each index
• Lists – collection of objects
• Data frames – table-like structure
NULL & NA
• NULL – indicates an object is absent
• NA – missing values (Not Available)
Language Basics
• # Comments
• Assignment uses “<-”, but “=” can also be used
• Variable name rules:
• Letters, numbers, dot (.), underscore (_)
• Must start with a letter, or with a dot that is not followed by a number
• Valid
.test
test
test.today
• Invalid
.2test
_test
_2test
Vectors
• Defining and assigning a vector:
> x <- c(10,20,30,40,50,60)
• Multiplying a vector:
> x * 3
[1]  30  60  90 120 150 180
• Applying a function to a vector:
> sqrt(x)
[1] 3.162278 4.472136 5.477226 6.324555 7.071068…
• Access individual elements:
> x[1]
[1] 10
• Appending data to a vector:
> x <- c(x,70)
> x
[1] 10 20 30 40 50 60 70
Data Frames
• Set up the data for the frame (times() comes from the chron package):
library(chron)
boats <- c("Bayou Blue", "Pachyderm", "Spectre" , "Flatline")
model <- c("J30" , "Frers 33", "J-125" , "Evelyn 32-2")
phrf <- c(135, 108 , -6, 99)
finish <- times(c( "19:53:06" , "19:42:18" , "19:38:11" , "19:45:48" ))
kts <- c(4.09 , 4.66 , 4.92 , 4.46)
• Construct the data frame:
raceDF <- data.frame(boats,model,phrf,finish,kts)
Data Frames
> summary(raceDF)
        boats          model         phrf             finish             kts
 Bayou Blue:1   Evelyn 32-2:1   Min.   : -6.00   Min.   :19:38:11   Min.   :4.090
 Flatline  :1   Frers 33   :1   1st Qu.: 72.75   1st Qu.:19:41:16   1st Qu.:4.367
 Pachyderm :1   J-125      :1   Median :103.50   Median :19:44:03   Median :4.560
 Spectre   :1   J30        :1   Mean   : 84.00   Mean   :19:44:51   Mean   :4.532
                                3rd Qu.:114.75   3rd Qu.:19:47:37   3rd Qu.:4.725
                                Max.   :135.00   Max.   :19:53:06   Max.   :4.920
Lists
• Generic Vector containing other objects
• Example:
wkDays <- c("Monday","Tuesday","Wednesday","Thursday","Friday")
dts <- c(15,16,17,18,19)
devoxx <- c(FALSE,FALSE,TRUE,TRUE,TRUE)
weekSch <- list(wkDays,dts,devoxx)
Lists
• Member slicing:
> weekSch[1]
[[1]]
[1] "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
• Member referencing:
> weekSch[[1]]
[1] "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
• Labeling entries:
> names(weekSch) <- c("Days","Dates","Devoxx Events")
Matrices
• Defining a matrix:
myMatrix <- matrix(1:10 , nrow = 2)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
• Printing out dimensions:
> dim(myMatrix)
[1] 2 5
• Adding matrices (element-wise):
> myMatrix + myMatrix
[,1] [,2] [,3] [,4] [,5]
[1,] 2 6 10 14 18
[2,] 4 8 12 16 20
Factors
• Vector whose elements can take on one of a specific set of values.
• Used in statistical modeling to assign the correct number of degrees of
freedom.
> factor(x=c("High School","College","Masters","Doctorate"),
levels=c("High School","College","Masters","Doctorate"),
ordered=TRUE)
[1] High School College Masters Doctorate
Levels: High School < College < Masters < Doctorate
Defining Functions
• Created using function() directive.
• Stored as objects of class function.
F <- function(<arguments>) {
# do something
}
• Functions can be passed as arguments.
• Functions can be nested in other functions.
• Return value is the last expression to be evaluated.
• Functions can take an arbitrary number of arguments.
• Example:
double.num <- function(x) {
x * 2
}
Built-in Datasets
data()
Review: Linear Regression
Linear regression model: a type of regression model in which the response is a continuous variable and is linearly related to the predictor variable(s).
Review: Linear Regression
What can a linear regression do?
• Find linear relationship between height and weight.
• Predict a person's weight based on their height.
Example:
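Given the observations, weight (Y) and height (X), the parameters in the model can be estimated:
$Y = \beta_0 + \beta_1 X + \varepsilon$
where Y is the response, $\beta_0$ the intercept, $\beta_1$ the coefficient, X the predictor, and $\varepsilon$ the error.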
Assumptions of the linear regression model:
1) the errors have constant variance
2) the errors have zero mean
3) the errors come from the same normal distribution
Review: Linear Regression
Set up the data…
Review: Linear Regression
Perform the linear regression…
Review: Linear Regression
Plot the results…
Considerations
1. Do you want to re-implement that logic in Java?
2. How would you test your implementation?
3. What would be the ramifications of incorrect calculations?
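To make the first consideration concrete, here is a minimal sketch of what re-implementing only the simplest case, a one-predictor least-squares fit, might look like in plain Java. The class and method names (SimpleRegression, fit) and the sample data are made up for illustration and are not from the talk. The sketch produces none of the diagnostics (standard errors, residuals, significance tests) that R's lm() reports.

```java
/** Hypothetical helper: ordinary least squares for y = b0 + b1 * x. */
final class SimpleRegression {

    /** Returns {intercept, slope} estimated from paired observations. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = x.length;

        // Sample means of predictor and response.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-products and squares around the means.
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }

        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        // Made-up data lying exactly on the line y = 1 + 2x.
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};
        double[] coef = fit(x, y);
        System.out.println("intercept=" + coef[0] + " slope=" + coef[1]);
    }
}
```

Even this small version needs its own input validation and tests, which is the point of considerations 2 and 3.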
R + Java = rJava
• JRI (bundled with rJava) provides a Java API to R.
• rJava itself lets R code call into Java.
• Runs R inside of the JVM process via JNI.
• Single-threaded – R can be accessed ONLY by one thread!
• Native library can be loaded only ONCE.
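One way to respect the single-thread constraint is to funnel every R call through a single-thread executor. The sketch below uses a hypothetical Evaluator interface in place of the real engine so that it runs without R installed. The names RGateway and Evaluator are assumptions for illustration and are not part of rJava or JRI.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical wrapper that confines all engine calls to one thread. */
final class RGateway implements AutoCloseable {

    /** Stand-in for the real R engine; not an rJava/JRI type. */
    interface Evaluator {
        String eval(String expression);
    }

    private final ExecutorService rThread = Executors.newSingleThreadExecutor();
    private final Evaluator engine;

    RGateway(Evaluator engine) {
        this.engine = engine;
    }

    /** Callable from any thread; the evaluation itself always runs on the one R thread. */
    String eval(String expression) {
        Callable<String> task = () -> engine.eval(expression);
        try {
            return rThread.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for R", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("R evaluation failed", e.getCause());
        }
    }

    @Override
    public void close() {
        rThread.shutdown();
    }

    public static void main(String[] args) {
        // Fake engine that reports which thread evaluated the expression.
        try (RGateway gateway = new RGateway(
                expr -> Thread.currentThread().getName() + " evaluated " + expr)) {
            System.out.println(gateway.eval("mean(x)"));
            System.out.println(gateway.eval("sd(x)"));
        }
    }
}
```

Callers block until their expression has been evaluated, so requests are serialized rather than run in parallel.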
rJava and Maven
<dependency>
<groupId>org.nuiton.thirdparty</groupId>
<artifactId>JRI</artifactId>
<version>0.9-6</version>
</dependency>
Configuring Project (non-Maven/SE)
• Use R.home() to locate the installation directory.
• rJava is installed under library/rJava, the folder containing the JNI library.
Runtime Parameters
-DR_HOME
-Djava.library.path
-Denv.R_HOME
Starting R
• Interact with R via Rengine.
• Initialize Rengine with instance of RMainLoopCallbacks.
Simple rJava Example
Advanced rJava Example
R Scripts
Wait – I have to embed all of my R code in Java??
Java EE + R
JSR 352 - Batching
Java EE Container Integration
• Add the following libraries to the container lib:
(glassfish4/glassfish/domains/<domain>/lib)
• JRI.jar
• JRIEngine.jar
• libjri.jnilib (native code!)
• REngine.jar
Do NOT include rJava dependencies in your WAR/EAR!
JSR 352 Basic Concepts
Job Operator
Job
Step
Job Repository
ItemReader
ItemProcessor
ItemWriter
Batchlet
JSR 352 Basic Concepts
• Job – encapsulates the entire batch process.
• JobInstance – actual execution of a job.
• JobParameters – parameters passed to a job.
• Step – encapsulates an independent, sequential phase of a batch job.
• Batch checkpoints:
• Bookmarking of progress so that a job can be restarted.
• Important for long running jobs
JSR 352 Basic Concepts
• Step Models:
• Chunk – composed of a Reader, Processor, and Writer
• Batchlet – task oriented step (file transfer etc.)
• Partitioning – mechanism for running steps in parallel
• Listeners – provide life-cycle hooks
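The chunk model can be pictured as a read-process-write loop that hands items to the writer one chunk at a time. The sketch below is plain Java written for illustration and deliberately avoids the real javax.batch interfaces. ChunkLoop, its run method, and the chunk size of 2 are all assumptions, and checkpointing, transactions, and listeners are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical, simplified picture of chunk-oriented processing (not the JSR 352 API). */
final class ChunkLoop {

    /**
     * Reads items one at a time, transforms each, and passes the results
     * to the writer in lists of at most chunkSize items.
     */
    static <I, O> void run(Iterator<I> reader,
                           Function<I, O> processor,
                           Consumer<List<O>> writer,
                           int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(chunk);
                // A real batch runtime would typically checkpoint at this boundary.
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(chunk); // final, possibly short, chunk
        }
    }

    public static void main(String[] args) {
        // Made-up raw lines standing in for downloaded race results.
        List<String> rawLines = Arrays.asList("  runner A ", " runner B", "runner C  ");
        ChunkLoop.<String, String>run(
                rawLines.iterator(),
                String::trim,
                chunk -> System.out.println("writing " + chunk),
                2);
    }
}
```

In the 5k racing job, the reader, processor, and writer roles would be played by RaceResultsReader, RaceResultsProcessor, and RaceResultsWriter.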
Initializing R in Singleton Bean
Example: Road Race Statistics
Example Batch Job: 5k Racing
Process overview
• ResultRetrieverBatchlet – Downloads raw data from the website.
• RaceResultsReader – Extracts individual runners from the raw data.
• RaceResultsProcessor – Parses a runner’s results.
• RaceResultsWriter – Writes the statistics to the database.
• RaceAnalysisBatchlet – Uses R to analyze race results.
Notes:
• JAX-RS used to retrieve the results from the website.
• JPA to persist the results.
• R script extracts the results from PostgreSQL (not passed in)
Challenges
• R can be a memory hog!
• A crash takes down R + Java + the container!
• Solution: run R scripts externally, in a separate process
• Note: plotting requires X (an X11 display)!
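Running R externally can be as simple as launching Rscript in its own process, so that a crash or runaway allocation in R cannot take the JVM down with it. The sketch below assumes Rscript is on the PATH. The class name RscriptRunner, the script name analysis.R, and the argument races.csv are placeholders rather than anything from the talk.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical launcher that runs an R script in a separate OS process. */
final class RscriptRunner {

    /** Builds the command line: Rscript --vanilla <script> [args...]. */
    static List<String> command(String script, String... scriptArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("Rscript");   // assumed to be on the PATH
        cmd.add("--vanilla"); // do not load saved workspaces or profile files
        cmd.add(script);
        cmd.addAll(Arrays.asList(scriptArgs));
        return cmd;
    }

    /** Runs the script, forwards its output to this process, and returns the exit code. */
    static int run(String script, String... scriptArgs) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(script, scriptArgs))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            // "analysis.R" and "races.csv" are placeholder names.
            int exit = run("analysis.R", "races.csv");
            System.out.println("Rscript exited with " + exit);
        } catch (IOException e) {
            System.err.println("Could not start Rscript: " + e.getMessage());
        }
    }
}
```

The trade-off is that data has to cross the process boundary through files, a database, or standard streams instead of in-memory calls.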
Questions
rcuprak@gmail.com (Java)
actuary.elsa@gmail.com (Stats)
@ctjava

Combining R With Java For Data Analysis (Devoxx UK 2015 Session)