DML Syntax & Invocation
Nakul Jindal
Spark Technology Center, San Francisco
Goal of These Slides
• Provide you with basic DML syntax
• Link to important resources
• Invocation
Non-Goals
• Comprehensive syntax and API coverage
Resources
• Google “Apache SystemML”
• Documentation - https://apache.github.io/incubator-systemml/
• DML Language Reference - https://apache.github.io/incubator-systemml/dml-language-reference.html
• MLContext - https://apache.github.io/incubator-systemml/spark-mlcontext-programming-guide.html#spark-shell-scala-example
• GitHub - https://github.com/apache/incubator-systemml
Note
• Some documentation is outdated
• If you find a typo or want to update the documentation, consider making a pull request
• All docs are in Markdown format
• https://github.com/apache/incubator-systemml/tree/master/docs
About DML Briefly
• DML = Declarative Machine Learning
• R-like syntax, with some subtle differences from R
• Dynamically typed
• Data Structures
  • Scalars – Boolean, Integer, String, Double Precision
  • Cacheable – Matrices, DataFrames
• Data Structure Terminology in DML
  • Value Type – Boolean, Integer, String, Double Precision
  • Data Type – Scalar, Matrix, DataFrame*
  • You can have a DataType[ValueType]; not all combinations are supported
    • For instance – matrix[double]
• Scoping
  • One global scope, except inside functions
* Coming soon
About DML Briefly
• Control Flow
  • Sequential imperative control flow (like most other languages)
  • Looping –
    • while (<condition>) { … }
    • for (var in <for_predicate>) { … }
    • parfor (var in <for_predicate>) { … } # iterations run in parallel
  • Guards –
    • if (<condition>) { ... } [ else if (<condition>) { ... } ... else { … } ]
• Functions
  • Built-in – list available in the language reference
  • User Defined – (multiple return parameters)
    • functionName = function (<formal_parameters>…) return (<formal_parameters>) { ... }
    • The function body can only access variables defined in the formal_parameters
  • External Function – same as user defined, can call an external Java package
About DML Briefly
• Imports
  • Can import user-defined/external functions from other source files
  • Disambiguation using namespaces
• Command Line Arguments
  • By position – $1, $2 …
  • By name – $X, $Y ...
• Limitations
  • A user-defined function can only be called on the right-hand side of an assignment, as the only expression
  • Cannot write
    • X <- Y + bar()
    • for (i in foo(1,2,3)) { … }
Sample Code
A = 1.0 # A is a double
X <- matrix("4 3 2 5 7 8", rows=3, cols=2) # X = 3x2 matrix; '<-' is assignment
Y = matrix(1, rows=3, cols=2) # Y = 3x2 matrix of all 1s
b <- t(X) %*% Y # %*% is matrix multiply, t(X) is transpose
S = "hello world"
i = 0
while (i < max_iteration) {
  H = (H * (t(W) %*% (V/(W%*%H)))) / t(colSums(W)) # * is element-wise multiplication
  W = (W * ((V/(W%*%H)) %*% t(H))) / t(rowSums(H))
  i = i + 1 # i is an integer
}
print(toString(H)) # toString converts a matrix to a string
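For readers who want to check the matrix lines by hand, here is a minimal plain-Python sketch of the X, Y and b assignments from the sample. It assumes the string initializer fills the matrix row by row, which is DML's documented default and one of the differences from R. The helper names (matrix_from_string, transpose, matmul) are made up for this illustration and are not DML or SystemML functions.

```python
# Plain-Python mirror of X, Y and b from the DML sample (illustrative only).

def matrix_from_string(text, rows, cols):
    """Build a rows x cols matrix (list of rows), filling row by row (assumed DML default)."""
    values = [float(tok) for tok in text.split()]
    if len(values) != rows * cols:
        raise ValueError("expected %d values, got %d" % (rows * cols, len(values)))
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def transpose(m):
    """Equivalent of t(m) for a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Equivalent of a %*% b for list-of-rows matrices."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


X = matrix_from_string("4 3 2 5 7 8", rows=3, cols=2)
Y = [[1.0] * 2 for _ in range(3)]      # matrix(1, rows=3, cols=2)
b = matmul(transpose(X), Y)            # t(X) %*% Y

print(X)  # [[4.0, 3.0], [2.0, 5.0], [7.0, 8.0]]
print(b)  # [[13.0, 13.0], [16.0, 16.0]]
```

Because Y is all ones, each row of b repeats a column sum of X, which makes the result easy to verify by eye.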
Sample Code
source("nn/layers/affine.dml") as affine # import a file into the "affine" namespace
[W, b] = affine::init(D, M) # calls the init function; multiple return values
parfor (i in 1:nrow(X)) { # i iterates over 1 through the number of rows in X, in parallel
  for (j in 1:ncol(X)) { # j iterates over 1 through the number of columns in X
    # Computation ...
  }
}
write(M, fileM, format="text") # M = matrix, fileM = file; also writes to HDFS
X = read(fileX) # fileX = file; also reads from HDFS
if (ncol(A) > 1) {
  # Matrix A is being sliced by a given range of columns
  A[,1:(ncol(A) - 1)] = A[,1:(ncol(A) - 1)] - A[,2:ncol(A)]
}
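The column-slice assignment at the end of this sample is easy to misread if you are used to 0-based, half-open ranges. The sketch below restates it in plain Python, assuming DML ranges are 1-based and inclusive at both ends and that the right-hand side is computed from the old values of A before anything is written back. The function name diff_adjacent_columns is hypothetical.

```python
def diff_adjacent_columns(A):
    """Mirror of A[,1:(n-1)] = A[,1:(n-1)] - A[,2:n] for a list-of-rows matrix.

    Returns a new matrix; the last column is left unchanged, as in the DML.
    """
    n = len(A[0])
    if n <= 1:                      # same guard as `if (ncol(A) > 1)`
        return [row[:] for row in A]
    result = []
    for row in A:
        left = row[0:n - 1]         # DML columns 1..n-1  -> Python indices 0..n-2
        right = row[1:n]            # DML columns 2..n    -> Python indices 1..n-1
        result.append([l - r for l, r in zip(left, right)] + [row[n - 1]])
    return result


A = [[10, 7, 3],
     [5, 5, 1]]
print(diff_adjacent_columns(A))  # [[3, 4, 3], [0, 4, 1]]
```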
Sample Code
interpSpline = function(
    double x, matrix[double] X, matrix[double] Y, matrix[double] K) return (double q) {
  i = as.integer(nrow(X) - sum(ppred(X, x, ">=")) + 1)
  # misc computation …
  q = as.scalar(qm)
}
eigen = externalFunction(Matrix[Double] A)
  return (Matrix[Double] eval, Matrix[Double] evec)
  implemented in (classname="org.apache.sysml.udf.lib.EigenWrapper", exectype="mem")
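The index computation inside interpSpline packs a search into one vectorized expression: count how many entries of X are at least x, then subtract from the row count. A plain-Python sketch of that single line follows. It assumes X is a column vector of knots sorted in ascending order (the slide does not say so, but the formula only yields "the first knot at or above x" under that assumption). The function name is made up for illustration.

```python
def first_index_at_or_above(knots, x):
    """1-based index i = n - count(knots >= x) + 1, mirroring
    nrow(X) - sum(ppred(X, x, ">=")) + 1 for an ascending column vector."""
    n = len(knots)
    at_or_above = sum(1 for k in knots if k >= x)   # sum(ppred(X, x, ">="))
    return n - at_or_above + 1


knots = [1.0, 2.0, 3.0, 4.0]
print(first_index_at_or_above(knots, 2.5))  # 3  (knot 3.0 is the first one >= 2.5)
print(first_index_at_or_above(knots, 1.0))  # 1
print(first_index_at_or_above(knots, 5.0))  # 5  (one past the last knot)
```

Note the last case: when x is larger than every knot, the result is nrow(X) + 1, so the elided part of the DML function would need to handle that index before using it.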
Sample Code (From LinearRegDS.dml*)
A = t(X) %*% X
b = t(X) %*% y
if (intercept_status == 2) {
  A = t(diag(scale_X) %*% A + shift_X %*% A[m_ext, ])
  A = diag(scale_X) %*% A + shift_X %*% A[m_ext, ]
  b = diag(scale_X) %*% b + shift_X %*% b[m_ext, ]
}
A = A + diag(lambda)
print("Calling the Direct Solver...")
beta_unscaled = solve(A, b)
* https://github.com/apache/incubator-systemml/blob/master/scripts/algorithms/LinearRegDS.dml#L133
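Stripped of the intercept rescaling branch, the excerpt above builds the regularized normal equations and hands them to a direct solver. The following is a small plain-Python sketch of that core, not the LinearRegDS implementation: it assumes a single scalar regularization value lam (the script adds diag(lambda) for a vector), skips the intercept_status == 2 branch entirely, and uses a hand-written Gaussian elimination in place of DML's solve. Both function names are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_direct_solve(X, y, lam):
    """beta = solve(t(X) %*% X + lam * I, t(X) %*% y), with scalar lam (assumption)."""
    cols = list(zip(*X))
    A = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]  # t(X) %*% X
    b = [sum(p * q for p, q in zip(ci, y)) for ci in cols]                    # t(X) %*% y
    for i in range(len(A)):
        A[i][i] += lam                                                        # A + diag(lambda)
    return solve_linear_system(A, b)


# y = 1 + 2 * x, with a leading column of ones acting as the intercept.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [3.0, 5.0, 7.0]
beta = ridge_direct_solve(X, y, lam=0.0)
print([round(v, 6) for v in beta])  # [1.0, 2.0]
```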
MLContext API
• You can invoke SystemML from
  • the command line, or
  • a Spark program
• The MLContext API lets you invoke it from a Spark program
• Command line invocation is described later
• Available as a Scala API and a Python API
• These slides only cover the Scala API
MLContext API – Example Usage
val ml = new MLContext(sc)
val X_train = sc.textFile("amazon0601.txt")
  .filter(!_.startsWith("#"))
  .map(_.split("\t") match { case Array(prod1, prod2) => (prod1.toInt, prod2.toInt, 1.0) })
  .toDF("prod_i", "prod_j", "x_ij")
  .filter("prod_i < 5000 AND prod_j < 5000") // Change to a smaller number
  .cache()
MLContext API – Example Usage
val pnmf =
"""
# data & args
X = read($X)
rank = as.integer($rank)
# Computation ....
write(negloglik, $negloglikout)
write(W, $Wout)
write(H, $Hout)
"""
ml.registerInput("X", X_train)
ml.registerOutput("W")
ml.registerOutput("H")
ml.registerOutput("negloglik")
val outputs = ml.executeScript(pnmf, Map("maxiter" -> "100", "rank" -> "10"))
val negloglik = getScalarDouble(outputs, "negloglik")
Invocation – How to run a DML file
• SystemML can run on
  • Your laptop (Standalone)
  • Spark
  • Hybrid Spark – using the better choice between the driver and the cluster
  • Hadoop
  • Hybrid Hadoop
• For this presentation, we care about standalone, spark & hybrid_spark
• Documentation has detailed instructions on the others
Invocation – How to run a DML file
Standalone
In the systemml directory:
bin/systemml <dml-filename> [arguments]
Example invocations:
bin/systemml LinearRegCG.dml -nvargs X=X.mtx Y=Y.mtx B=B.mtx # named arguments
bin/systemml oddsRatio.dml -args X.mtx 50 B.mtx # positional arguments
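The two argument styles correspond to the two placeholder styles a script can read: named arguments become $X, $Y and so on, and positional arguments become $1, $2 and so on. The sketch below shows that mapping in plain Python. It is a hypothetical helper for illustration only, not SystemML's own argument parser, and it assumes the flags are spelled with a single ASCII hyphen.

```python
def bind_script_arguments(argv):
    """Map a SystemML-style argument list to the $-placeholders a DML script would read.

    Hypothetical helper for illustration; not SystemML's parser.
    """
    if not argv:
        return {}
    flag, values = argv[0], argv[1:]
    if flag == "-nvargs":
        bindings = {}
        for item in values:
            name, sep, value = item.partition("=")
            if not sep or not name:
                raise ValueError("named argument must look like NAME=VALUE: %r" % item)
            bindings["$" + name] = value          # X=X.mtx  ->  $X
        return bindings
    if flag == "-args":
        # The first positional value is $1, the second $2, and so on.
        return {"$%d" % (i + 1): value for i, value in enumerate(values)}
    raise ValueError("expected -nvargs or -args, got %r" % flag)


print(bind_script_arguments(["-nvargs", "X=X.mtx", "Y=Y.mtx", "B=B.mtx"]))
# {'$X': 'X.mtx', '$Y': 'Y.mtx', '$B': 'B.mtx'}
print(bind_script_arguments(["-args", "X.mtx", "50", "B.mtx"]))
# {'$1': 'X.mtx', '$2': '50', '$3': 'B.mtx'}
```

Named arguments tend to be easier to maintain, because the script keeps working when the order of arguments on the command line changes.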
Invocation – How to run a DML file
Spark / Hybrid Spark
Define SPARK_HOME to point to your Apache Spark installation
Define SYSTEMML_HOME to point to your Apache SystemML installation
In the systemml directory:
scripts/sparkDML.sh <dml-filename> [systemml arguments]
Example invocations:
scripts/sparkDML.sh LinearRegCG.dml -nvargs X=X.mtx Y=Y.mtx B=B.mtx # named arguments
scripts/sparkDML.sh oddsRatio.dml -args X.mtx 50 B.mtx # positional arguments
Invocation – How to run a DML file
Spark / Hybrid Spark
Define SPARK_HOME to point to your Apache Spark installation
Define SYSTEMML_HOME to point to your Apache SystemML installation
Using the spark-submit script:
$SPARK_HOME/bin/spark-submit \
  --master <master-url> \
  --class org.apache.sysml.api.DMLScript \
  ${SYSTEMML_HOME}/SystemML.jar -f <dml-filename> <systemml arguments> -exec {hybrid_spark,spark}
Example invocation:
$SPARK_HOME/bin/spark-submit \
  --master local[*] \
  --class org.apache.sysml.api.DMLScript \
  ${SYSTEMML_HOME}/SystemML.jar -f LinearRegCG.dml -nvargs X=X.mtx Y=Y.mtx B=B.mtx
Editor Support
• Very rudimentary editor support
• Bit of shameless self-promotion:
  • Atom – hackable text editor
    • Install package - https://atom.io/packages/language-dml
      • From GUI - http://flight-manual.atom.io/using-atom/sections/atom-packages/
      • Or from command line – apm install language-dml
    • Rudimentary snippet-based completion of built-in functions
  • Vim
    • Install package - https://github.com/nakul02/vim-dml
    • Works with Vundle (Vim package manager)
• There is an experimental Zeppelin Notebook integration with DML –
  • https://issues.apache.org/jira/browse/SYSTEMML-542
  • Available as a Docker image to play with - https://hub.docker.com/r/nakul02/incubator-zeppelin/
• Please send feedback when using these: feature requests, bugs
  • I’ll work on them when I can
Other Information
• All scripts are in - https://github.com/apache/incubator-systemml/tree/master/scripts
  • Algorithm Scripts - https://github.com/apache/incubator-systemml/tree/master/scripts/algorithms
  • Test Scripts - https://github.com/apache/incubator-systemml/tree/master/src/test/scripts
• Look inside the test folder for programs that run the tests, and play around with some of them - https://github.com/apache/incubator-systemml/tree/master/src/test/java/org/apache/sysml/test
Thanks!
• The documentation might be outdated and have typos
  • Please submit fixes
• If a language feature does not make sense or is missing, ask a SystemML team member
• Have fun!
BACKUP SLIDES
Editor Support
• There was an attempt at an Eclipse plugin late last year -
  • https://www.mail-archive.com/dev%40systemml.incubator.apache.org/msg00147.html
  • The project is largely dead
