FAO – Global Soil Partnership
Training on Digital Soil Organic Carbon Mapping
20-24 January 2018, Tehran, Iran
Yusuf YIGINI, PhD - FAO, Land and Water Division (CBL)
Guillermo Federico Olmedo, PhD - FAO, Land and Water Division (CBL)
SVM
Support Vector Machines
SVMs apply a simple linear method to the data, but in a high-dimensional
feature space that is non-linearly related to the input space. The
algorithm constructs a hyperplane through this n-dimensional feature
space and separates the data using a kernel function and its parameters
(e.g. gamma and cost), chosen so that the hyperplane divides the data
with the largest possible margin to the closest points. The points that
define this margin are the support vectors, and linear models are then
fitted to them.
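A minimal sketch of this in R (illustrative, not part of the original slides), using the e1071 package that the later slides rely on; dat2 is assumed to hold the target OCSKGM and the selected covariates, as built further below, and the gamma and cost values are placeholders rather than tuned choices:

library(e1071)   # provides svm()

# Support vector regression with a radial (RBF) kernel:
# gamma controls the kernel width, cost penalises points
# falling outside the margin (illustrative values)
svm_fit <- svm(OCSKGM ~ ., data = dat2,
               kernel = "radial", gamma = 0.1, cost = 1)

summary(svm_fit)        # kernel, parameters, number of support vectors
head(fitted(svm_fit))   # fitted values on the training data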
Support Vector Machines
A major benefit of SVR is that it is a non-parametric technique. Unlike
simple linear regression (SLR), whose results depend on the Gauss-Markov
assumptions, the output model from SVR does not depend on the
distributions of the underlying dependent and independent variables;
instead, the SVR technique depends on kernel functions.
https://www.kdnuggets.com/2017/03/building-regression-models-support-vector-regression.html
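As an illustration (not from the original slides), the kernel in e1071's svm() is just an argument, so the same response and covariates can be reused while only the kernel function changes:

# Same formula and data, different kernel functions
m_lin <- svm(OCSKGM ~ ., data = dat2, kernel = "linear")
m_pol <- svm(OCSKGM ~ ., data = dat2, kernel = "polynomial", degree = 3)
m_rbf <- svm(OCSKGM ~ ., data = dat2, kernel = "radial")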
Support Vector Machines
Another advantage of SVR is that it permits the construction of a
non-linear model without changing the explanatory variables, which helps
in interpreting the resulting model. The basic idea behind SVR is to
ignore prediction errors as long as they are smaller than a certain
value (ϵ).
https://www.kdnuggets.com/2017/03/building-regression-models-support-vector-regression.html
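The ε-insensitive idea can be written as a small helper function (an illustrative sketch, not part of the original slides): residuals inside the ±ε tube contribute nothing to the loss, and only larger errors are penalised.

# epsilon-insensitive loss: zero inside the +/- epsilon tube,
# linear penalty outside it
eps_loss <- function(residual, epsilon = 0.1) {
  pmax(abs(residual) - epsilon, 0)
}
eps_loss(c(-0.05, 0.08, 0.30), epsilon = 0.1)   # 0.00 0.00 0.20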
Support Vector Machines
https://www.kdnuggets.com/2017/03/building-regression-models-support-vector-regression.html
Support Vector Machines
Using R for Digital Soil Mapping – Malone, Minasny & McBratney, 2017
dat <- read.csv("MKD_RegMatrix1.csv")
names(dat)
 [1] "X.1"      "id"       "Y"        "X"        "SOC"      "BLD"      "CRFVOL"   "OCSKGM"   "meaERROR" "B04CHE3"
[11] "B07CHE3"  "B13CHE3"  "B14CHE3"  "DEMENV5"  "LCEE10"   "PRSCHE3"  "SLPMRG5"  "TMDMOD3"  "TMNMOD3"  "TWIMRG5"
[21] "VBFMRG5"  "VDPMRG5"  "soilmap"
Import data > MKD_RegMatrix.csv
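The code on this and the following slides assumes a few packages are already attached; a minimal setup (assumed here, not shown on the original slides) would be:

library(e1071)    # svm(), tune()
library(raster)   # stack(), beginCluster(), clusterR(), writeRaster()
library(reshape)  # melt(), whose X1/X2 column names are used in the correlation step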
Support Vector Machines
names(dat)
# Correlation between the target soil property and the candidate covariates
COR <- cor(as.matrix(dat[, 7]), as.matrix(dat[, -c(1:8)]))
COR
# Drop NAs and self-correlations, then rank by absolute correlation
x <- subset(melt(COR), value != 1 & !is.na(value))
x <- x[with(x, order(-abs(x$value))), ]
x[1:25, ]
# Names of the 25 most strongly correlated covariates
idx <- as.character(x$X2[1:25])
Correlation analysis to select covariates
Support Vector Machines
# Keep only the target and the selected covariate columns
dat2 <- dat[, c('OCSKGM', idx)]
names(dat2)
# Load the covariate rasters and keep the selected layers
files <- list.files(path = "covs", pattern = "tif$",
                    full.names = TRUE)
COV <- stack(files)
COV <- COV[[idx]]
plot(COV)
Support Vector Machines
tuneResult <- tune(svm, OCSKGM ~ ., data = dat[, c("OCSKGM", names(COV))],
                   ranges = list(epsilon = seq(0, 1, 0.1),
                                 cost = c(.5, 1, 1.5, 2, 5, 10)))
# Choose the model with the best combination of epsilon and cost
tunedModel <- tuneResult$best.model
Testing different values for epsilon and cost. C (the cost) is a
regularization parameter that controls the trade-off between achieving a
low training error and a low testing error, that is, the ability of the
model to generalize to unseen data.
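For reference (not stated on the original slide), the ranges above span 11 epsilon values and 6 cost values, so tune() fits and cross-validates 66 parameter combinations:

grid <- expand.grid(epsilon = seq(0, 1, 0.1),
                    cost = c(.5, 1, 1.5, 2, 5, 10))
nrow(grid)   # 66 models, each assessed by 10-fold cross-validation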
Support Vector Machines
tuneResult

Parameter tuning of ‘svm’:
- sampling method: 10-fold cross validation
- best parameters:
  epsilon cost
      0.4    5
- best performance: 0.7649055
plot(tuneResult)
The best model is the one with the lowest MSE. The darker the region,
the lower the MSE, and therefore the better the model. In our sample
data, the MSE is lowest at epsilon = 0.4 and cost = 5.
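Equivalently (a sketch assuming the best parameters reported above), the final model could be refitted directly with svm() instead of taking tuneResult$best.model:

# Refit with the tuned parameters; essentially what best.model holds
finalModel <- svm(OCSKGM ~ ., data = dat[, c("OCSKGM", names(COV))],
                  epsilon = 0.4, cost = 5)
summary(finalModel)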
Support Vector Machines
plot(tuneResult)
SVM
# Use the tuned model to predict SOC across the covariate space
beginCluster()                    # start a parallel cluster (raster package)
start <- Sys.time()
pred <- clusterR(COV, predict, args = list(model = tunedModel))
print(Sys.time() - start)         # report how long the prediction took
endCluster()
plot(pred)
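A natural follow-up (assumed here, not shown on the original slide) is to save the predicted SOC surface to disk:

# Write the SVM prediction to a GeoTIFF (file name is illustrative)
writeRaster(pred, filename = "SOC_SVM.tif", overwrite = TRUE)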
