Semi-random model tree ensembles: an effective
                and scalable regression method

                                       Bernhard Pfahringer
                                 Department of Computer Science
                                University of Waikato, New Zealand



                                           September 22nd , 2011




Bernhard Pfahringer (Department of Computer Science, University of Waikato, New Zealand), Semi-random model tree ensembles: an effective and scalable regression method, September 22nd, 2011, slide 1 / 28
Background


  Outline



   1     Background


   2     Algorithm


   3     Results


   4     Summary




Background


  Local regression




           Non-linear functions can be approximated by a set of locally linear
           estimators
           Regression and model trees are fast multivariate versions of local
           regression
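As a toy illustration (not from the slides), fitting separate least-squares lines on a few local segments tracks a curved function far better than one global line:

```python
import numpy as np

# Approximate sin(x) on [0, 2*pi] with 6 locally fitted lines.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2 * np.pi, 400))
y = np.sin(x) + rng.normal(0, 0.1, x.size)

segments = np.array_split(np.arange(x.size), 6)  # 6 local regions
pred = np.empty_like(y)
for idx in segments:
    # fit a degree-1 polynomial (a line) on this segment only
    a, b = np.polyfit(x[idx], y[idx], 1)
    pred[idx] = a * x[idx] + b

# a single global line for comparison
global_line = np.polyval(np.polyfit(x, y, 1), x)
```

The piecewise fit's squared error is close to the noise level, while the global line cannot follow the curvature.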




Background


  Piece-wise linear approximation example




   [Figure: piece-wise linear approximation of a non-linear function (image not extracted)]
Background


  Sample Regression Tree: constants in the leaves




         A159 <= −0.62 :
           A149 <= 0.52 : Y = 1.6977
           A149 > 0.52 : Y = 1.2213
         A159 > −0.62 :
           A149 <= 0.638 :
               A57 <= −0.485 : Y = 0.8388
               A57 > −0.485 : Y = 1.0569
           A149 > 0.638 : Y = 0.6062
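Transcribed directly into Python, the tree above is just nested conditionals returning the leaf constants:

```python
def regression_tree_predict(a57, a149, a159):
    """Constant-leaf regression tree from the slide, as nested conditionals."""
    if a159 <= -0.62:
        return 1.6977 if a149 <= 0.52 else 1.2213
    if a149 <= 0.638:
        return 0.8388 if a57 <= -0.485 else 1.0569
    return 0.6062

print(regression_tree_predict(0.0, 0.0, -1.0))  # left branch -> 1.6977
```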




Background


  Sample Model Tree: linear models in the leaves


         A159 <= −0.62 :
            A149 <= 0.52 : LM1
            A149 > 0.52 : LM2
         A159 > −0.62 :
            A149 <= 0.638 : LM3
            A149 > 0.638 : LM4

         LM1 Y        = −0.597 ∗ A149 − 0.211 ∗ A159 + 1.901
         LM2 Y        = −0.471 ∗ A149 − 0.211 ∗ A159 + 1.353
         LM3 Y        = −0.365 ∗ A149 − 0.232 ∗ A159 + 1.017
         LM4 Y        = −0.555 ∗ A149 − 0.232 ∗ A159 + 0.776
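The same structure in Python, with the slide's linear models LM1..LM4 substituted into the leaves:

```python
def model_tree_predict(a149, a159):
    """Model tree from the slide: a linear model in each leaf."""
    if a159 <= -0.62:
        if a149 <= 0.52:  # LM1
            return -0.597 * a149 - 0.211 * a159 + 1.901
        return -0.471 * a149 - 0.211 * a159 + 1.353      # LM2
    if a149 <= 0.638:     # LM3
        return -0.365 * a149 - 0.232 * a159 + 1.017
    return -0.555 * a149 - 0.232 * a159 + 0.776          # LM4
```

Unlike the constant-leaf tree, the prediction now varies smoothly within each leaf.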



Algorithm


  Outline



   1     Background


   2     Algorithm


   3     Results


   4     Summary




Algorithm


  Ensembles of Semi-Random Model Trees




           Ensembles usually improve results
           Most ensembles use randomization to generate diversity
           Two sources of randomness:
                  For each tree: divide data into a train and a validation set
                  To split: select best attribute from a random subset of all attributes




Algorithm


  Single Semi-Random Model Tree




           Only consider the median as split value (=> balanced trees)
           Leaf model: a linear ridge regression model
           Cap model predictions to the observed extremes
           Optimise tree depth and ridge value using the validation set
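The capping step, for instance, can be sketched as clipping a leaf's linear-model prediction to the target range observed in that leaf's training data (an illustrative sketch, not the authors' code; names are hypothetical):

```python
import numpy as np

def capped_leaf_prediction(linear_model_output, train_targets):
    """Clip a leaf's linear-model prediction to the observed target range,
    suppressing extrapolation beyond the training extremes."""
    lo, hi = np.min(train_targets), np.max(train_targets)
    return float(np.clip(linear_model_output, lo, hi))

print(capped_leaf_prediction(9.9, np.array([0.5, 1.2, 2.0])))  # -> 2.0
```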




Algorithm


  Build ensemble




   BUILDENSEMBLE(data, numTrees, k)

    1 for i = 1 to numTrees
    2       do randomly split data into two:
    3          train + validate
    4          BUILDTREE(train, validate, k)



Algorithm


  BuildTree

   BUILDTREE(train, validate, k)

    1      min ← MINTARGETVALUE(train)
    2      max ← MAXTARGETVALUE(train)
    3      localSSE ← LINREG(train, validate)

    5      if |train| > 10 and |validate| > 10
    6            do split ← RANDOMSPLIT(train, k)

    8                 smT ← SMALLER(train, split)
    9                 smV ← SMALLER(validate, split)
   10                 smaller ← BUILDTREE(smT, smV, k)

   12                 laT ← LARGER(train, split)
   13                 laV ← LARGER(validate, split)
   14                 larger ← BUILDTREE(laT, laV, k)
Algorithm


  BuildTree, continued




   15 subSSE ← SSE(smaller, larger, validate)

   17 if localSSE < subSSE
   18       do smaller ← null
   19          larger ← null
   20     else
   21          localModel ← null




Algorithm


  Ridge regression



   LINREG(train, validate)

    1    for ridge in 10⁻⁸, 10⁻⁴, 10⁻², 10⁻¹, 1, 10
    2         do model_r ← RIDGEREGRESS(train, ridge)
    3             sse_r ← SSE(model_r, validate)
    4    if bestModel == model_10
    5         do build models for ridge = 10², 10³, ...
    6             and so on while improving
    7    localModel ← bestModel
    8    return minimum SSE on validation data
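A minimal Python rendering of this ridge sweep, using the closed-form ridge solution (the extension of the grid beyond ridge = 10 "while improving" is omitted for brevity; all names are illustrative):

```python
import numpy as np

def ridge_fit(X, y, ridge):
    """Closed-form ridge regression: w = (X'X + ridge*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def lin_reg(train_X, train_y, val_X, val_y):
    """Sweep the slide's ridge grid; keep the model with the
    lowest sum of squared errors on the validation data."""
    best_sse, best_w = np.inf, None
    for ridge in (1e-8, 1e-4, 1e-2, 1e-1, 1.0, 10.0):
        w = ridge_fit(train_X, train_y, ridge)
        sse = np.sum((val_X @ w - val_y) ** 2)
        if sse < best_sse:
            best_sse, best_w = sse, w
    return best_w, best_sse
```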




Algorithm


  Random split selection




   RANDOMSPLIT(train, k)

    1 for i = 1 to k
    2       do splitAttr ← RANDOMCHOICE(allAttrs)
    3          stump ← STUMP(APPROXMEDIAN(splitAttr))
    4          compute SSE(stump, train)
    5 return minimum-SSE stump
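A Python sketch of the split selection, assuming an exact median in place of APPROXMEDIAN and a two-mean stump for the SSE scoring (names are illustrative):

```python
import numpy as np

def random_split(X, y, k, rng=None):
    """Pick k distinct random attributes; for each, split at its median
    and score the two-mean stump by SSE on the training data.
    Returns the best (attribute index, threshold, SSE)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    best = None
    for attr in rng.choice(X.shape[1], size=k, replace=False):
        thr = np.median(X[:, attr])
        left = X[:, attr] <= thr
        right = ~left
        if not right.any():
            continue  # degenerate split (all values equal the median)
        sse = (np.sum((y[left] - y[left].mean()) ** 2)
               + np.sum((y[right] - y[right].mean()) ** 2))
        if best is None or sse < best[2]:
            best = (int(attr), float(thr), float(sse))
    return best
```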




Algorithm


  Parameter Settings


   Reported experiments:
           average the predictions of 50 randomized model trees
           at each split, select the best of a random 50% subset of the attributes
   Generally: parameters should be optimised separately for every application,
   e.g. using cross-validation
           number of trees: "the more the merrier", but with diminishing returns
           number of randomly selected attributes: 50% is a good default, but
           may depend on the total number of attributes and on sparseness




Results


  Outline



    1    Background


    2    Algorithm


    3    Results


    4    Summary




Results


  Comparison


           more than 20 Torgo/UCI datasets, each with > 900 examples
           repeated 2/3 training, 1/3 testing splits
           each training split is further divided into equal build and validation
           halves (1/3, 1/3 of the full data)
           preprocessed for missing and categorical values
           compared to:
                   LR: linear ridge regression, optimising the ridge value
                   GP: Gaussian process regression, optimising the noise level and
                   the RBF gamma
                   AG: additive groves, using the "fast" script
           evaluated with RMAE: relative mean absolute error
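The slides do not spell out the RMAE formula; one common definition, assumed here, is the model's MAE as a percentage of the MAE obtained by always predicting the mean:

```python
import numpy as np

def rmae(y_true, y_pred):
    """Relative mean absolute error (one common definition, assumed here):
    model MAE as a percentage of the MAE of always predicting the mean."""
    baseline = np.mean(np.abs(y_true - np.mean(y_true)))
    return 100.0 * np.mean(np.abs(y_true - y_pred)) / baseline
```

Under this definition, 0 means a perfect fit and 100 means no better than the mean predictor.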



Results


  RMAE on Torgo/UCI


   [Bar chart: RMAE for Torgo/UCI data, comparing RMT, GP, LR, and AG per
   dataset; y-axis 0-100. Dataset labels not recoverable from the extraction.]
Results


  Build times on Torgo/UCI


   [Bar chart: training time in seconds (log scale, 0.1-100,000) for Torgo/UCI
   data, comparing RMT, GP, LR, and AG per dataset. Dataset labels not
   recoverable from the extraction.]
Results


  UCI Census dataset




   Table: Partial results; 2,458,285 examples in total, therefore about 800,000 in
   the training fold.

                         Method           RMAE                         Time (secs)
                         LR               15.96                               1205
                         RMT               9.78                              19811
                         GP                   ?         ? (would need 5 TB of RAM)
                         AG                   ?            ? (estimated 2,000,000)




Results


  Near infrared (NIR) Datasets




   Proprietary NIR data
           7 datasets
           from 255 up to 7,500 spectra
           between 170 and 500-odd features
           preprocessed for noise and baseline shift




Results


  Sample NIR spectrum


   [Line plot: preprocessed sample spectrum (nitrogen in soil); x-axis feature
   index 1-169, y-axis roughly -2 to 4.]
Results


  RMAE on NIR data


   [Bar chart: RMAE for NIR datasets (n, omd, rmd, tc, phe, ph, p5, na, g5),
   comparing RMT, GP, LR, and AG; y-axis 10-90.]
Results


  Build times on NIR data


   [Bar chart: training time in seconds (log scale, 0.1-100,000) for NIR
   datasets (omd, rmd, na, n, tc, ph, phe, p5, g5), comparing RMT, GP, LR,
   and AG.]
Results


  Random Model Tree Build Times discussion




           complexity is O(K · N · log N + K² · N)
           the second term (linear model computation) seems to dominate
           therefore the observed complexity is approximately O(K² · N)




Summary


  Outline



    1    Background


    2    Algorithm


    3    Results


    4    Summary




Summary


  Conclusions




           Semi-Random Model Trees perform well
           They are fast: build time is practically linear in N
           Can model non-linear relationships




Summary


  Future Work




           Improve efficiency for large K
           Study more and different regression problems
           More comparisons to alternative regression schemes
           Streaming/MOA variant





New Microsoft PowerPoint Presentation - Copy.pptx
Nidhal Samdaie CV - International Business Consultant
IFRS Notes in your pocket for study all the time
unit 1 COST ACCOUNTING AND COST SHEET
5 Stages of group development guide.pptx
HR Introduction Slide (1).pptx on hr intro
Outsourced Audit & Assurance in USA Why Globus Finanza is Your Trusted Choice

Semi-random model tree ensembles: an effective and scalable regression method

  • 1. Semi-random model tree ensembles: an effective and scalable regression method. Bernhard Pfahringer, Department of Computer Science, University of Waikato, New Zealand. September 22nd, 2011.
  • 2. Background. Outline: 1 Background, 2 Algorithm, 3 Results, 4 Summary.
  • 3. Background. Local regression: non-linear functions can be approximated by a set of locally linear estimators; regression and model trees are fast multivariate versions of local regression.
  • 4. Background. Piece-wise linear approximation example (figure).
  • 5. Background. Sample regression tree: constants in the leaves.
    A159 <= -0.62 :
      A149 <= 0.52 : Y = 1.6977
      A149 > 0.52 : Y = 1.2213
    A159 > -0.62 :
      A149 <= 0.638 :
        A57 <= -0.485 : Y = 0.8388
        A57 > -0.485 : Y = 1.0569
      A149 > 0.638 : Y = 0.6062
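The constant-leaf tree on this slide can be read as nested conditionals. A minimal Python sketch (function name is illustrative; attribute names and constants are taken from the slide):

```python
def predict_regression_tree(a159, a149, a57):
    """Regression tree from the slide: each leaf stores a constant prediction."""
    if a159 <= -0.62:
        return 1.6977 if a149 <= 0.52 else 1.2213
    if a149 <= 0.638:
        return 0.8388 if a57 <= -0.485 else 1.0569
    return 0.6062
```

For example, an instance with A159 = -1.0 and A149 = 0.0 falls into the first leaf and is predicted as 1.6977.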
  • 6. Background. Sample model tree: linear models in the leaves.
    A159 <= -0.62 :
      A149 <= 0.52 : LM1
      A149 > 0.52 : LM2
    A159 > -0.62 :
      A149 <= 0.638 : LM3
      A149 > 0.638 : LM4
    LM1: Y = -0.597 * A149 - 0.211 * A159 + 1.901
    LM2: Y = -0.471 * A149 - 0.211 * A159 + 1.353
    LM3: Y = -0.365 * A149 - 0.232 * A159 + 1.017
    LM4: Y = -0.555 * A149 - 0.232 * A159 + 0.776
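The same structure with linear models LM1..LM4 in the leaves, again as a Python sketch built from the coefficients on the slide (function name is illustrative):

```python
def predict_model_tree(a149, a159):
    """Model tree from the slide: each leaf holds a linear model LM1..LM4."""
    if a159 <= -0.62:
        if a149 <= 0.52:                                  # LM1
            return -0.597 * a149 - 0.211 * a159 + 1.901
        return -0.471 * a149 - 0.211 * a159 + 1.353       # LM2
    if a149 <= 0.638:                                     # LM3
        return -0.365 * a149 - 0.232 * a159 + 1.017
    return -0.555 * a149 - 0.232 * a159 + 0.776           # LM4
```

Unlike the constant-leaf tree, predictions now vary smoothly within each leaf region.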
  • 7. Algorithm. Outline: 1 Background, 2 Algorithm, 3 Results, 4 Summary.
  • 8. Algorithm. Ensembles of semi-random model trees: ensembles usually improve results, and most ensembles use randomization to generate diversity. Two sources of randomness: for each tree, the data is divided into a training and a validation set; at each split, the best attribute is selected from a random subset of all attributes.
  • 9. Algorithm. Single semi-random model tree: only consider the median as split value (=> balanced trees); leaf model: a linear ridge regression model; cap model predictions inside the observed extremes; optimise tree depth and ridge value using the validation set.
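Two of these ingredients, splitting at the median and capping predictions, are easy to sketch in Python (helper names are my own, not from the paper):

```python
import statistics

def median_split(rows, attr_index):
    """Split on the median of one attribute: roughly half the rows go each
    way, so the resulting tree stays balanced (depth ~ O(log N))."""
    m = statistics.median(row[attr_index] for row in rows)
    left = [r for r in rows if r[attr_index] <= m]
    right = [r for r in rows if r[attr_index] > m]
    return m, left, right

def cap(prediction, y_min, y_max):
    """Clamp a leaf model's prediction to the target range seen in training,
    preventing extreme extrapolation by the linear leaf models."""
    return max(y_min, min(y_max, prediction))
```

Capping matters because a linear model in a leaf can extrapolate far outside the plausible target range for unusual inputs.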
  • 10. Algorithm. Build ensemble.
    BUILDENSEMBLE(data, numTrees, k)
    1  for i = 1 to numTrees
    2    do randomly split data into two:
    3       train + validate
    4       BUILDTREE(train, validate, k)
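The BUILDENSEMBLE pseudocode above can be sketched directly in Python; the per-tree random train/validation split is the first source of randomness mentioned on slide 8 (an equal halves split is assumed here, matching the (1/3, 1/3) setup described later in the results):

```python
import random

def build_ensemble(data, num_trees, k, build_tree):
    """Grow num_trees model trees, each on its own random
    train/validation split of the data."""
    trees = []
    for _ in range(num_trees):
        shuffled = data[:]
        random.shuffle(shuffled)
        half = len(shuffled) // 2
        train, validate = shuffled[:half], shuffled[half:]
        trees.append(build_tree(train, validate, k))
    return trees
```

At prediction time the ensemble simply averages the capped predictions of its trees.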
  • 11. Algorithm. BuildTree.
    BUILDTREE(train, validate, k)
    1   min ← MINTARGETVALUE(train)
    2   max ← MAXTARGETVALUE(train)
    3   localSSE ← LINREG(train, validate)
    4
    5   if |train| > 10 and |validate| > 10
    6     do split ← RANDOMSPLIT(train, k)
    7
    8        smT ← SMALLER(train, split)
    9        smV ← SMALLER(validate, split)
    10       smaller ← BUILDTREE(smT, smV, k)
    11
    12       laT ← LARGER(train, split)
    13       laV ← LARGER(validate, split)
    14       larger ← BUILDTREE(laT, laV, k)
  • 12. Algorithm. BuildTree, continued.
    15  subSSE ← SSE(smaller, larger, validate)
    16
    17  if localSSE < subSSE
    18    do smaller ← null
    19       larger ← null
    20  else
    21       localModel ← null
  • 13. Algorithm. Ridge regression.
    LINREG(train, validate)
    1  for ridge in 10^-8, 10^-4, 10^-2, 10^-1, 1, 10
    2    do model_r ← RIDGEREGRESS(train, ridge)
    3       sse_r ← SSE(model_r, validate)
    4  if bestModel == model_10
    5    do build models for ridge = 10^2, 10^3, ...
    6       and so on while improving
    7  localModel ← bestModel
    8  return minimum SSE on validation data
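The grid search over ridge values can be sketched for the one-dimensional case, where ridge regression without an intercept has the closed form w = Σxy / (Σx² + λ). This is a simplification: the slide's version works on multiple attributes and extends the grid upward (10², 10³, ...) while the largest value keeps winning.

```python
def ridge_regress_1d(train, ridge):
    """Closed-form 1-D ridge regression without intercept: w = Sxy / (Sxx + ridge)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + ridge)

def lin_reg(train, validate):
    """LINREG sketch: try a coarse grid of ridge values, keep the model with
    the lowest SSE on the validation set."""
    best = None
    for ridge in (1e-8, 1e-4, 1e-2, 1e-1, 1.0, 10.0):
        w = ridge_regress_1d(train, ridge)
        sse = sum((y - w * x) ** 2 for x, y in validate)
        if best is None or sse < best[0]:
            best = (sse, w)
    return best  # (validation SSE, fitted coefficient)
```

Selecting the ridge value on the held-out validation set, rather than on the training data, is what keeps the leaf models from overfitting small leaves.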
  • 14. Algorithm. Random split selection.
    RANDOMSPLIT(train, k)
    1  for i = 1 to k
    2    do splitAttr ← RANDOMCHOICE(allAttrs)
    3       stump ← STUMP(APPROXMEDIAN(splitAttr))
    4       compute SSE(stump, train)
    5  return minimum-SSE stump
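RANDOMSPLIT is the second source of randomness: try k randomly chosen attributes, split each at its median, and keep the stump with the lowest training SSE. A Python sketch (instances are (attribute-tuple, target) pairs; the exact median is used where the slide allows an approximate one):

```python
import random
import statistics

def random_split(train, k):
    """Pick the best of k random median stumps by training SSE.
    Each stump predicts the mean target on its side of the split."""
    n_attrs = len(train[0][0])
    best = None
    for _ in range(k):
        a = random.randrange(n_attrs)
        m = statistics.median(x[a] for x, _ in train)
        left = [y for x, y in train if x[a] <= m]
        right = [y for x, y in train if x[a] > m]
        if not left or not right:
            continue  # degenerate split (e.g. constant attribute)
        sse = sum((y - sum(left) / len(left)) ** 2 for y in left) + \
              sum((y - sum(right) / len(right)) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, a, m)
    return best  # (SSE, attribute index, split value)
```

Because only medians are considered, split selection costs no sorting per candidate beyond the median computation, which is part of why the trees are cheap to build.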
  • 15. Algorithm. Parameter settings. Reported experiments: average the predictions of 50 randomized model trees; at each split, select the best of 50% randomly selected attributes. Generally: parameters should be optimised separately for every application, e.g. using cross-validation. Number of trees: "the more the merrier", but with diminishing returns. Number of randomly selected attributes: 50% is a good default, but may depend on the total number of attributes and on sparseness.
  • 16. Results. Outline: 1 Background, 2 Algorithm, 3 Results, 4 Summary.
  • 17. Results. Comparison setup: more than 20 Torgo/UCI datasets with > 900 examples each; repeated 2/3 training, 1/3 testing splits; the training split is divided into equal build and validation halves (1/3, 1/3); data preprocessed for missing or categorical values. Compared against: LR: linear ridge regression, optimising the ridge value; GP: Gaussian process regression, optimising noise level and RBF gamma; AG: additive groves, using the "fast" script. Evaluation metric: RMAE, relative mean absolute error.
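The slide does not spell out the RMAE formula; a common definition, assumed here, normalises the model's mean absolute error by the MAE of always predicting the mean target, expressed as a percentage (so 100 means "no better than the mean predictor"):

```python
def rmae(predictions, actuals):
    """Relative mean absolute error (assumed definition): model MAE divided
    by the MAE of the constant mean predictor, in percent."""
    n = len(actuals)
    mean_y = sum(actuals) / n
    mae = sum(abs(p, ) if False else abs(p - a) for p, a in zip(predictions, actuals)) / n
    baseline = sum(abs(mean_y - a) for a in actuals) / n
    return 100.0 * mae / baseline
```

Under this definition, values well below 100 (as in the charts that follow) indicate a large improvement over the naive baseline.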
  • 18. Results. RMAE on Torgo/UCI (figure: bar chart of RMAE per dataset for RMT, GP, LR, and AG).
  • 19. Results. Build times on Torgo/UCI (figure: bar chart, log scale, of training time in seconds per dataset for RMT, GP, LR, and AG).
  • 20. Results. UCI Census dataset. Table: partial results, 2458285 examples in total, therefore about 800000 in the training fold.
    Method   RMAE    Time (secs)
    LR       15.96   1205
    RMT      9.78    19811
    GP       ?       ? (would need 5 TB RAM)
    AG       ?       ? (estimated 2000000)
  • 21. Results. Near-infrared (NIR) datasets: proprietary NIR data; 7 datasets, from 255 up to 7500 spectra each; between 170 and 500-odd features; preprocessed for noise and baseline shift.
  • 22. Results. Sample NIR spectrum (figure: preprocessed sample spectrum, nitrogen in soil).
  • 23. Results. RMAE on NIR data (figure: bar chart of RMAE per dataset for RMT, GP, LR, and AG).
  • 24. Results. Build times on NIR data (figure: bar chart, log scale, of training time in seconds per dataset for RMT, GP, LR, and AG).
  • 25. Results. Random model tree build times, discussion: complexity is O(K * N * log N + K^2 * N); the second term (linear model computation) seems to dominate, therefore the observed complexity is roughly O(K^2 * N).
  • 26. Summary. Outline: 1 Background, 2 Algorithm, 3 Results, 4 Summary.
  • 27. Summary. Conclusions: semi-random model trees perform well; they are fast, with build time practically linear in N; and they can model non-linear relationships.
  • 28. Summary. Future work: improve efficiency for large K; study more and different regression problems; more comparisons to alternative regression schemes; a streaming/MOA variant.