Datamining 2nd decisiontree
(1/3)
(2/3)
• Classification (Pattern Recognition)
(3/3)
• Clustering
• Association Rules
[Figure: two decision trees, (A) T and (B) T', each with Yes / No branches]
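• For a decision tree T and a node v ∈ T, cost(v) is the length of the path from the root of T to v.
• The cost of T is the sum of cost(x) over its leaves:
    $$\mathrm{cost}(T) = \sum \{\, \mathrm{cost}(x) \mid x \in T \text{ is a leaf} \,\}$$
[Figure: decision tree with root X1 and children X2 and X3. X2 has leaves 1 and 0; X3 has a leaf 1 and a child X4 with leaves 1 and 0. Leaf costs: 2, 2, 2, 3, 3.]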
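The cost definition can be checked with a short sketch. The `Leaf` and `Node` classes below are hypothetical names introduced only for this illustration, and the assignment of subtrees to Yes / No branches is an assumption (the figure does not survive extraction); the cost does not depend on it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int  # class label stored at the leaf


@dataclass
class Node:
    test: str       # name of the attribute tested at this node
    yes: "Tree"     # subtree followed when the test answers Yes
    no: "Tree"      # subtree followed when the test answers No


Tree = Union[Leaf, Node]


def tree_cost(tree: Tree, depth: int = 0) -> int:
    """Sum of root-to-leaf path lengths over all leaves of the tree."""
    if isinstance(tree, Leaf):
        return depth
    return tree_cost(tree.yes, depth + 1) + tree_cost(tree.no, depth + 1)


# Tree shaped like the figure: X1 at the root, X2 and X3 below it,
# and X4 below X3. Which side is Yes or No is assumed.
figure_tree = Node(
    "X1",
    Node("X2", Leaf(1), Leaf(0)),
    Node("X3", Leaf(1), Node("X4", Leaf(1), Leaf(0))),
)

print(tree_cost(figure_tree))  # 2 + 2 + 2 + 3 + 3 = 12
```

Summing depths rewards trees whose leaves sit close to the root, which is why the chain-shaped trees in the reduction that follows have costs of the form 1 + 2 + ... + k + k.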
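5.1 EXACT COVER BY 3-SET
• Finding a minimum-cost decision tree is NP-hard.
    • This is shown by reducing the NP-complete problem EXACT COVER BY 3-SET to it.
• EXACT COVER BY 3-SET (5.2)
    • Given a finite set X whose size is a multiple of 3 and a collection S = {T_1, T_2, ...} of 3-element subsets of X, decide whether some S_1 ⊂ S satisfies:
        (1) $\bigcup \{\, T \mid T \in S_1 \,\} = X$
        (2) $T_i \cap T_j = \emptyset$ for all $T_i, T_j \in S_1$ with $i \neq j$
• Example (5.4): for X = {1, 2, ..., 9}, S_1 = {{1, 4, 7}, {2, 3, 5}, {6, 8, 9}} is an exact cover.
[Figure 5.4: the elements 1 to 9 arranged in a 3 × 3 grid, illustrating EXACT COVER BY 3-SET]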
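Conditions (1) and (2) translate directly into a checker. The sketch below is illustrative only: the function names are hypothetical, the last two subsets in `S` are made up to give the search something to reject, and the brute-force search is exponential in the worst case, as expected for an NP-complete problem.

```python
from itertools import combinations


def is_exact_cover(X, S1):
    """Check conditions (1) and (2): S1 covers X and its sets are pairwise disjoint."""
    covered = [x for T in S1 for x in T]
    pairwise_disjoint = len(covered) == len(set(covered))  # condition (2)
    covers_x = set(covered) == set(X)                      # condition (1)
    return pairwise_disjoint and covers_x


def find_exact_cover(X, S):
    """Try every choice of |X|/3 subsets from S; return one exact cover or None."""
    k = len(X) // 3
    for S1 in combinations(S, k):
        if is_exact_cover(X, S1):
            return list(S1)
    return None


X = set(range(1, 10))
# The first three subsets are the example from the slide;
# {1, 2, 3} and {4, 5, 9} are hypothetical extras.
S = [{1, 4, 7}, {2, 3, 5}, {6, 8, 9}, {1, 2, 3}, {4, 5, 9}]

print(find_exact_cover(X, S))
```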
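• From an instance (X, S) of EXACT COVER BY 3-SET, build a table of records:
    • Let Y = {y_1, y_2, ..., y_{|X|}} be |X| new elements; the records are the elements t ∈ X ∪ Y.
    • The target attribute A is 1 on X and 0 on Y:
        $$t[A] = \begin{cases} 1 & t \in X \\ 0 & t \in Y \end{cases}$$
    • There is one test attribute for each 3-element set T_1, T_2, ... and one for each y_i:
        $$t[T_i] = \begin{cases} 1 & t \in T_i \\ 0 & t \notin T_i \end{cases}, \qquad t[y_i] = \begin{cases} 1 & t = y_i \\ 0 & t \neq y_i \end{cases}$$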
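The construction can be written out as a small table builder. This is a sketch under assumptions: `build_records` and the string names `"y1"`, `"T1"`, ... are hypothetical, chosen only to mirror the symbols y_i and T_i used above.

```python
def build_records(X, S):
    """Build the table of the reduction: one row per element of X ∪ Y.

    Returns (test_attribute_names, rows). Each row is a dict holding the
    record id "t", the target attribute "A", and one 0/1 value per test.
    """
    xs = sorted(X)
    ys = [f"y{i}" for i in range(1, len(xs) + 1)]   # |Y| = |X| new elements
    t_names = [f"T{j}" for j in range(1, len(S) + 1)]

    rows = []
    for t in xs + ys:
        row = {"t": t, "A": 1 if t in X else 0}     # t[A] = 1 on X, 0 on Y
        for name, T in zip(t_names, S):
            row[name] = 1 if t in T else 0          # t[Ti] = 1 iff t ∈ Ti
        for y in ys:
            row[y] = 1 if t == y else 0             # t[yi] = 1 iff t = yi
        rows.append(row)
    return t_names + ys, rows


X = set(range(1, 10))
S = [{1, 4, 7}, {2, 3, 5}, {6, 8, 9}]
names, rows = build_records(X, S)
print(len(names), len(rows))  # 12 test attributes, 18 records
```

Every record of Y answers 0 to every T_i test, so only the y_i tests can single out an element of Y.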

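• The tests are of two kinds, T_i and y_i.
    • A tree that tests the sets T_i of an exact cover one after another separates X from Y. With 9 elements it needs 3 tests (Figure 5.5).
    • Its cost is
        $$1 + 2 + \cdots + \frac{|X|}{3} + \frac{|X|}{3}$$
    • A tree that tests y_1, y_2, ... one after another also separates X from Y (Figure 5.6). Since |Y| = |X|, its cost is
        $$1 + 2 + \cdots + |X| + |X|$$
• The reduction ties the existence of a decision tree of cost 1 + 2 + ... + |X|/3 + |X|/3 to the existence of a solution of EXACT COVER BY 3-SET, which is what makes finding a minimum-cost decision tree NP-hard.
[Figure 5.5: (A) the sets T_1, T_2, T_3 and the elements of Y; (B) a chain of tests T_1, T_2, T_3, each with a leaf 1, ending in a leaf 0]
[Figure 5.6: (A) the elements y_1, y_2, y_3, ...; (B) a chain of tests y_1, y_2, ..., y_9, each with a leaf 0, ending in a leaf 1]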
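Both costs come from the same chain shape: k tests in a row, with one leaf hanging off each test and one extra leaf at the bottom. A minimal sketch, where `chain_cost` is a hypothetical helper name:

```python
def chain_cost(num_tests: int) -> int:
    """Cost of a chain of tests: leaves at depths 1, 2, ..., k, plus a last leaf at depth k."""
    return sum(range(1, num_tests + 1)) + num_tests


size_x = 9  # |X| in the running example

cost_with_sets = chain_cost(size_x // 3)   # chain of T_i tests: 1 + 2 + 3 + 3
cost_with_singletons = chain_cost(size_x)  # chain of y_i tests: 1 + 2 + ... + 9 + 9

print(cost_with_sets, cost_with_singletons)  # 9 54
```

The gap between the two values grows quadratically with |X|, so a tree built from an exact cover is far cheaper than one that peels off the y_i one at a time.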
[Figure: two decision trees, (A) T and (B) T', each with Yes / No branches]
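• Training set: $S = \{(x_1, c_1), (x_2, c_2), \ldots, (x_N, c_N)\}$
• Entropy of the class C:
    $$H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times}$$
    • $p_{\circ}$ and $p_{\times}$ are the fractions of records in S whose class is ○ and ×, respectively.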
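• $p_{\circ} = \frac{4}{10}, \quad p_{\times} = \frac{6}{10}$
• $H(C) = -p_{\circ} \log_2 p_{\circ} - p_{\times} \log_2 p_{\times} = -\frac{4}{10} \log_2 \frac{4}{10} - \frac{6}{10} \log_2 \frac{6}{10} = 0.971$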
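The entropy value can be reproduced in a few lines. This is a minimal sketch; the function name `entropy` and the choice to pass raw class counts rather than probabilities are assumptions of the example.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    # Classes with zero count contribute nothing (0 * log 0 is taken as 0).
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


# 4 records of one class and 6 of the other, as on the slide.
print(round(entropy([4, 6]), 3))  # 0.971
```

Entropy is 1 bit for an even two-class split and 0 when all records share one class, so lower values mean purer nodes.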
• Class counts for each answer to the test:

    30
                  YES   NO   Total
    C:  ○          2     2      4
        ×          2     4      6
    Total          4     6     10
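T1: 30
                  YES   NO   Total
    C:  ○          2     2      4
        ×          2     4      6
    Total          4     6     10

• Entropy within each branch:
    $$H(C \mid T_1 = \text{Yes}) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} = 1.0$$
    $$H(C \mid T_1 = \text{No}) = -\frac{2}{6} \log_2 \frac{2}{6} - \frac{4}{6} \log_2 \frac{4}{6} = 0.918$$
• Weighted by the size of each branch:
    $$H(C \mid T_1) = \frac{4}{10} H(C \mid T_1 = \text{Yes}) + \frac{6}{10} H(C \mid T_1 = \text{No}) = 0.951$$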
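The conditional entropy is a weighted average of per-branch entropies, which the following sketch computes from the contingency table. The dictionary layout (answer mapped to class counts) and the function names are assumptions made for the example.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(C | T): branch entropies weighted by the fraction of records in each branch.

    `table` maps each answer of the test to the class counts in that branch.
    """
    n = sum(sum(counts) for counts in table.values())
    return sum(sum(counts) / n * entropy(counts) for counts in table.values())


# Columns of the T1 table: Yes holds 2 and 2 records, No holds 2 and 4.
t1 = {"Yes": [2, 2], "No": [2, 4]}

print(round(entropy(t1["No"]), 3))         # 0.918
print(round(conditional_entropy(t1), 3))   # 0.951
```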
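• The information gain I(T) of a test T:
    $$I(T) = H(C) - H(C \mid T)$$
• $I(T_1) = H(C) - H(C \mid T_1) = 0.971 - 0.951 = 0.020$
• $I(T_2) = 0.420, \quad I(T_3) = 0.091, \quad I(T_4) = 0.420$
• T2 has the largest information gain, so the records are split on T2.
    • T4 ties with T2; T2 is used here.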
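Choosing the test then amounts to an argmax over the gains. The sketch below recomputes I(T1) from its table and takes the values for T2, T3 and T4 as reported on the slide, since their tables are not shown; `information_gain` is a hypothetical helper name.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def information_gain(table):
    """I(T) = H(C) - H(C | T) for a table mapping each answer to class counts."""
    class_totals = [sum(column) for column in zip(*table.values())]
    n = sum(class_totals)
    remainder = sum(sum(counts) / n * entropy(counts) for counts in table.values())
    return entropy(class_totals) - remainder


gain_t1 = information_gain({"Yes": [2, 2], "No": [2, 4]})

# T2 to T4 are copied from the slide, not recomputed.
gains = {"T1": round(gain_t1, 3), "T2": 0.420, "T3": 0.091, "T4": 0.420}

# max() returns the first maximal key, so the T2 / T4 tie resolves to T2.
best = max(gains, key=gains.get)
print(gains["T1"], best)
```

How a tie is broken is a design choice; taking the first attribute in a fixed order keeps the result deterministic.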
[Figure: the records split by T2 into a Yes branch and a No branch]
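• In the Yes branch:
    $$H(C) = -\frac{4}{6} \log_2 \frac{4}{6} - \frac{2}{6} \log_2 \frac{2}{6} = 0.918$$
• Splitting this branch on T1:
    $$H(C \mid T_1) = \frac{4}{6} \left( -\frac{2}{4} \log_2 \frac{2}{4} - \frac{2}{4} \log_2 \frac{2}{4} \right) + \frac{2}{6} \left( -\frac{2}{2} \log_2 \frac{2}{2} \right) = 0.667$$
    $$I(T_1) = 0.918 - 0.667 = 0.251$$
• $I(T_3) = 0, \quad I(T_4) = 0.918$
• T4 has the largest information gain, so this branch is split on T4.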
• naive bayes
• ID3
• CART (Classification And Regression Tree), C4.5
• CART
• C4.5
• Forest
(10/21)
• sesejun+dm10@sel.is.ocha.ac.jp
• 11/2
    • http://togodb.sel.is.ocha.ac.jp/
• 2010
