Nov. 23rd, 2009




{ On SSL, and beyond }
- Theories, Methods, and a Possible Suggestion on Semi-Supervised Learning -




                                      Lab Seminar Presentation
                                               Eunjeong Park
Agenda




1. Background


2. Semi-Supervised Learning Methods


3. Assumptions on SSL


4. Future Work
Agenda




1. Background


2. Semi-Supervised Learning Methods


3. Assumptions on SSL


4. Future Work
Background    Examples (1/2)

• Spam E-mail Classification

   [Figure: an unlabeled e-mail (?) to be classified as inbox or spam]
Background     Examples (2/2)

• Response Modeling

   [Figure: an unlabeled customer (?) to be classified as respondent or non-respondent]
Background     the Question (1/2)

• Statistical learning methods require LOTS of training data

   – But since we only have a limited amount of labeled data,
   – Can we figure out a way for our learning algorithms to take
     advantage of all the unlabeled data?




      [Figure: a few labeled examples alongside a much larger pool of unlabeled examples …]
Background           the Question (2/2)


                                  f: x→y
                         <xi, yi>                                   <xi> …?

•   Text/Web Mining
     –   Document classification
           • f: Doc → Class
           • Spam filtering, web page classification
     –   Information extraction
           • f: Sentence → Fact, f: Doc → Fact
     –   Translation
           • f: EnglishDoc → FrenchDoc

•   Marketing
     –   Response modeling
           • f: Demo+RFM → Response
     –   Fraud detection
           • f: Demo+PaymentHistory → Fraud
     –   Customer segmentation
           • f: Demo+RFM → Customer Seg.
Agenda




1. Background


2. Semi-Supervised Learning Methods


3. Assumptions on SSL


4. Future Work
Semi-Supervised Learning    Methodology [1]

•   Generative models
     – Unlabeled data is used to either modify or reprioritize the hypotheses obtained from
       labeled data alone
     – Given Bayes' rule:

                   P(y|x) = p(x|y) P(y) / p(x)

       we can easily see that p(x) influences P(y|x)
     – Mixture models with EM fall into this category, and to some extent self-training, too
       (the likelihood they maximize is sketched after this slide)

•   Discriminative models
     – Standard discriminative training cannot be used for SSL, since p(y|x) is estimated
       while ignoring p(x)
     – To solve this, p(x)-dependent terms are often brought into the objective
       function, which amounts to assuming that p(y|x) and p(x) share parameters
     – Transductive SVMs, Gaussian processes, information regularization, and graph-based
       methods are in this category


                                                      ※ For more on GM, DM refer to Appendix 1.
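
To make the role of p(x) concrete, here is a sketch of the semi-supervised log-likelihood a generative mixture maximizes (standard form following Nigam et al. [3]; the notation below is mine, not from the slides):

```latex
% Log-likelihood over labeled set L and unlabeled set U for a mixture
% with parameters \theta; unlabeled points enter only through the
% marginal p(x;\theta) = \sum_y p(x \mid y;\theta)\, P(y;\theta).
\ell(\theta) = \sum_{i \in L} \log\big( p(x_i \mid y_i;\theta)\, P(y_i;\theta) \big)
             + \sum_{j \in U} \log \sum_{y} p(x_j \mid y;\theta)\, P(y;\theta)
```

Maximizing the second sum is what lets unlabeled data move the parameter estimates, and EM is the usual tool for doing so.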
Semi-Supervised Learning    Previous methods

 SSL = Semi-Supervised Learning

  • EM w/ Generative Mixture Models (Nigam et al., 2000; Miller & Uyar, 1997)
  • Self-Training
  • Co-Training and Multiview Learning (Blum & Mitchell, 1998; Goldman & Zhou, 2000)
  • TSVMs (Bennett et al., 1999; Joachims, 1999)
  • Gaussian Processes
  • Information Regularization
  • Entropy Minimization
  • Graph-based methods (Blum & Chawla, 2001)




                                                                            Ref [1], [2] reorganized

                                    ※ For more on the use of above methods, refer to Appendix 2.
Semi-Supervised Learning    Previous methods: EM w/ Generative Models (1/3)

   [Figure: the basic EM algorithm incorporating unlabeled data, from Nigam et al. [3]]
Semi-Supervised Learning    Previous methods: EM w/ Generative Models (2/3)

•   In a binary classification problem, if we assume each class has a
    Gaussian distribution, then we can use unlabeled data to help
    parameter estimation [1] (a code sketch follows below)
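
As a concrete illustration, a minimal sketch (mine, not from the slides) of EM for a two-class 1-D Gaussian mixture that folds unlabeled points in as soft-labeled data, in the spirit of Nigam et al. [3]; all names are illustrative:

```python
import numpy as np

def ssl_gmm_em(x_l, y_l, x_u, n_iter=50):
    """x_l, y_l: labeled 1-D points with {0,1} labels; x_u: unlabeled 1-D points."""
    # Initialize parameters from the labeled data alone.
    mu = np.array([x_l[y_l == k].mean() for k in (0, 1)])
    var = np.array([x_l[y_l == k].var() + 1e-6 for k in (0, 1)])
    prior = np.array([(y_l == k).mean() for k in (0, 1)])

    def gauss(x, m, v):
        return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

    resp_l = np.stack([(y_l == k).astype(float) for k in (0, 1)])  # hard labels
    x_all = np.concatenate([x_l, x_u])
    for _ in range(n_iter):
        # E-step: posterior class responsibilities for the unlabeled points.
        lik = np.stack([prior[k] * gauss(x_u, mu[k], var[k]) for k in (0, 1)])
        resp_u = lik / lik.sum(axis=0)
        # M-step: re-estimate parameters from labeled + soft-labeled points.
        for k in (0, 1):
            w = np.concatenate([resp_l[k], resp_u[k]])
            prior[k] = w.mean()
            mu[k] = (w * x_all).sum() / w.sum()
            var[k] = (w * (x_all - mu[k]) ** 2).sum() / w.sum() + 1e-6
    return mu, var, prior
```

A new point x is then classified by arg max over k of prior[k]·N(x; mu[k], var[k]).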
Semi-Supervised Learning    Previous methods: EM w/ Generative Models (3/3)
Semi-Supervised Learning    Previous methods: Co-Training (1/4)

   [Figure: the same faculty page seen through two views — the page text
   ("Professor Cho") and the anchor text of links pointing to it ("My Advisor")]
Semi-Supervised Learning    Previous methods: Co-Training (2/4)

• Key Idea: Classifier1 and Classifier2 must…
    – correctly classify the labeled examples, and
    – agree on the classification of the unlabeled examples

   [Figure: Classifier 1 (hyperlinks only) and Classifier 2 (page only)
   both examining the "Professor Cho" / "My Advisor" page]
Semi-Supervised Learning    Previous methods: Co-Training (3/4) [4]

•   Given: labeled data L, unlabeled data U
•   Loop:
     – Train g1 (hyperlink classifier) using L
     – Train g2 (page classifier) using L
     – Allow g1 to label p positive, n negative examples from U
     – Allow g2 to label p positive, n negative examples from U
     – Add these self-labeled examples to L

   [Figure: Classifier1 and Classifier2 each return an answer for the
   "Professor Cho" / "My Advisor" page]
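
A minimal runnable sketch of the loop above (my rendering of Blum & Mitchell's procedure [4]; the Naive Bayes base learners and all names are assumptions, not prescribed by the slides):

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def co_train(X1_l, X2_l, y_l, X1_u, X2_u, p=1, n=3, rounds=30):
    """View 1 = hyperlink features, view 2 = page features of the same examples."""
    L1, L2, y = X1_l, X2_l, y_l
    U = np.arange(len(X1_u))                    # indices of still-unlabeled examples
    for _ in range(rounds):
        g1 = MultinomialNB().fit(L1, y)         # train g1 (hyperlink classifier) on L
        g2 = MultinomialNB().fit(L2, y)         # train g2 (page classifier) on L
        if len(U) == 0:
            break
        for g, Xu in ((g1, X1_u), (g2, X2_u)):
            proba = g.predict_proba(Xu[U])      # columns follow classes_ = [0, 1]
            pos = U[np.argsort(-proba[:, 1])[:p]]   # p most confident positives
            neg = U[np.argsort(-proba[:, 0])[:n]]   # n most confident negatives
            picked = np.concatenate([pos, neg])
            labels = np.r_[np.ones(len(pos), int), np.zeros(len(neg), int)]
            L1 = np.vstack([L1, X1_u[picked]])  # add self-labeled examples to L
            L2 = np.vstack([L2, X2_u[picked]])
            y = np.concatenate([y, labels])
            U = np.setdiff1d(U, picked)
            if len(U) == 0:
                break
    return g1, g2
```

Each classifier teaches the other through the examples it is most confident about, which is exactly where the two independent views pay off.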
Semi-Supervised Learning    Previous methods: Co-Training (4/4)

•   Experimental settings [4]:
     –   begin with 12 labeled web pages (academic course pages)
     –   provide 1,000 additional unlabeled web pages
     –   average error, learning from labeled data only: 11.1%
     –   average error, co-training: 5.0%
Semi-Supervised Learning    Previous methods: TSVMs

   [Figure: scatter of labeled + and − points; the transductive decision boundary
   separates both labeled and unlabeled points with maximum margin, passing through
   a low-density region]
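
Since the slide itself carries only the picture, a brief note: the TSVM of Joachims (1999) treats the labels of unlabeled points as extra variables to optimize. A sketch of the standard objective (my transcription, not from the slides):

```latex
% Besides slacks \xi_i on labeled points, the labels y^*_j of the
% unlabeled points are themselves optimization variables.
\min_{w,\; b,\; y^*_1,\dots,y^*_u}\;
  \tfrac{1}{2}\lVert w \rVert^2
  + C \sum_{i \in L} \xi_i
  + C^{*} \sum_{j \in U} \xi^{*}_j
\quad \text{s.t.}\quad
  y_i\,(w \cdot x_i + b) \ge 1 - \xi_i,\qquad
  y^{*}_j\,(w \cdot x_j + b) \ge 1 - \xi^{*}_j
```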
Semi-Supervised Learning    Previous methods: Graph-based methods

• Key idea: Define a graph where…
    – nodes are the labeled and unlabeled examples in the dataset, and
    – edges (possibly weighted) reflect the similarity of examples

    – Then nodes connected by a large-weight edge tend to have the
      same label, and labels can propagate throughout the graph
      (see the sketch after this slide)


• Note: Graph-based methods enjoy nice properties from spectral
  graph theory
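
A minimal sketch of the propagation idea (mine; the RBF edge weights and the iterative clamp-and-propagate scheme follow the common recipe in Zhu's survey [1]):

```python
import numpy as np

def label_propagation(X, y, n_iter=100, sigma=1.0):
    """X: (n, d) examples; y: 0/1 for labeled points, -1 for unlabeled."""
    # Edge weights: RBF similarity between every pair of examples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    P = W / W.sum(axis=1, keepdims=True)   # row-normalized transition matrix
    f = np.where(y == 1, 1.0, 0.0)         # initial label scores
    labeled = y != -1
    for _ in range(n_iter):
        f = P @ f                          # push scores along weighted edges
        f[labeled] = (y[labeled] == 1)     # clamp the labeled nodes
    return (f > 0.5).astype(int)           # harden scores into 0/1 predictions
```

Large-weight edges dominate each row of P, which is why strongly connected nodes end up with the same label.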
Agenda




1. Background


2. Semi-Supervised Learning Methods


3. Assumptions on SSL


4. Future Work
Assumptions on SSL     The Utility of Unlabeled Data

• Many SSL papers start with an introduction like…
      “labeled data…is often very difficult and expensive to obtain, and
      thus…unlabeled data holds significant promise in terms of vastly
      expanding the applicability of learning methods [5]”
   …but is this necessarily true?
   – No! Do not take it for granted!
    – Even though you don't have to spend as much time labeling
      training data, you still need to spend considerable effort designing good
      models / features / kernels / similarity functions for SSL!


• A good match between problem structure and model assumptions is
  necessary to use unlabeled data effectively
   – A bad match can lead to degradation in classifier performance
Assumptions on SSL     An Example (1/2)

• Unlabeled Data Can Degrade Classification Performance of
  Generative Classifiers [6]




    [Figure: Naive Bayes classifiers trained on data generated from a Naive Bayes model (left)
    and a TAN model (right). Each point summarizes 10 runs of each classifier on test data;
    bars cover the 30th to 70th percentiles.]
Assumptions on SSL     An Example (2/2)

    [Figure: two class-conditional distributions, Spam=0 and Spam=1, over the
    # of the word ‘Loan’ in an e-mail]

  Q1: Is this e-mail spam?
  Q2: Was this e-mail written on a Sunday?
Agenda




1. Background


2. Semi-Supervised Learning Methods


3. Assumptions on SSL


4. Future Work
Future Work     Multi-Edge Graph-Based SSL

• Aside from semi-supervised classification, there is more…
   – Semi-Supervised Clustering
   – Semi-Supervised Regression


• There are also closely related approaches, such as…
   – Active learning


• Based on the theories noted above, here's my question:


                        f: x→y
                <x1i>, <x2i>, <x3i>, <x4i>

  Can we learn f when each example carries several feature views?
  (One possible rendering is sketched below.)
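
A heavily hedged sketch of one possible reading of the question: build one similarity graph per feature view <x1i>…<x4i>, combine the per-view edge weights, and propagate labels on the combined multi-edge graph. The combination rule and the weights `alphas` are my assumptions, not something the slides specify:

```python
import numpy as np

def multi_edge_propagation(views, y, alphas=None, n_iter=100, sigma=1.0):
    """views: list of (n, d_v) matrices, one per feature view; y: 0/1 or -1."""
    alphas = alphas or [1.0 / len(views)] * len(views)
    n = views[0].shape[0]
    W = np.zeros((n, n))
    for a, X in zip(alphas, views):
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W += a * np.exp(-d2 / (2 * sigma ** 2))   # sum the per-view edge weights
    P = W / W.sum(axis=1, keepdims=True)          # combined transition matrix
    f, labeled = np.where(y == 1, 1.0, 0.0), y != -1
    for _ in range(n_iter):
        f = P @ f                                 # propagate on the multi-edge graph
        f[labeled] = (y[labeled] == 1)            # clamp labeled nodes
    return (f > 0.5).astype(int)
```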
Future Work   Multi-Edge Graph-Based SSL

   [Figures: Ex1 and Ex2 — two illustrative examples]
Any Questions?




Appendix 1          GM vs. DM

•   Discriminative models
     –   Methodology: introduce a decision boundary
     –   From the 1950s, when pattern recognition (PR) was first applied to interpreting radar
         signals, until the mid-1990s, this was effectively the dominant approach in PR
     –   Rosenblatt's Perceptron (1958) and the PDP school's MLP (1986) were likewise
         proposed from this viewpoint


•   Generative models
     –   First introduced in 1996 by Geoffrey Hinton, a core member of the PDP school (Hinton, G.,
         Using Generative Models for Handwritten Digit Recognition, tPAMI, 1996.)
     –   As a result, unsupervised learning, which had been regarded as little more than clustering,
         received renewed attention, and soon gained an ally in subspace analysis (e.g., PCA),
         developing rapidly
     –   The view here is that classes need not lie apart from one another, so rather than a
         boundary one should describe the data well with component distributions, i.e., a mixture
         of bases (e.g., a Fourier series)
Appendix 2           The Use of SSL Methods [1]

•   Do the classes produce well clustered data?
     –   EM w/ generative mixture models


•   Is the existing supervised classifier complicated and hard to modify?
     –   Self-training


•   Do the features naturally split into two sets?
     –   Co-training


•   Already using SVM?
     –   TSVMs


•   Is it true that two points with similar features tend to be in the same class?
     –   Graph-based methods
References


[1] Zhu, X., (2005). Semi-Supervised Learning Literature Survey, Computer Sciences,
    University of Wisconsin-Madison.
[2] Seeger, M., (2001). Learning with labeled and unlabeled data (Technical Survey).
[3] Nigam, K., McCallum, A. K., Mitchell, T. M., (2000). Text Classification from
    Labeled and Unlabeled Documents using EM, Machine Learning 39, 103-134.
[4] Mitchell, T. M., (1999). The Role of Unlabeled Data in Supervised Learning, Sixth
    International Colloquium on Cognitive Science.
[5] Raina, R., Battle, A., Packer, B., Ng, A. Y., (2007). Self-taught Learning: Transfer
    Learning from Unlabeled Data, 24th International Conference on Machine Learning.
[6] Cozman, F. G., Cohen, I., Cirelo, M., (2002). Unlabeled data can degrade
    classification performance of generative classifiers, FLAIRS-02.
[7] Balcan, M., Blum, A., Choi, P. P., Lafferty, J., Pantano, B., Rwebangira, M. R.,
    Zhu, X., (2005). Person Identification in Webcam Images: An Application of Semi-
    Supervised Learning, Proc. of the 22nd ICML Workshop on Learning with Partially
    Classified Training Data, Bonn, Germany.
