Convex Hull Approximation of Nearly Optimal Lasso Solutions
Satoshi Hara, Takanori Maehara
PRICAI'19
Background: Lasso and Enumeration
■ Lasso — a typical approach to feature selection:
  min_β (1/2)‖Xβ − y‖₂² + ρ‖β‖₁ =: f(β),  (X, y) ∈ ℝ^{n×d} × ℝ^n
■ Enumeration for feature selection [Hara & Maehara, AAAI'17]
  • Helpful for gaining more insights into the data.

Ordinary Lasso
  • One global optimum, i.e., one feature set, is obtained.
  "I found one feature set that is helpful for predicting energy consumption.
   Found: {Wall Area, Glazing Area}"

Enumeration of Lasso
  • Several possible solutions, i.e., multiple feature sets, are obtained.
  "I found several feature sets that are helpful for predicting energy consumption.
   Found: {Wall Area, Glazing Area}, {Wall Area, Overall Height}, {Roof Area, Glazing Area}, …"
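The Lasso objective above can be minimized with a simple proximal-gradient (ISTA) loop. Below is a minimal pure-Python sketch for a small dense problem; the function names, step size, and iteration count are illustrative choices, not part of the paper.

```python
def soft_threshold(v, t):
    """Soft-thresholding: the proximal operator of t * |.|_1."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_ista(X, y, rho, eta=0.1, iters=2000):
    """Minimize (1/2)*||Xb - y||^2 + rho*||b||_1 by proximal gradient (ISTA)."""
    n, d = len(X), len(X[0])
    b = [0.0] * d
    for _ in range(iters):
        # residual r = Xb - y and gradient g = X^T r of the smooth part
        r = [sum(X[i][j] * b[j] for j in range(d)) - y[i] for i in range(n)]
        g = [sum(X[i][j] * r[i] for i in range(n)) for j in range(d)]
        b = [soft_threshold(b[j] - eta * g[j], eta * rho) for j in range(d)]
    return b

# With an identity design X = I, the exact solution is soft_threshold(y, rho).
beta = lasso_ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], rho=1.0)  # -> approx [2.0, 0.0]
```

Note how the ℓ₁ term zeroes out the second coefficient: this sparsification is what makes Lasso a feature-selection method.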
Background: Lasso and Enumeration
■ Example: Lasso enumeration on the 20 Newsgroups data
  • Identifying relevant words for article classification.
Selected words in the Lasso solution:
  adb apple bios bus cable com controller dos drivers duo fpu gateway ibm ide
  mac motherboard simm vlb vram windows
Background: Lasso and Enumeration (cont.)
[Figure: "Enumerated Models" — nine enumerated models (Model 1–9), each shown as the Lasso solution with a few words removed (e.g., motherboard, cable, adb, drivers).]
Drawback of Enumeration
Enumerated models can be mere combinations of a few representative patterns: exponentially many combinations of similar models may be found, and these similar models are not helpful for gaining insights.
Goal of This Study
■ Goal
  Find a small number of diverse models (rather than a large number of similar ones).
■ Overview of the Proposed Approach
  • Define a set of good models: B(ε) := {β : f(β) ≤ ε}.
  • Find the vertices of B(ε).
    Vertices = sparse models.
    Vertices are distinct → diversity.
Outline
n Background and Overview
n Problem Formulation
n Proposed Method
n Experiments
n Summary
Properties of B(ε)
■ B(ε) := {β : f(β) := (1/2)‖Xβ − y‖₂² + ρ‖β‖₁ ≤ ε}
  • The set of models with sufficiently small Lasso objective values.
1. B(ε) consists of smooth boundaries and non-smooth vertices.
  • Smooth boundaries = dense models
  • Non-smooth vertices = sparse models
2. The convex hull of the set of vertices Q can approximate B(ε) well:
  • conv(Q) ≈ B(ε)
Problem: Approximation of B(ε)
■ Our Approach
  Approximate B(ε) by a set of K points V = {v_j}_{j=1}^K.
■ To attain a good approximation, the vertices Q of B(ε) should be selected as V.
Problem: Approximation of B(ε)
■ Our Approach
  Approximate B(ε) by a set of K points V = {v_j}_{j=1}^K.
■ Question: How do we measure the approximation quality between B(ε) and V?
  → We use the Hausdorff distance.
Problem: Approximation of B(ε)
■ Def. Hausdorff distance between two sets S and S′
  • The maximum margin of the non-overlapping region:
  d_H(S, S′) := max{ sup_{x∈S} inf_{x′∈S′} ‖x − x′‖, sup_{x′∈S′} inf_{x∈S} ‖x − x′‖ }
■ We measure the approximation quality by d_H.

Problem: Minimization of the Hausdorff distance
  min_V d_H(conv(V), B(ε))  s.t. |V| ≤ K
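For finite point sets, the sup/inf in the definition reduce to max/min, so the Hausdorff distance can be computed directly. A short sketch (helper names are illustrative, not from the paper):

```python
def euclid(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets A and B."""
    d_ab = max(min(euclid(a, b) for b in B) for a in A)  # how far A sticks out of B
    d_ba = max(min(euclid(a, b) for a in A) for b in B)  # how far B sticks out of A
    return max(d_ab, d_ba)

# B covers A's first point exactly but misses (1, 0) by distance 1.
d = hausdorff([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0)])  # -> 1.0
```

The two directed terms matter: dropping either one would let an approximation look perfect while leaving large parts of the target set uncovered.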
Outline
n Background and Overview
n Problem Formulation
n Proposed Method
n Experiments
n Summary
Method: Sampling + Greedy Selection
■ Step 1: Sample points from the boundary of B(ε).
■ Step 2: Greedily select K points to minimize d_H.
[Figure: Step 1 Sampling → Step 2 Greedy Selection]
Step 1: Sampling
■ Note: We want to sample as many vertices as possible.
■ Proposed Sampling Method
  • Take a random direction.
  • Find an "edge" of B(ε) in that direction.
This method samples vertices with high probability.
Step 1: Sampling
■ Finding an "edge":
  max_β c⊤β  s.t. β ∈ B(ε)   (c: random direction)
■ Finding an "edge" by binary search
  • Dual problem:
    min_{λ≥0} max_β c⊤β − λ(f(β) − ε)
    — the inner maximization is solvable with standard Lasso solvers.
  • Find β satisfying f(β) = ε by locating the optimal λ with binary search
    (too large a λ stays strictly inside B(ε); too small a λ overshoots it).
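The binary search over λ can be illustrated on a toy problem where the inner maximization has a closed form. Here f is a smooth stand-in for the Lasso objective (in the actual method the inner problem is a Lasso solve), and all names are illustrative:

```python
def f(b):
    """Toy stand-in objective: minimum value 0 at b = 1."""
    return 0.5 * (b - 1.0) ** 2

def inner_max(lam, c=1.0):
    """Closed-form maximizer of c*b - lam*(f(b) - eps) over b:
    setting c - lam*(b - 1) = 0 gives b = 1 + c/lam."""
    return 1.0 + c / lam

def edge_by_bisection(eps, lo=1e-6, hi=1e6, iters=200):
    """Find lam with f(inner_max(lam)) = eps.
    f(inner_max(lam)) decreases monotonically in lam, so bisection applies."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(inner_max(mid)) > eps:
            lo = mid   # still outside B(eps): increase lam
        else:
            hi = mid   # inside B(eps): decrease lam
    return 0.5 * (lo + hi)

lam = edge_by_bisection(eps=0.125)   # analytically lam = 2, beta = 1.5
```

Each bisection probe costs one inner solve, so the boundary point is found with only logarithmically many (Lasso-like) optimizations.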
Method: Sampling + Greedy Selection
■ Step 1: Sample points from the boundary of B(ε).
■ Step 2: Greedily select K points to minimize d_H.
Step 2: Greedy Selection
■ Original problem:
  min_V d_H(conv(V), B(ε))  s.t. |V| ≤ K
■ Approximate B(ε) with the sampled points S:
  • B(ε) ≈ conv(S)
  min_{V⊆S} d_H(conv(V), conv(S))  s.t. |V| ≤ K
  • Remark:
  d_H(conv(V), conv(S)) = max_{x∈S} min_{x′∈conv(V)} ‖x − x′‖
Step 2: Greedy Selection
■ The problem is NP-hard in general:
  min_{V⊆S} d_H(conv(V), conv(S))  s.t. |V| ≤ K
■ Our Approach: Greedy Selection
  • Initialization
    Select one point x ∈ S; set V₁ ← {x}, S ← S ∖ {x}, and k ← 1.
  • While k < K:
    x̂ ∈ argmax_{x∈S} min_{x′∈conv(V_k)} ‖x − x′‖
    V_{k+1} ← V_k ∪ {x̂}, S ← S ∖ {x̂}, and k ← k + 1
Greedily add to V the point that most reduces the objective, i.e., the sampled point farthest from conv(V_k).
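The greedy loop can be sketched in a few lines. For brevity, this sketch measures the distance to the selected point set rather than to its convex hull (the paper's version solves a QP for the hull distance), which reduces the loop to classic farthest-point selection:

```python
def euclid(p, q):
    """Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def greedy_select(S, K):
    """Greedily pick K points: at each step add the sampled point farthest
    from the current selection (point-set distance, a simplification of the
    hull distance used in the paper)."""
    V = [S[0]]                     # initialization: pick one point
    rest = list(S[1:])
    while len(V) < K and rest:
        far = max(rest, key=lambda x: min(euclid(x, v) for v in V))
        V.append(far)
        rest.remove(far)
    return V

pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (5.0, 0.0)]
sel = greedy_select(pts, K=3)  # -> [(0.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
```

Note how the near-duplicate (0.1, 0.0) is never chosen: the max-min rule is exactly what filters out the "similar models" that plain enumeration keeps producing.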
Step 2: Greedy Selection
■ Details of computing x̂ ∈ argmax_{x∈S} min_{x′∈conv(V_k)} ‖x − x′‖
■ 1. Computing the min: Quadratic Programming (QP)
  min_{x′∈conv(V)} ‖x − x′‖ ⇔ min_α ‖x − Σ_j α_j v_j‖  s.t. α_j ≥ 0, Σ_j α_j = 1
■ 2. Computing the max: Lazy Update
  • A naïve implementation requires searching over all x ∈ S.
  • By using the monotonicity of the Hausdorff distance, we can skip redundant computations and accelerate the search.
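The inner minimization is a small QP over the simplex weights α. As a dependency-free illustration, it can also be solved approximately with Frank–Wolfe iterations; this is a sketch under that substitution, not the paper's QP solver:

```python
def dist_to_hull(x, verts, iters=3000):
    """Approximate min over x' in conv(verts) of ||x - x'|| by running
    Frank-Wolfe on the simplex weights alpha (the paper solves a QP here)."""
    m, d = len(verts), len(x)
    alpha = [1.0 / m] * m          # start from the uniform combination
    for t in range(iters):
        # current point p = sum_j alpha_j * v_j and residual r = x - p
        p = [sum(alpha[j] * verts[j][i] for j in range(m)) for i in range(d)]
        r = [x[i] - p[i] for i in range(d)]
        # linear minimization oracle: vertex most aligned with the residual
        s = max(range(m), key=lambda j: sum(verts[j][i] * r[i] for i in range(d)))
        gamma = 2.0 / (t + 2.0)    # standard diminishing step size
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[s] += gamma
    p = [sum(alpha[j] * verts[j][i] for j in range(m)) for i in range(d)]
    return sum((x[i] - p[i]) ** 2 for i in range(d)) ** 0.5

# (1, 1) projects onto the triangle's hypotenuse at (0.5, 0.5): distance sqrt(0.5)
d = dist_to_hull((1.0, 1.0), [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

Frank–Wolfe converges at an O(1/t) rate here; an exact QP solver is preferable inside the greedy loop, but the simplex-weight formulation is the same in both cases.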
Method: Sampling + Greedy Selection
■ Step 1: Sample points from the boundary of B(ε).
  • Random directions + Lasso + binary search.
■ Step 2: Greedily select K points to minimize d_H.
  • Greedy selection.
Outline
n Background and Overview
n Problem Formulation
n Proposed Method
n Experiments
n Summary
Synthetic Experiment: Visualization of B(ε) and V
■ Synthetic Problems
  • 2D ver.: X = [[1, 1], [1, 1 + 1/40]],  y = (1, 1)⊤
  • 3D ver.: X = [[1, 1, 1], [1, 1 + 1/40, 1], [1, 1, 1 + 2/40]],  y = (1, 1, 1)⊤
■ Results
  [Figure: B(ε) and the selected points for the 2D and 3D versions, with the corresponding Hausdorff distances.]
Synthetic Experiment: High-dimensional Data
■ Synthetic data
  • y = β*⊤x + ε
  • x ∼ N(0, Σ), Σ_ij = exp(−0.1|i − j|)
  • dimensionality of x = 100
■ Result
  • The Hausdorff distance decreases as K increases.
  • The Hausdorff distance also decreases as the sampling size M increases,
    though the effect is marginal; in practice, M ≈ 1,000 would suffice.
Real-Data Experiment: Diversity Verification
■ Data: 20 Newsgroups
  • Classification of news articles into two categories (ibm or mac).
  • Feature selection = identification of important words.
    x: tf-idf weighted bag-of-words vector
    y ∈ {0, 1}: article category
    # of data points: 1168
■ Model
  • Linear logistic regression + ℓ₁ penalty
■ Baseline Methods [Hara & Maehara, AAAI'17]
  • Enumeration: exact enumeration of the top-K models.
  • Heuristic: skip similar models during enumeration.
Real-Data Experiment: Diversity Verification
■ Comparison of the 500 found models
  • Number of distinct words found: Enumeration 39, Heuristic 63, Proposed 889.
  • Among the words apple / macs / macintosh, the baselines missed some (Enumeration two, Heuristic one), while the proposed method found all three.
■ Visualization with PCA
  • The found models were projected with PCA; the proposed method attained the largest diversity.
  • The baseline methods found only combinations of a few representative patterns and missed some important words.
Summary
■ Our Goal
  • Find a small number of diverse models for Lasso.
■ Our Method
  • Find the "vertices" of the set of models B(ε) := {β : f(β) ≤ ε}.
  • Problem: Hausdorff-distance minimization.
  • Method: sampling + greedy selection.
■ We verified the effectiveness of the proposed method: it finds points that approximate B(ε) well and obtains more diverse models than the existing enumeration methods.
GitHub: /sato9hara/LassoHull
