On the Limitations of Representing Functions on Sets
Kaushalya Madhawa

Murata Group
[Wagstaff+, ICML 2019]
Contributions
“We point out that some of the previous
work is therefore of limited practical
relevance, but regard it as mathematically
interesting.”
Functions on sets?
• Permutation-invariance: f(xπ(1), …, xπ(M)) = f(x1, …, xM) for every permutation π

• Permutation-equivariance: if f(x1, …, xM) = (y1, …, yM), then f(xπ(1), …, xπ(M)) = (yπ(1), …, yπ(M)) for every permutation π
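The two properties can be checked numerically on toy functions. The sketch below is only an illustration: `total` and `double` are hypothetical stand-ins chosen here (a sum is invariant, an element-wise map is equivariant), not functions from the slides.

```python
import numpy as np

def total(x):
    """Permutation-invariant: the sum ignores the order of the inputs."""
    return x.sum()

def double(x):
    """Permutation-equivariant: acts on each element independently."""
    return 2.0 * x

x = np.array([1.0, 2.0, 3.0])
perm = np.array([2, 0, 1])  # an arbitrary reordering of the indices

# Invariance: f(x[perm]) equals f(x).
invariant_ok = bool(np.isclose(total(x[perm]), total(x)))

# Equivariance: f(x[perm]) equals f(x)[perm].
equivariant_ok = bool(np.array_equal(double(x[perm]), double(x)[perm]))

print(invariant_ok, equivariant_ok)
```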
Sum-decomposition
• Proposed in the “Deep Sets” architecture [Zaheer+, 2017]

• f is sum-decomposable via Z if f(X) = ρ(Σx∈X ϕ(x)) for some ϕ : 𝔛 → Z and ρ : Z → ℝ

• ϕ and ρ can be modeled with NNs

* X is a set (no repeated elements)

[Figure: Input X ∈ ℝ^M with elements x1, …, xM → ϕ gives ϕ(x1), …, ϕ(xM) in ℝ^(N×M) → sum (+) gives a point of Z in ℝ^N → ρ gives the Output f(x1, …, xM) in ℝ, with ϕ : ℝ → Z and ρ : Z → ℝ]
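The decomposition f(X) = ρ(Σ ϕ(x)) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the architecture from the paper: the latent dimension `N = 4`, the random weights, and the single-layer `tanh` forms of `phi` and `rho` are all hypothetical choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4  # latent dimension of Z (assumed for illustration)

# Hypothetical random weights standing in for trained networks.
W_phi = rng.normal(size=(1, N))
b_phi = rng.normal(size=N)
w_rho = rng.normal(size=N)

def phi(x):
    """Embed each scalar element: input shape (M,), output shape (M, N)."""
    return np.tanh(x[:, None] @ W_phi + b_phi)

def rho(z):
    """Map the summed latent vector, shape (N,), to a scalar output."""
    return float(np.tanh(z) @ w_rho)

def f(x):
    """Sum-decomposition: apply phi per element, sum over the set, apply rho."""
    return rho(phi(x).sum(axis=0))

x = np.array([0.3, -1.2, 2.0])
print(f(x), f(x[::-1]))  # equal up to floating-point rounding
```

Because the only interaction between elements is the sum, reordering the input cannot change the output, whatever `phi` and `rho` are.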
Countable vs Uncountable domains
𝔛 is either

• Countable: the number of elements is smaller than or equal to the number of elements in ℕ

  • e.g. ℕ, ℚ

• Uncountable: the number of elements is greater than the number of elements in ℕ

  • e.g. ℝ
What’s missing in “Deep Sets”?
• “Deep Sets” considers functions on countable domains

• Universal approximation theorem:

• “Any continuous function can be approximated by a
neural network.”

• The universal approximation theorem for neural networks relies on continuity on ℝ
Continuity in a countable domain does not
guarantee continuity in an uncountable domain!
A theoretical guarantee of continuity on ℚ is weak, and does not imply continuity on ℝ

• The universal approximation theorem relies on continuity on ℝ

• Example: ψ is continuous at every rational point in [0, ln 4], but discontinuous in ℝ
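The gap between continuity on ℚ and continuity on ℝ can be seen with a simpler stand-in than ψ (which is not reproduced here). The sketch assumes a hypothetical step function that jumps at √2: since √2 is irrational, the function is locally constant around every rational, hence continuous on ℚ, yet rationals arbitrarily close together receive different values, so no continuous extension to ℝ exists.

```python
from fractions import Fraction

def step_at_sqrt2(q):
    """Return 1 if q > sqrt(2), else 0, using exact rational arithmetic."""
    return 1 if q > 0 and q * q > 2 else 0

def sqrt2_convergents(n):
    """First n continued-fraction convergents of sqrt(2): 1, 3/2, 7/5, ..."""
    p, q = 1, 1
    out = []
    for _ in range(n):
        out.append(Fraction(p, q))
        p, q = p + 2 * q, p + q
    return out

convergents = sqrt2_convergents(10)
values = [step_at_sqrt2(c) for c in convergents]

# Distance between the last two rationals, which lie on opposite sides of sqrt(2).
gap = abs(convergents[-1] - convergents[-2])

print(values)
print(float(gap))
```

The convergents alternate between the two sides of √2, so the function values keep flipping while the inputs get closer and closer.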
Practical Implications
Necessary and sufficient conditions

1. A latent dimensionality of M is sufficient for representing
all continuous permutation-invariant functions on sets of
size ≤ M. 

2. To guarantee that all continuous permutation-invariant
functions can be represented for sets of size ≤ M, a
latent dimensionality of at least M is necessary.

The latent space in which the summation happens must be
chosen to have dimension at least M.
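One way to see why a latent dimension of M can suffice is a power-sum embedding. The sketch below assumes ϕ(x) = (x, x², …, x^M) for a set of exactly M real numbers; this is a classical construction used here for illustration, and the function names are hypothetical. The summed vector determines the set, which the code recovers through Newton's identities and polynomial root finding.

```python
import numpy as np

def power_sum_embedding(xs):
    """Sum over the set of phi(x) = (x, x^2, ..., x^M), with M = len(xs)."""
    xs = np.asarray(xs, dtype=float)
    powers = np.arange(1, len(xs) + 1)
    return (xs[:, None] ** powers).sum(axis=0)  # shape (M,)

def recover_set(p):
    """Invert the embedding via Newton's identities and polynomial roots."""
    M = len(p)
    e = [1.0]  # elementary symmetric polynomials e_0, ..., e_M
    for k in range(1, M + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)
    # prod_i (t - x_i) = sum_k (-1)^k e_k t^(M-k)
    coeffs = [(-1) ** k * e[k] for k in range(M + 1)]
    return np.sort(np.real(np.roots(coeffs)))

xs = [0.5, -1.0, 2.0]
z = power_sum_embedding(xs)   # latent vector of dimension M = 3
recovered = recover_set(z)
print(z, recovered)
```

Root finding becomes numerically fragile as M grows, so this is meant only as a small-set demonstration that the sum loses no information.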
Experiments
• Given a set of M elements, predict its median

• Input dimension M and latent dimension N varied 

• ϕ and ρ modeled with MLPs

• The mapping doesn’t need to be injective. (solvable with N<M)

[Figure: RMSE vs. N (latent dim), log-log axes, one curve per set size (15, 30, 60, 100, 200, 300, 400, 500)]

[Figure: critical latent dim Nc vs. set size M]
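The median task can be set up along the following lines. This is a sketch under assumptions: the uniform sampling distribution, the dataset size, and the helper names `make_median_dataset` and `rmse` are hypothetical and not taken from the slides. It builds the inputs, the median targets, and the RMSE metric, and scores a trivial mean-of-set baseline for reference.

```python
import numpy as np

def make_median_dataset(num_sets, set_size, seed=0):
    """Sample sets of `set_size` reals (uniform on [0, 1), an assumption)
    and return them together with their medians as regression targets."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(num_sets, set_size))
    y = np.median(X, axis=1)
    return X, y

def rmse(pred, target):
    """Root-mean-square error between predictions and targets."""
    diff = np.asarray(pred) - np.asarray(target)
    return float(np.sqrt(np.mean(diff ** 2)))

X, y = make_median_dataset(num_sets=1000, set_size=15)

# Trivial baseline: predict the mean of each set instead of its median.
baseline = rmse(X.mean(axis=1), y)
print(baseline)
```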
Conclusion
• Importance of the continuity requirements on uncountable domains

• Necessary and sufficient condition for universal function representation: the latent space dimension should be at least as large as the maximum input set size

• Permutation-equivariance is yet to be addressed
References
• Wagstaff, Edward, et al. "On the Limitations of Representing Functions on Sets." arXiv preprint arXiv:1901.09006 (2019).

• Zaheer, Manzil, et al. "Deep sets." Advances in Neural Information Processing Systems. 2017.

• DeepSets: Modeling Permutation Invariance. https://www.inference.vc/deepsets-modeling-permutation-invariance/
