Novelty generation with deep learning
Presented by: Cherti Mehdi
joint work with Balázs Kégl and Akın Kazakçı
• Research questions
• Why machine learning and deep learning?
• Generating new types of objects (C) using
previously acquired knowledge (K)
• Evaluating novelty with out-of-class generation
metrics
• Perspectives
Roadmap
Summary
• Recently, generative models have gained
momentum, but such models are used almost
exclusively in prediction pipelines.
• Our objective is to study
• a) whether such models can be used to generate
novelty
• b) how to evaluate their capacity for generating
novelty
Research questions
• What is meant by the generation of novelty?
• How can novelty be generated?
• How can a model generating novelty be evaluated?
Why machine learning and deep learning?
• Knowledge is important: machine learning enables
the study of creativity in relation to knowledge
• Generative modeling: we want to generate objects
• Composition of features is important: deep
learning models can automatically learn a
hierarchy of features of growing abstraction
from raw data
My thesis focuses on deep generative models.
Generating new types of objects
In Kazakçı et al. 2016:
• We show that symbols of new types can be
generated by carefully tuned autoencoders
• We take a first step toward defining the conceptual
and experimental framework of novelty
generation
Generating new types of objects:
autoencoders
• Autoencoders have existed for a long
time (Kramer 1991)
• Deep variants are more recent (Hinton &
Salakhutdinov 2006; Bengio 2009)
• A deep autoencoder learns
successive transformations that
decompose and then recompose a
set of training objects
• The depth allows learning a hierarchy
of transformations
• Two ways of learning an autoencoder: with an
undercomplete (bottleneck) or an
overcomplete representation
Slide adapted from Kazakçı et al. 2016
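To make the encode/decode pair concrete, here is a minimal linear autoencoder sketch in numpy (a toy illustration with made-up dimensions, not the architecture used in the paper); with code_dim < input_dim it is the undercomplete (bottleneck) case:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, code_dim = 16, 4  # code_dim < input_dim: undercomplete (bottleneck)
W_enc = rng.normal(scale=0.1, size=(input_dim, code_dim))
W_dec = rng.normal(scale=0.1, size=(code_dim, input_dim))

def encode(x):
    # decompose the input into a lower-dimensional code
    return np.tanh(x @ W_enc)

def decode(h):
    # recompose an input-sized image from the code
    return h @ W_dec

x = rng.normal(size=(1, input_dim))
recon = decode(encode(x))
loss = np.mean((x - recon) ** 2)  # the reconstruction error to minimize
print(recon.shape)  # (1, 16)
```

Training would adjust W_enc and W_dec (by gradient descent) to minimize this reconstruction error over the training set.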
Generating new types of objects:
autoencoders with undercomplete
representation
[Figure: deep autoencoder with a bottleneck, from Hinton, G. E., & Salakhutdinov, R. R. (2006) — input (dim 625) → encode → bottleneck → decode → reconstruction]
Generating new types of objects:
autoencoders with overcomplete
representation
• Autoencoders can also be learned using an
overcomplete representation
• Problem: risk of learning the identity function
• One solution: constrain the representation to be
“simple”
• Example: enforce sparsity of the representation with
sparse autoencoders
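One concrete way to enforce sparsity, in the spirit of k-sparse autoencoders (Makhzani & Frey 2013), is to keep only the k largest activations of each code and zero out the rest (a sketch of the constraint, not necessarily the exact one used in the experiments):

```python
import numpy as np

def k_sparse(h, k):
    """Keep the k largest activations in each row of h; zero the rest.
    This is the hard sparsity constraint of k-sparse autoencoders."""
    out = np.zeros_like(h)
    idx = np.argsort(h, axis=1)[:, -k:]          # indices of the k largest entries per row
    rows = np.arange(h.shape[0])[:, None]
    out[rows, idx] = h[rows, idx]
    return out

h = np.array([[0.1, 0.9, 0.3, 0.7]])
print(k_sparse(h, 2))  # [[0.  0.9 0.  0.7]]
```

Because only k units fire per image, different images end up using different small subsets of features, which is exactly the benefit described on the next slide.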
Generating new types of objects:
autoencoders with overcomplete
representation
• What does a sparse autoencoder
end up learning?
• Detect features with the encode
function
• Superpose the detected features
in the reconstructed image with
the decode function
• Benefits of overcomplete
representation with sparsity: for
each image only a small fraction
of features are used but different
images use a different subset of
features
k is the sparsity rate in %
Figure taken from Makhzani, A., & Frey, B. (2013)
Generating new types of objects:
experimental setup
• Training data: MNIST, 70,000
images of handwritten digits of
size 28×28
• We use a sparse convolutional
autoencoder trained to:
• Encode: take an image and
transform it into a sparse code
• Decode: take the sparse code
and reconstruct the image
• Training objective is to minimize
the reconstruction error
Slide adapted from Kazakçı et al. 2016
Generating new types of objects:
generating new symbols
• We use an iterative method to build symbols the net has never seen
(inspired by Bengio et al. (2013), but we do not try to avoid spurious
samples):
• Start with a random image x
• Force the network to construct (i.e. interpret) it by repeatedly
applying f(x) = decode(encode(x)), until convergence
Slide adapted from Kazakçı et al. 2016
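The iterative procedure can be sketched as a fixed-point loop; here `f` is a toy contraction standing in for a trained autoencoder's decode(encode(·)) (an assumption for illustration, so the loop is runnable without a trained model):

```python
import numpy as np

def generate(f, x0, tol=1e-6, max_iter=200):
    """Iterate x <- f(x) = decode(encode(x)) until the image stops changing.
    f is assumed to be a trained autoencoder's reconstruction function."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if np.max(np.abs(x_next - x)) < tol:
            break
        x = x_next
    return x

# Toy stand-in for decode(encode(.)): a contraction whose fixed point is 0.5
f = lambda x: 0.5 + 0.5 * (x - 0.5)
x = generate(f, np.random.rand(28, 28))  # start from a random image
print(np.allclose(x, 0.5, atol=1e-4))  # True: converged to the fixed point
```

With a real autoencoder, the fixed points are the attractors the network has learned, which is why the iteration lands on coherent symbols rather than noise.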
Generating new types of objects:
generating new symbols
• What does the iterative
generation procedure do?
• It is a non-linear path in the input
space, defined by the
autoencoder (encode + decode)
function
• It converges to fixed points
defined by the autoencoder
Figure taken from Alain and Bengio (2013)
Generating new types of objects:
Visualization of the structure of generated
images
• Colored clusters are original
digits (classes from 0 to 9)
• The gray dots are newly
generated objects
• New objects form new
clusters
• Using a clustering algorithm,
we recover coherent sets of
new symbols
Slide adapted from Kazakçı et al. 2016
Generating new types of objects
In Kazakçı et al. 2016:
• We show that symbols of new types can be generated by
carefully tuned autoencoders
• We take a first step toward defining a conceptual and
experimental framework for novelty generation
• However, we make no attempt to design evaluation metrics
A set of types (clusters) discovered by the model
Evaluating novelty
In “Out-of-class novelty generation: an experimental
foundation” :
• We design an experimental framework based on
hold-out classes
• We review and analyze the most common evaluation
techniques from the point of view of measuring
“out-of-distribution novelty” and propose new ones
• We run a large-scale experiment to study the
capacity for generating novelty of a wide set of
generative models
Evaluating novelty
Experimental framework
• We contrast two main concepts: in-class and
out-of-class generation
• in-class generation: can a model re-generate the types
already seen in the dataset? (the traditional objective)
• out-of-class generation: can a model generate an
unseen (hold-out) set of types? (a proxy for measuring
the capacity of a model to generate novelty)
• setup: we train models on a set of types (in) and look
for models that generate a hold-out set of types (out)
Evaluating novelty
Evaluation metrics
• In our experiments:
• We train models on
digits
• We look for models that
generate letters
in-class:
out-of-class:
• We pre-train a
• digit classifier (0 to 9)
• a letter classifier (a to z)
• a classifier on a mixture of digits and letters
• Our evaluation metrics report a score for the set of
objects generated by a model
Evaluating novelty
Evaluation metrics
Given a set of images, out-of-class objectness is high if:
• the letter classifier is highly confident that each
image is one of the letters (a to z)
• we define in-class objectness similarly, but using the
digit classifier
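A minimal sketch of the objectness idea: score a batch of generated images by the classifier's mean confidence in its top class (the exact formula in the paper may differ; this captures the intuition on the slide):

```python
import numpy as np

def objectness(probs):
    """Mean of the maximum class probability over a set of generated images.
    probs: array of shape (n_images, n_classes) from the letter classifier.
    High when the classifier is confident that each image is some letter.
    (A sketch of the idea; the paper's exact definition may differ.)"""
    return float(np.mean(np.max(probs, axis=1)))

confident = np.array([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]])
uncertain = np.full((2, 3), 1 / 3)
print(objectness(confident), objectness(uncertain))  # 0.9 vs ~0.33
```

Feeding the same generated set to the digit classifier instead gives the in-class variant.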
Evaluating novelty
Evaluation metrics
Given a set of images, out-of-class max and out-of-class
count are high if:
• the mixture classifier (digits and letters) is highly
confident that each image is one of the letters (a
to z)
• we define in-class max and in-class count similarly,
but for digits
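These two metrics can be sketched as follows, assuming the mixture classifier outputs probabilities over 36 classes with a hypothetical layout of digits first, letters second (the class layout and exact definitions are illustrative assumptions, not the paper's code):

```python
import numpy as np

# Hypothetical layout of the mixture classifier's 36 outputs:
# indices 0-9 are digit classes, 10-35 are letter classes.
LETTERS = slice(10, 36)

def out_of_class_max(probs):
    """Average over images of the highest letter-class probability.
    (A sketch of the idea; the paper's exact definitions may differ.)"""
    return float(np.mean(np.max(probs[:, LETTERS], axis=1)))

def out_of_class_count(probs, threshold=0.5):
    """Fraction of images on which some letter class gets high confidence."""
    return float(np.mean(np.max(probs[:, LETTERS], axis=1) > threshold))

# One image confidently a letter, one confidently a digit
probs = np.zeros((2, 36))
probs[0, 12] = 1.0  # a letter class
probs[1, 3] = 1.0   # a digit class
print(out_of_class_max(probs), out_of_class_count(probs))  # 0.5 0.5
```

Restricting the slice to the digit indices instead yields the in-class max and in-class count.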
Evaluating novelty
Evaluation metrics
• We run a large-scale experiment in which we train
~1000 models by varying their parameters
• From each model, we generate 1000 images, then
we evaluate the model using our proposed metrics
• We collect a total of ~1,000,000 generated images
Experiments
• We evaluate the
evaluators with
human assessment
• We build an
annotation tool to
check whether the
models selected by
our evaluation
metrics are
indeed good
Experiments
Evaluating the evaluators
• we found that selecting models for in-class generation makes
them memorize the classes they are trained to sample from
• we did succeed in finding models that lead to out-of-class novelty
• Pangram obtained from the above model:
Experiments
Results
• The main focus was setting up the experimental
pipeline and analyzing various quality metrics
designed to measure out-of-distribution novelty
• The immediate next goal is to analyze the models
in a systematic way
Perspectives
Thank you for listening!
