A factorial study of neural network learning from differences for regression

@mdaquin - ICCBR 2022 - 14/09/2022
Mathieu d’Aquin, Emmanuel Nauer, Jean Lieber
K Team, LORIA, France
k.loria.fr – @mdaquin

Example
Input: 2.6, 0.3, 1.1, 0.4, 1.5
Output: 1.7 (real value: 2.7)

Example
Input: 2.6, 0.3, 1.1, 0.4, 1.5
Output: 2.1 (real value: 2.7)

A: 2.6, 0.3, 1.1, 0.4, 1.5 (real value A: 2.7)
B: 2.8, 0.1, 1.2, 0.1, 1.5 (real value B: 2.8)
A-B: -0.2, 0.2, -0.1, 0.3, 0.0
Output: 0.2 (VA-VB: -0.1)
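
The difference example can be reproduced in a few lines. This is only an illustrative sketch: the dictionary layout and the function name `difference_example` are assumptions made here, not something taken from the slides. It computes the feature-wise difference A-B and the difference VA-VB between the real values.

```python
# Cases A and B from the slide: five problem features and one real value each.
# The dictionary layout is an assumption made for this sketch.
case_a = {"problem": [2.6, 0.3, 1.1, 0.4, 1.5], "solution": 2.7}
case_b = {"problem": [2.8, 0.1, 1.2, 0.1, 1.5], "solution": 2.8}


def difference_example(a, b, ndigits=6):
    """Return (problem difference a-b, solution difference a-b).

    Values are rounded only to hide floating-point noise in the printout.
    """
    dx = [round(x - y, ndigits) for x, y in zip(a["problem"], b["problem"])]
    dy = round(a["solution"] - b["solution"], ndigits)
    return dx, dy


dx, dy = difference_example(case_a, case_b)
print(dx)  # [-0.2, 0.2, -0.1, 0.3, 0.0]
print(dy)  # -0.1
```

A network trained from differences would take `dx` as input and be trained to output `dy`.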

What happens when we do that?
How does it affect the performance of the network?
How does it affect the settings of hyperparameters?

Case difference heuristic (CDH)
In CBR, corresponds to learning how to derive differences in
solutions from differences in problems of the case base.
Several approaches, including:
- Rule mining in the case base (adaptation knowledge
acquisition)
- Training a neural network on differences between problems
and solutions

Some variants:
- Train the network with differences Pb1-Pb2 from the case
base, where Pb2 is a problem similar to Pb1. Predict
SolTarget-SolSource from PbTarget-PbSource
- Use multiple pairs Pb1-Pb2_i for training. Average the
predictions from multiple PbTarget-PbSource_i
- Train on (Pb1, Pb1-Pb2), and predict using (PbTarget,
PbTarget-PbSource)
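
The three variants can be sketched with plain Python. Everything named here is hypothetical and chosen for illustration: `nearest`, `build_training_set`, `predict`, the Euclidean distance used for similarity, and the toy case base. The network itself is replaced by a stand-in function `exact_diff`, so the sketch only shows how the training pairs are built and how a prediction is assembled from source cases.

```python
import math


def nearest(case_base, problem, k, exclude=None):
    """Indices of the k cases whose problems are closest to `problem`.

    Euclidean distance is an assumption of this sketch.
    """
    candidates = [i for i in range(len(case_base)) if i != exclude]
    candidates.sort(key=lambda i: math.dist(case_base[i][0], problem))
    return candidates[:k]


def diff(p, q):
    """Feature-wise difference p - q."""
    return [a - b for a, b in zip(p, q)]


def build_training_set(case_base, n_pairs=1, use_context=False):
    """Pair each Pb1 with its n_pairs most similar Pb2_i.

    Inputs are Pb1-Pb2_i, or (Pb1, Pb1-Pb2_i) when use_context is True.
    Targets are Sol1-Sol2_i.
    """
    inputs, targets = [], []
    for i, (pb1, sol1) in enumerate(case_base):
        for j in nearest(case_base, pb1, n_pairs, exclude=i):
            pb2, sol2 = case_base[j]
            delta = diff(pb1, pb2)
            inputs.append(list(pb1) + delta if use_context else delta)
            targets.append(sol1 - sol2)
    return inputs, targets


def predict(case_base, target_pb, predict_diff, n_sources=1, use_context=False):
    """Average SolSource_i + predicted difference over n_sources source cases."""
    estimates = []
    for j in nearest(case_base, target_pb, n_sources):
        src_pb, src_sol = case_base[j]
        delta = diff(target_pb, src_pb)
        x = list(target_pb) + delta if use_context else delta
        estimates.append(src_sol + predict_diff(x))
    return sum(estimates) / len(estimates)


# Toy case base where the solution is 2*x0 + x1 (made up for this sketch).
case_base = [([0.0, 0.0], 0.0), ([1.0, 0.0], 2.0),
             ([0.0, 1.0], 1.0), ([1.0, 1.0], 3.0)]


def exact_diff(d):
    """Stand-in for a trained network: the exact difference for the toy data."""
    return 2 * d[0] + d[1]


X, y = build_training_set(case_base, n_pairs=2)
print(len(X), len(y))  # 8 8
print(predict(case_base, [0.5, 0.5], exact_diff, n_sources=3))  # 1.5
```

In practice `predict_diff` would be the trained network's prediction function, and `X`, `y` would be its training data.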

Some existing results
Liao and Liu, AIKE 2018: Feasibility and impact on the size of the
case base used (training data).
Leake et al., AAAI 2021: CDH with a neural network outperforms
KNN and CDH with rules.
Ye et al., ICCBR 2021: CDH in some cases can better generalise
(perform better on novel queries).

Does it achieve the same/better performance?
Does it require less data for training?
Does it converge faster or slower?
Should we use 1 case or multiple cases for training
and when applying the model?

A factorial study
Training a neural network in the typical way and from
differences on multiple tasks/datasets, varying several
factors to answer the previous questions.

The datasets
Used cars (variants: Toyota and Vauxhall): model, year, fuel type, transmission type, fuel consumption, mileage, tax band → price sold
Airfoil: frequency, angle, chord length, velocity, displacement, thickness → noise level
Students (variants: Maths and Portuguese): demographics, family situation, transport, etc. → grade obtained
Flights (variants: Economy and Business): date and time, duration, origin and destination, airline, number of stops → price of seat

The factors varied
test_size: The dataset (before computing differences) is split into a
training set and a test set. The higher the test set size, the lower the amount
of data used for training.
epochs: The number of iterations over the dataset given to the neural
network for training.
ntr: For any pb1 in the training set (case base), the number of pbi used to
create the difference-based training set from (pb1-pbi, sol1-soli)
nte: The number of similar cases used in the case base (training set) from
which to predict the difference for a given target problem. Results are
averaged over all nte solutions obtained.
use context: Whether or not the original features of the problem are
included with the difference at training time and prediction time.
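
A factorial design over these factors could be enumerated as below. This is a sketch under assumptions: the levels listed for each factor are hypothetical placeholders, not the values used in the study, and `configurations` and `split` are helper names introduced here. The split is done on the original cases, before any differences are computed, as described for test_size.

```python
import random
from itertools import product

# Hypothetical levels for each factor (placeholders, not the study's values).
factors = {
    "test_size": [0.1, 0.5, 0.9],
    "epochs": [10, 100],
    "ntr": [1, 5],
    "nte": [1, 5],
    "use_context": [False, True],
}


def configurations(factors):
    """Yield one settings dictionary per combination of factor levels."""
    names = list(factors)
    for values in product(*(factors[name] for name in names)):
        yield dict(zip(names, values))


def split(cases, test_size, seed=0):
    """Shuffle the cases and split them into (training set, test set)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


configs = list(configurations(factors))
print(len(configs))  # 3 * 2 * 2 * 2 * 2 = 48

train, test = split(range(20), configs[0]["test_size"])
print(len(train), len(test))  # 18 2
```

Each configuration would then drive one training run, with the training part of the split serving as the case base.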

Almost 100K models trained both in a typical way, from
differences and from differences+context.
[Table: factor settings tested for each dataset]

So, does it need less data?
[Plot: R2 on Flights/economy for Base NN, Diff. no context, and Diff. + context]
[Plots: best results for students/maths and for airfoil]

[Plot: R2 on Used cars/Toyota]

[Plots: best results for students/maths and for airfoil]

[Plots: best results for flights/economy and for flights/business]

Number of cases used

Summary and the use of context
[Table: minimal settings to reach within 0.2% of the best result]
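
One possible reading of this selection rule is sketched below. Several things are assumptions made for illustration only: that "within 0.2%" is relative to the best score, that "minimal" means the smallest (epochs, ntr, nte) in that order, the function name `minimal_settings`, and the scores in `results`, which are made-up numbers and not results from the study.

```python
def minimal_settings(results, tolerance=0.002, cost=None):
    """Return the cheapest (settings, score) whose score is within
    `tolerance` (relative) of the best score.

    `results` is a list of (settings dict, score) pairs. The default cost
    ordering (epochs, then ntr, then nte) is an assumption of this sketch.
    """
    if cost is None:
        def cost(settings):
            return (settings["epochs"], settings["ntr"], settings["nte"])
    best = max(score for _, score in results)
    threshold = best - abs(best) * tolerance
    good_enough = [(s, r) for s, r in results if r >= threshold]
    return min(good_enough, key=lambda pair: cost(pair[0]))


# Made-up scores, for illustration only.
results = [
    ({"epochs": 10, "ntr": 1, "nte": 1}, 0.9000),
    ({"epochs": 50, "ntr": 1, "nte": 1}, 0.9490),
    ({"epochs": 50, "ntr": 5, "nte": 5}, 0.9500),
    ({"epochs": 200, "ntr": 5, "nte": 5}, 0.9505),
]

settings, score = minimal_settings(results)
print(settings, score)  # {'epochs': 50, 'ntr': 1, 'nte': 1} 0.949
```

A different cost function can be passed through `cost` if another notion of "minimal" is intended, for example one that also counts the amount of training data.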

Conclusion
In all cases, learning from differences obtained at least similar peak performance.
On average, it requires more or less the same amount of data.
But it is easier to converge when learning from differences.
Adding context is not always beneficial, but is never significantly
detrimental.
The ideal number of cases used during training and during
retrieval depends on the specific scenario.
https://github.com/mdaquin/deltaML

Limitations
The networks tested are similar to each other and relatively
simple.
Faster to converge does not mean less time required: Finding the
most similar cases to build the difference-based training set can
take a lot of time
→ needs a faster, more scalable KNN implementation
Not tested on classification tasks, or tasks where the differences
might rather be computed on the “representation layer”
(embeddings, CNN) than on the raw data.

mathieu.daquin@loria.fr
@mdaquin
k.loria.fr

[Diagram: retrieval links the target problem to a source problem; adaptation derives the target solution from the source solution]
