TUKE System for MediaEval 2014 QUESST

Technical University of Košice
Laboratory of Speech Technologies in Telecommunications
Jozef VAVREK, Peter VISZLAY, Martin LOJKA, Matúš PLEVA, and Jozef JUHÁR
Department of Electronics and Multimedia Communications
Technical University of Košice, Slovak Republic
{Jozef.Vavrek, Peter.Viszlay, Martin.Lojka, Matus.Pleva, Jozef.Juhar}@tuke.sk
System Overview diagram labels: Zero-resource approaches; speech corpora Timit, ParDat1, SpeechDat CZ, SpeechDat SK
Tab.1 Evaluation of primary low-resource (p-low) and general zero-resource (g-zero) systems (* indicates late submission)

          dev                             eval
system    Cnxe (act/min)  TWV (act/max)   Cnxe (act/min)  TWV (act/max)
p-low     0.959/0.891     0.161/0.162     0.960/0.892     0.154/0.154
g-zero    0.973/0.934     0.091/0.091     0.974/0.934     0.075/0.077
p-low*    0.947/0.853     0.191/0.191     0.948/0.854     0.168/0.169
g-zero*   0.970/0.921     0.106/0.107     0.971/0.922     0.102/0.103

Acknowledgments

Tab.2 Processing resources measures
Searching Algorithm (Weighted Fast Sequential DTW, WFS-DTW):
1) one-step-forward moving strategy: each DTW search is carried out
sequentially, block by block, with the block size equal to the query length;
2) linear time-aligned accumulated distance, which speeds up sequential DTW
without considerable loss in retrieval performance;
3) optimization of the global minimum over the set of alignment paths by
means of a weighted cumulative distance (WCD) parameter.
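
The block-by-block search in step 1 can be illustrated with a minimal sketch. It uses plain DTW, not WFS-DTW: the linear time-aligned accumulated distance of step 2 and the WCD weighting of step 3 are not reproduced here. The function names, the `hop` parameter, and the frame distance (negative log of the inner product of two posterior vectors) are assumptions made for illustration only.

```python
import math

def frame_distance(p, q, eps=1e-10):
    """Distance between two posterior vectors: -log of their inner product."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))

def dtw_cost(query, block):
    """Length-normalized DTW cost between two sequences of posterior vectors."""
    n, m = len(query), len(block)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], block[j - 1])
            # Standard step pattern: vertical, horizontal, diagonal.
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)

def sequential_search(query, utterance, hop=1):
    """Slide a query-length block over the utterance.

    Returns (best_cost, start_frame) of the best-matching block.
    `hop` (hypothetical) is the number of frames between block starts.
    """
    n = len(query)
    best_cost, best_start = float("inf"), -1
    for start in range(0, len(utterance) - n + 1, hop):
        cost = dtw_cost(query, utterance[start:start + n])
        if cost < best_cost:
            best_cost, best_start = cost, start
    return best_cost, best_start
```

Every block costs a full DTW here, which is the expense that steps 2 and 3 are meant to reduce.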
System Overview diagram label: 500 detected candidates
This publication is the result of the Project implementation: University Science Park TECHNICOM for Innovation Applications Supported by
Knowledge Technology, ITMS: 26220220182, supported by the Research & Development Operational Programme funded by the ERDF (100%).
MediaEval 2014: Query by Example Search on Speech Task, 16-17 October 2014, Barcelona, Spain
System Overview diagram labels: Query, VAD, PCA-based, GMM-based, Posteriorgrams
Abstract
This paper presents two approaches to a query-by-example (QbE) retrieval system, proposed by the Technical University of Košice (TUKE)
for the Query by Example Search on Speech Task (QUESST). Our main interest was to build a QbE system able to retrieve all given
queries both with and without external speech resources. We therefore developed a posteriorgram-based keyword matching system that
uses a novel weighted fast sequential variant of DTW (WFS-DTW) to detect occurrences of each query within a particular utterance
file, combined with two GMM-based approaches to acoustic unit modeling. The first, referred to as the low-resource approach, employs
language-dependent phonetic decoders to convert queries and utterances into posteriorgrams. The second, referred to as the
zero-resource approach, combines unsupervised segmentation and clustering techniques and uses only the provided utterance files.
Results
System Overview
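system        ISF    SSF     PMUI   PMUS   PL
p-low (dev)   0.61   0.0034  0.05   2.46   0.010
g-zero (dev)  1.50   0.0042  1.40   3.92   0.225
(ISF: indexing speed factor; SSF: searching speed factor; PMUI/PMUS: peak memory usage for indexing/searching; PL: processing load)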
Conclusions and Future Work
System Overview diagram labels: Phonetic Decoders, Utterances

Zero-resource approaches:
Type 1
- 13 MFCC
- PCA-based feature selection
- K-means clustering (K=75)
Type 2
- 39 MFCC
- Type 1 + Viterbi segmentation & new GMM training
Type 3
- 39 MFCC
- flat start training (GMM-based)
- phone sequences from Type 1
Type 4
- 39 MFCC
- GMM-based segmentation (64 GM)
- EHMM (64 states / 256 GM)
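
Each acoustic unit model above is used to turn feature frames into a posteriorgram. As a rough sketch of what that means for a GMM, the code below computes, for every frame, the posterior probability of each mixture component. It assumes diagonal covariances; the function names and the parameter layout (`weights`, `means`, `variances`) are hypothetical and are not taken from the TUKE system.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def frame_posteriors(x, weights, means, variances):
    """Posterior probability of each GMM component given one frame."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum before exp for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def posteriorgram(frames, weights, means, variances):
    """One posterior vector per frame; each row sums to 1."""
    return [frame_posteriors(x, weights, means, variances) for x in frames]
```

Each row of the result is a probability distribution over acoustic units, which is the representation that a DTW-based matcher compares frame by frame.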
Low-resource approaches
WFS-DTW
Score normalization & Fusion
- scaling 0-1
- max-score merging fusion
- z-normalization
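
The three operations listed above can be sketched as follows. The order used here (0-1 scaling per subsystem, then max-score merging, then z-normalization of the fused scores) is an assumption based on how the list reads, and so is the idea that detections from different subsystems share a common key. All names are hypothetical.

```python
import statistics

def scale_01(scores):
    """Min-max scale one subsystem's scores (dict: detection key -> score)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (s - lo) / span if span else 0.0 for k, s in scores.items()}

def max_score_merge(systems):
    """Merge subsystems by keeping the highest score seen for each key."""
    fused = {}
    for scores in systems:
        for key, s in scores.items():
            fused[key] = max(s, fused.get(key, s))
    return fused

def z_normalize(scores):
    """Shift and scale scores to zero mean and unit variance."""
    mean = statistics.fmean(scores.values())
    std = statistics.pstdev(scores.values())
    if std == 0.0:
        return {k: 0.0 for k in scores}
    return {k: (s - mean) / std for k, s in scores.items()}

def fuse(systems):
    """Assumed pipeline: scale each subsystem, merge by max, z-normalize."""
    return z_normalize(max_score_merge([scale_01(s) for s in systems]))
```

Scaling before merging keeps a subsystem with a wider raw score range from dominating the maximum.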
- There are still large differences in performance between the p-low
and g-zero approaches, even when score fusion is applied.
- There is also a considerable gap between the act and min Cnxe,
even though the act and max TWV nearly coincide, i.e. the decision
threshold is well calibrated.
- Improved calibration/fusion models based on affine
transformation and linear regression will be investigated in
the future.
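
To make the last point concrete, here is a minimal sketch of an affine score transformation fitted by ordinary least squares. It only illustrates the general idea named above and is not the authors' planned model. The function names and the use of 0/1 development-set labels as regression targets are assumptions.

```python
def fit_affine(scores, labels):
    """Least-squares fit of labels ~ a * score + b; returns (a, b)."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(labels) / n
    var_s = sum((s - mean_s) ** 2 for s in scores)
    if var_s == 0.0:
        raise ValueError("scores must not all be equal")
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, labels))
    a = cov / var_s
    b = mean_y - a * mean_s
    return a, b

def calibrate(scores, a, b):
    """Apply the affine transformation to raw detection scores."""
    return [a * s + b for s in scores]
```

In such a setup, `a` and `b` would be estimated on development data with known hit/miss labels and then applied unchanged to evaluation scores.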
Indexing was done on 2x IBM x3650 (Intel E5530 @ 2.4 GHz, 8
cores) with 28 GB RAM, running Debian. The searching algorithm ran
on a cluster of 52x IBM dx360 M3 nodes (Intel E5645 @ 2.4 GHz, 624 cores)
with 48 GB RAM per node, running Scientific Linux 6 and Torque 2.5.13.
System Overview diagram labels: time-aligned & labelled segments; new GMM training
