DATA MINING IN
TIME SERIES DATABASES
SERIES IN MACHINE PERCEPTION AND ARTIFICIAL INTELLIGENCE*
Editors: H. Bunke (Univ. Bern, Switzerland)
P. S. P. Wang (Northeastern Univ., USA)
Vol. 43: Agent Engineering
(Eds. Jiming Liu, Ning Zhong, Yuan Y. Tang and Patrick S. P. Wang)
Vol. 44: Multispectral Image Processing and Pattern Recognition
(Eds. J. Shen, P. S. P. Wang and T. Zhang)
Vol. 45: Hidden Markov Models: Applications in Computer Vision
(Eds. H. Bunke and T. Caelli)
Vol. 46: Syntactic Pattern Recognition for Seismic Oil Exploration
(K. Y. Huang)
Vol. 47: Hybrid Methods in Pattern Recognition
(Eds. H. Bunke and A. Kandel)
Vol. 48: Multimodal Interface for Human-Machine Communications
(Eds. P. C. Yuen, Y. Y. Tang and P. S. P. Wang)
Vol. 49: Neural Networks and Systolic Array Design
(Eds. D. Zhang and S. K. Pal)
Vol. 50: Empirical Evaluation Methods in Computer Vision
(Eds. H. I. Christensen and P. J. Phillips)
Vol. 51: Automatic Diatom Identification
(Eds. H. du Buf and M. M. Bayer)
Vol. 52: Advances in Image Processing and Understanding
A Festschrift for Thomas S. Huang
(Eds. A. C. Bovik, C. W. Chen and D. Goldgof)
Vol. 53: Soft Computing Approach to Pattern Recognition and Image Processing
(Eds. A. Ghosh and S. K. Pal)
Vol. 54: Fundamentals of Robotics — Linking Perception to Action
(M. Xie)
Vol. 55: Web Document Analysis: Challenges and Opportunities
(Eds. A. Antonacopoulos and J. Hu)
Vol. 56: Artificial Intelligence Methods in Software Testing
(Eds. M. Last, A. Kandel and H. Bunke)
Vol. 57: Data Mining in Time Series Databases
(Eds. M. Last, A. Kandel and H. Bunke)
Vol. 58: Computational Web Intelligence: Intelligent Technology for
Web Applications
(Eds. Y. Zhang, A. Kandel, T. Y. Lin and Y. Yao)
Vol. 59: Fuzzy Neural Network Theory and Application
(P. Liu and H. Li)
*For the complete list of titles in this series, please write to the Publisher.
Series in Machine Perception and Artificial Intelligence — Vol. 57
DATA MINING IN
TIME SERIES DATABASES
Editors
Mark Last
Ben-Gurion University of the Negev, Israel
Abraham Kandel
Tel-Aviv University, Israel
University of South Florida, Tampa, USA
Horst Bunke
University of Bern, Switzerland
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN 981-238-290-9
Typeset by Stallion Press
Email: enquiries@stallionpress.com
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.
Copyright © 2004 by World Scientific Publishing Co. Pte. Ltd.
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Printed in Singapore by World Scientific Printers (S) Pte Ltd
Dedicated to
The Honorable Congressman C. W. Bill Young
House of Representatives
For his vision and continuous support in creating the National Institute
for Systems Test and Productivity at the Computer Science and
Engineering Department, University of South Florida
Preface
Traditional data mining methods are designed to deal with “static”
databases, i.e. databases where the ordering of records (or other database
objects) has nothing to do with the patterns of interest. Though the assump-
tion of order irrelevance may be sufficiently accurate in some applications,
there are certainly many other cases, where sequential information, such as
a time-stamp associated with every record, can significantly enhance our
knowledge about the mined data. One example is a series of stock values:
a specific closing price recorded yesterday has a completely different mean-
ing than the same value a year ago. Since most of today’s databases already
include temporal data in the form of “date created”, “date modified”, and
other time-related fields, the only problem is how to exploit this valuable
information to our benefit. In other words, the question we are currently
facing is: How to mine time series data?
The purpose of this volume is to present some recent advances in pre-
processing, mining, and interpretation of temporal data that is stored by
modern information systems. Adding the time dimension to a database
produces a Time Series Database (TSDB) and introduces new aspects and
challenges to the tasks of data mining and knowledge discovery. These new
challenges include: finding the most efficient representation of time series
data, measuring similarity of time series, detecting change points in time
series, and time series classification and clustering. Some of these problems
have been treated in the past by experts in time series analysis. However,
statistical methods of time series analysis are focused on sequences of values
representing a single numeric variable (e.g., price of a specific stock). In a
real-world database, a time-stamped record may include several numerical
and nominal attributes, which may depend not only on the time dimension
but also on each other. To make the data mining task even more com-
plicated, the objects in a time series may represent some complex graph
structures rather than vectors of feature-values.
Our book covers the state-of-the-art research in several areas of time
series data mining. Specific problems challenged by the authors of this
volume are as follows.
Representation of Time Series. Efficient and effective representation
of time series is a key to successful discovery of time-related patterns.
The most frequently used representation of single-variable time series is
piecewise linear approximation, where the original points are reduced to
a set of straight lines (“segments”). Chapter 1 by Eamonn Keogh, Selina
Chu, David Hart, and Michael Pazzani provides an extensive and compar-
ative overview of existing techniques for time series segmentation. In the
view of shortcomings of existing approaches, the same chapter introduces
an improved segmentation algorithm called SWAB (Sliding Window and
Bottom-up).
Indexing and Retrieval of Time Series. Since each time series is char-
acterized by a large, potentially unlimited number of points, finding two
identical time series for any phenomenon is hopeless. Thus, researchers have
been looking for sets of similar data sequences that differ only slightly from
each other. The problem of retrieving similar series arises in many areas such
as marketing and stock data analysis, meteorological studies, and medical
diagnosis. An overview of current methods for efficient retrieval of time
series is presented in Chapter 2 by Magnus Lie Hetland. Chapter 3 (by
Eugene Fink and Kevin B. Pratt) presents a new method for fast compres-
sion and indexing of time series. A robust similarity measure for retrieval of
noisy time series is described and evaluated by Michail Vlachos, Dimitrios
Gunopulos, and Gautam Das in Chapter 4.
Change Detection in Time Series. The problem of change point detec-
tion in a sequence of values has been studied in the past, especially in the
context of time series segmentation (see above). However, the nature of
real-world time series may be much more complex, involving multivariate
and even graph data. Chapter 5 (by Gil Zeira, Oded Maimon, Mark Last,
and Lior Rokach) covers the problem of change detection in a classification
model induced by a data mining algorithm from time series data. A change
detection procedure for detecting abnormal events in time series of graphs
is presented by Horst Bunke and Miro Kraetzl in Chapter 6. The procedure
is applied to abnormal event detection in a computer network.
Classification of Time Series. Rather than partitioning a time series
into segments, one can see each time series, or any other sequence of data
points, as a single object. Classification and clustering of such complex
“objects” may be particularly beneficial for the areas of process con-
trol, intrusion detection, and character recognition. In Chapter 7, Carlos
J. Alonso González and Juan J. Rodríguez Diez present a new method for
early classification of multivariate time series. Their method is capable of
learning from series of variable length and of providing a classification
when only part of the series is presented to the classifier. A novel concept of
representing time series by median strings (see Chapter 8, by Xiaoyi Jiang,
Horst Bunke, and Janos Csirik) opens new opportunities for applying clas-
sification and clustering methods of data mining to sequential data.
As indicated above, the area of mining time series databases still
includes many unexplored and insufficiently explored issues. Specific sug-
gestions for future research can be found in individual chapters. In general,
we believe that interesting and useful results can be obtained by applying
the methods described in this book to real-world sets of sequential data.
Acknowledgments
The preparation of this volume was partially supported by the National
Institute for Systems Test and Productivity at the University of South
Florida under U.S. Space and Naval Warfare Systems Command grant num-
ber N00039-01-1-2248.
We also would like to acknowledge the generous support and cooperation
of: Ben-Gurion University of the Negev, Department of Information Sys-
tems Engineering, University of South Florida, Department of Computer
Science and Engineering, Tel-Aviv University, College of Engineering, The
Fulbright Foundation, The US-Israel Educational Foundation.
January 2004 Mark Last
Abraham Kandel
Horst Bunke
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Chapter 1 Segmenting Time Series: A Survey
and Novel Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
E. Keogh, S. Chu, D. Hart and M. Pazzani
Chapter 2 A Survey of Recent Methods for Efficient
Retrieval of Similar Time Sequences. . . . . . . . . . . . . . . . 23
M. L. Hetland
Chapter 3 Indexing of Compressed Time Series . . . . . . . . . . . . . . . 43
E. Fink and K. B. Pratt
Chapter 4 Indexing Time-Series under Conditions of Noise. . . . 67
M. Vlachos, D. Gunopulos and G. Das
Chapter 5 Change Detection in Classification Models
Induced from Time Series Data . . . . . . . . . . . . . . . . . . . . 101
G. Zeira, O. Maimon, M. Last and L. Rokach
Chapter 6 Classification and Detection of
Abnormal Events in Time Series of Graphs. . . . . . . . .127
H. Bunke and M. Kraetzl
Chapter 7 Boosting Interval-Based Literals:
Variable Length and Early Classification . . . . . . . . . . . 149
C. J. Alonso González and J. J. Rodríguez Diez
Chapter 8 Median Strings: A Review . . . . . . . . . . . . . . . . . . . . . . . . .173
X. Jiang, H. Bunke and J. Csirik
CHAPTER 1
SEGMENTING TIME SERIES: A SURVEY AND
NOVEL APPROACH
Eamonn Keogh
Computer Science & Engineering Department, University of California —
Riverside, Riverside, California 92521, USA
E-mail: eamonn@cs.ucr.edu
Selina Chu, David Hart, and Michael Pazzani
Department of Information and Computer Science, University of California,
Irvine, California 92697, USA
E-mail: {selina, dhart, pazzani}@ics.uci.edu
In recent years, there has been an explosion of interest in mining time
series databases. As with most computer science problems, representa-
tion of the data is the key to efficient and effective solutions. One of the
most commonly used representations is piecewise linear approximation.
This representation has been used by various researchers to support clus-
tering, classification, indexing and association rule mining of time series
data. A variety of algorithms have been proposed to obtain this represen-
tation, with several algorithms having been independently rediscovered
several times. In this chapter, we undertake the first extensive review
and empirical comparison of all proposed techniques. We show that all
these algorithms have fatal flaws from a data mining perspective. We
introduce a novel algorithm that we empirically show to be superior to
all others in the literature.
Keywords: Time series; data mining; piecewise linear approximation;
segmentation; regression.
1. Introduction
In recent years, there has been an explosion of interest in mining time
series databases. As with most computer science problems, representation
of the data is the key to efficient and effective solutions. Several high level
Fig. 1. Two time series and their piecewise linear representation. (a) Space Shuttle
Telemetry. (b) Electrocardiogram (ECG).
representations of time series have been proposed, including Fourier Trans-
forms [Agrawal et al. (1993), Keogh et al. (2000)], Wavelets [Chan and Fu
(1999)], Symbolic Mappings [Agrawal et al. (1995), Das et al. (1998), Perng
et al. (2000)] and Piecewise Linear Representation (PLR). In this work,
we confine our attention to PLR, perhaps the most frequently used repre-
sentation [Ge and Smyth (2001), Last et al. (2001), Hunter and McIntosh
(1999), Koski et al. (1995), Keogh and Pazzani (1998), Keogh and Pazzani
(1999), Keogh and Smyth (1997), Lavrenko et al. (2000), Li et al. (1998),
Osaki et al. (1999), Park et al. (2001), Park et al. (1999), Qu et al. (1998),
Shatkay (1995), Shatkay and Zdonik (1996), Vullings et al. (1997), Wang
and Wang (2000)].
Intuitively, Piecewise Linear Representation refers to the approximation
of a time series T, of length n, with K straight lines (hereafter known as
segments). Figure 1 contains two examples. Because K is typically much
smaller than n, this representation makes the storage, transmission and
computation of the data more efficient. Specifically, in the context of data
mining, the piecewise linear representation has been used to:
• Support fast exact similarity search [Keogh et al. (2000)].
• Support novel distance measures for time series, including “fuzzy queries”
[Shatkay (1995), Shatkay and Zdonik (1996)], weighted queries [Keogh
and Pazzani (1998)], multiresolution queries [Wang and Wang (2000),
Li et al. (1998)], dynamic time warping [Park et al. (1999)] and relevance
feedback [Keogh and Pazzani (1999)].
• Support concurrent mining of text and time series [Lavrenko et al.
(2000)].
• Support novel clustering and classification algorithms [Keogh and
Pazzani (1998)].
• Support change point detection [Sugiura and Ogden (1994), Ge and
Smyth (2001)].
Surprisingly, in spite of the ubiquity of this representation, with the
exception of [Shatkay (1995)], there has been little attempt to understand
and compare the algorithms that produce it. Indeed, there does not even
appear to be a consensus on what to call such an algorithm. For clarity, we
will refer to these types of algorithm, which input a time series and return
a piecewise linear representation, as segmentation algorithms.
The segmentation problem can be framed in several ways.
• Given a time series T, produce the best representation using only K
segments.
• Given a time series T, produce the best representation such that the maxi-
mum error for any segment does not exceed some user-specified threshold,
max_error.
• Given a time series T, produce the best representation such that the
combined error of all segments is less than some user-specified threshold,
total_max_error.
As we shall see in later sections, not all algorithms can support all these
specifications.
Segmentation algorithms can also be classified as batch or online. This is
an important distinction because many data mining problems are inherently
dynamic [Vullings et al. (1997), Koski et al. (1995)].
Data mining researchers, who needed to produce a piecewise linear
approximation, have typically either independently rediscovered an algo-
rithm or used an approach suggested in related literature. For example,
from the fields of cartography or computer graphics [Douglas and Peucker
(1973), Heckbert and Garland (1997), Ramer (1972)].
In this chapter, we review the three major segmentation approaches
in the literature and provide an extensive empirical evaluation on a very
heterogeneous collection of datasets from finance, medicine, manufacturing
and science. The major result of these experiments is that only online algo-
rithm in the literature produces very poor approximations of the data, and
that the only algorithm that consistently produces high quality results and
scales linearly in the size of the data is a batch algorithm. These results
motivated us to introduce a new online algorithm that scales linearly in the
size of the data set, is online, and produces high quality approximations.
The rest of the chapter is organized as follows. In Section 2, we provide
an extensive review of the algorithms in the literature. We explain the basic
approaches, and the various modifications and extensions by data miners. In
Section 3, we provide a detailed empirical comparison of all the algorithms.
We will show that the most popular algorithms used by data miners can in
fact produce very poor approximations of the data. The results will be used
to motivate the need for a new algorithm that we will introduce and validate
in Section 4. Section 5 offers conclusions and directions for future work.
2. Background and Related Work
In this section, we describe the three major approaches to time series seg-
mentation in detail. Almost all the algorithms have 2 and 3 dimensional
analogues, which ironically seem to be better understood. A discussion of
the higher dimensional cases is beyond the scope of this chapter. We refer
the interested reader to [Heckbert and Garland (1997)], which contains an
excellent survey.
Although appearing under different names and with slightly different
implementation details, most time series segmentation algorithms can be
grouped into one of the following three categories:
• Sliding Windows: A segment is grown until it exceeds some error bound.
The process repeats with the next data point not included in the newly
approximated segment.
• Top-Down: The time series is recursively partitioned until some stopping
criterion is met.
• Bottom-Up: Starting from the finest possible approximation, segments
are merged until some stopping criterion is met.
Table 1 contains the notation used in this chapter.
Table 1. Notation.

T                     A time series in the form t1, t2, . . . , tn
T[a : b]              The subsection of T from a to b, ta, ta+1, . . . , tb
Seg_TS                A piecewise linear approximation of a time series of length n
                      with K segments. Individual segments can be addressed with
                      Seg_TS(i).
create_segment(T)     A function that takes in a time series and returns a linear
                      segment approximation of it.
calculate_error(T)    A function that takes in a time series and returns the
                      approximation error of the linear segment approximation of it.
Given that we are going to approximate a time series with straight lines,
there are at least two ways we can find the approximating line.
• Linear Interpolation: Here the approximating line for the subsequence
T[a : b] is simply the line connecting ta and tb. This can be obtained in
constant time.
• Linear Regression: Here the approximating line for the subsequence
T[a : b] is taken to be the best fitting line in the least squares sense
[Shatkay (1995)]. This can be obtained in time linear in the length of
segment.
The two techniques are illustrated in Figure 2. Linear interpolation
tends to closely align the endpoint of consecutive segments, giving the piece-
wise approximation a “smooth” look. In contrast, piecewise linear regression
can produce a very disjointed look on some datasets. The aesthetic superi-
ority of linear interpolation, together with its low computational complex-
ity has made it the technique of choice in computer graphic applications
[Heckbert and Garland (1997)]. However, the quality of the approximating
line, in terms of Euclidean distance, is generally inferior to the regression
approach.
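To make the two options concrete, here is a small sketch (our own illustration, not code from the chapter; the function names are ours, and the line is returned as a (slope, intercept) pair over the index axis):

```python
def interpolation_line(t, a, b):
    # Line through the two endpoints (a, t[a]) and (b, t[b]); constant time.
    slope = (t[b] - t[a]) / (b - a)
    return slope, t[a] - slope * a

def regression_line(t, a, b):
    # Least-squares best-fit line over t[a..b]; time linear in segment length.
    xs = list(range(a, b + 1))
    ys = t[a:b + 1]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx
```

On perfectly linear data the two lines coincide; on noisy data the regression line has lower residual error, but it need not pass through the segment endpoints, which is what produces the disjointed look described above.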
In this chapter, we deliberately keep our descriptions of algorithms at a
high level, so that either technique can be imagined as the approximation
technique. In particular, the pseudocode function create_segment(T) can
be imagined as using interpolation, regression or any other technique.
All segmentation algorithms also need some method to evaluate the
quality of fit for a potential segment. A measure commonly used in conjunc-
tion with linear regression is the sum of squares, or the residual error. This is
calculated by taking all the vertical differences between the best-fit line and
the actual data points, squaring them and then summing them together.
Another commonly used measure of goodness of fit is the distance between
the best fit line and the data point furthest away in the vertical direction
Linear
Interpolation
Linear
Regression
Fig. 2. Two 10-segment approximations of electrocardiogram data. The approxima-
tion created using linear interpolation has a smooth aesthetically appealing appearance
because all the endpoints of the segments are aligned. Linear regression, in contrast, pro-
duces a slightly disjointed appearance but a tighter approximation in terms of residual
error.
(i.e. the L∞ norm between the line and the data). As before, we have
kept our descriptions of the algorithms general enough to encompass any
error measure. In particular, the pseudocode function calculate_error(T)
can be imagined as using any sum of squares, furthest point, or any other
measure.
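Both measures are straightforward to compute. A small sketch (our own, not from the chapter; the fitted line is passed in as a (slope, intercept) pair over x = 0, 1, . . .):

```python
def residual_error(t, line):
    # Sum of squared vertical differences between the fitted line and the data.
    slope, intercept = line
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(t))

def furthest_point_error(t, line):
    # L-infinity norm: the single largest vertical deviation from the line.
    slope, intercept = line
    return max(abs(y - (slope * x + intercept)) for x, y in enumerate(t))
```

The residual error penalizes every deviation, while the furthest-point error only bounds the worst one, so the two can rank candidate segments differently.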
2.1. The Sliding Window Algorithm
The Sliding Window algorithm works by anchoring the left point of a poten-
tial segment at the first data point of a time series, then attempting to
approximate the data to the right with increasingly longer segments. At some
point i, the error for the potential segment is greater than the user-specified
threshold, so the subsequence from the anchor to i − 1 is transformed into
a segment. The anchor is moved to location i, and the process repeats until
the entire time series has been transformed into a piecewise linear approx-
imation. The pseudocode for the algorithm is shown in Table 2.
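This anchoring-and-growing loop can be sketched in runnable form (our own 0-based simplification, not the authors' code, assuming a least-squares fit and residual error; segments are returned as inclusive (start, end) index pairs that tile the series):

```python
def calculate_error(t):
    # Residual error of the least-squares line through the subsequence t.
    n = len(t)
    mx = (n - 1) / 2.0                      # mean of x = 0 .. n-1
    my = sum(t) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(t))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(t))

def sliding_window(t, max_error):
    # Grow a window from the anchor until the error bound is exceeded,
    # then emit the last window that was still under the bound.
    segments, anchor, n = [], 0, len(t)
    while anchor < n - 1:
        i = 2
        while anchor + i <= n and calculate_error(t[anchor:anchor + i]) < max_error:
            i += 1
        segments.append((anchor, anchor + i - 2))
        anchor += i - 1
    return segments
```

On a series with a sharp corner, the error of the growing window jumps as soon as the corner is included, so a new segment starts there.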
The Sliding Window algorithm is attractive because of its great sim-
plicity, intuitiveness and particularly the fact that it is an online algorithm.
Several variations and optimizations of the basic algorithm have been pro-
posed. Koski et al. noted that on ECG data it is possible to speed up the
algorithm by incrementing the variable i by “leaps of length k” instead of
1. For k = 15 (at 400 Hz), the algorithm is 15 times faster with little effect
on the output accuracy [Koski et al. (1995)].
Depending on the error measure used, there may be other optimizations
possible. Vullings et al. noted that since the residual error is monotonically
non-decreasing with the addition of more data points, one does not have
to test every value of i from 2 to the final chosen value [Vullings et al.
(1997)]. They suggest initially setting i to s, where s is the mean length
of the previous segments. If the guess was pessimistic (the measured error
Table 2. The generic Sliding Window algorithm.

Algorithm Seg_TS = Sliding_Window(T, max_error)
    anchor = 1;
    while not finished segmenting time series
        i = 2;
        while calculate_error(T[anchor: anchor + i]) < max_error
            i = i + 1;
        end;
        Seg_TS = concat(Seg_TS, create_segment(T[anchor: anchor + (i - 1)]));
        anchor = anchor + i;
    end;
is still less than max_error) then the algorithm continues to increment i
as in the classic algorithm. Otherwise they begin to decrement i until the
measured error is less than max_error. This optimization can greatly speed
up the algorithm if the mean length of segments is large in relation to
the standard deviation of their length. The monotonically non-decreasing
property of residual error also allows binary search for the length of the
segment. Surprisingly, no one we are aware of has suggested this.
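Such a binary search might look as follows (our own illustration, not from the chapter; it relies on the monotonically non-decreasing residual error, and each probe still costs time linear in the window unless running sums are maintained):

```python
def window_error(t, anchor, length):
    # Residual error of the least-squares line over `length` points from anchor.
    seg = t[anchor:anchor + length]
    mx = (length - 1) / 2.0
    my = sum(seg) / length
    sxx = sum((x - mx) ** 2 for x in range(length))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(seg))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(seg))

def longest_segment(t, anchor, max_error):
    # Binary search for the largest window length whose error stays below
    # max_error, instead of growing the window one point at a time.
    lo, hi = 2, len(t) - anchor      # a two-point window always fits exactly
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if window_error(t, anchor, mid) < max_error:
            lo = mid
        else:
            hi = mid - 1
    return lo
```

This replaces O(n) error evaluations per segment with O(log n), at the price of assuming the monotonicity property holds for the chosen error measure.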
The Sliding Window algorithm can give pathologically poor results
under some circumstances, particularly if the time series in question con-
tains abrupt level changes. Most researchers have not reported this [Qu
et al. (1998), Wang and Wang (2000)], perhaps because they tested the
algorithm on stock market data, and its relative performance is best on
noisy data. Shatkay (1995), in contrast, does notice the problem and gives
elegant examples and explanations [Shatkay (1995)]. They consider three
variants of the basic algorithm, each designed to be robust to a certain
case, but they underline the difficulty of producing a single variant of the
algorithm that is robust to arbitrary data sources.
Park et al. (2001) suggested modifying the algorithm to create “mono-
tonically changing” segments [Park et al. (2001)]. That is, all segments consist
of data points of the form t1 ≤ t2 ≤ · · · ≤ tn or t1 ≥ t2 ≥ · · · ≥ tn.
This modification worked well on the smooth synthetic dataset it was
demonstrated on. But on real world datasets with any amount of noise,
the approximation is greatly overfragmented.
Variations on the Sliding Window algorithm are particularly popular
with the medical community (where it is known as FAN or SAPA), since
patient monitoring is inherently an online task [Ishijima et al. (1983), Koski
et al. (1995), McKee et al. (1994), Vullings et al. (1997)].
2.2. The Top-Down Algorithm
The Top-Down algorithm works by considering every possible partitioning
of the times series and splitting it at the best location. Both subsections
are then tested to see if their approximation error is below some user-
specified threshold. If not, the algorithm recursively continues to split the
subsequences until all the segments have approximation errors below the
threshold. The pseudocode for the algorithm is shown in Table 3.
Variations on the Top-Down algorithm (including the 2-dimensional
case) were independently introduced in several fields in the early 1970’s.
In cartography, it is known as the Douglas-Peucker algorithm [Douglas and
Table 3. The generic Top-Down algorithm.

Algorithm Seg_TS = Top_Down(T, max_error)
    best_so_far = inf;
    for i = 2 to length(T) - 2            // Find the best splitting point.
        improvement_in_approximation = improvement_splitting_here(T, i);
        if improvement_in_approximation < best_so_far
            breakpoint = i;
            best_so_far = improvement_in_approximation;
        end;
    end;
    // Recursively split the left segment if necessary.
    if calculate_error(T[1: breakpoint]) > max_error
        Seg_TS = Top_Down(T[1: breakpoint]);
    end;
    // Recursively split the right segment if necessary.
    if calculate_error(T[breakpoint + 1: length(T)]) > max_error
        Seg_TS = Top_Down(T[breakpoint + 1: length(T)]);
    end;
Peucker (1973)]; in image processing, it is known as Ramer’s algorithm
[Ramer (1972)]. Most researchers in the machine learning/data mining com-
munity are introduced to the algorithm in the classic textbook by Duda and
Hart, which calls it “Iterative End-Points Fits” [Duda and Hart (1973)].
In the data mining community, the algorithm has been used by [Li et al.
(1998)] to support a framework for mining sequence databases at multiple
abstraction levels. Shatkay and Zdonik use it (after considering alternatives
such as Sliding Windows) to support approximate queries in time series
databases [Shatkay and Zdonik (1996)].
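The recursion of Table 3 can be sketched in runnable form (again our own simplification: least-squares residual error, 0-based indices, and pieces shorter than four points are accepted as-is, since they cannot be split into two two-point halves):

```python
def calculate_error(t):
    # Residual error of the least-squares line through the subsequence t.
    n = len(t)
    mx = (n - 1) / 2.0
    my = sum(t) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(t))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(t))

def top_down(t, max_error, offset=0):
    # Split at the breakpoint giving the lowest combined error, then recurse
    # into any half whose error is still above the threshold.
    n = len(t)
    if n < 4 or calculate_error(t) < max_error:
        return [(offset, offset + n - 1)]
    best_i = min(range(2, n - 1),
                 key=lambda i: calculate_error(t[:i]) + calculate_error(t[i:]))
    return (top_down(t[:best_i], max_error, offset) +
            top_down(t[best_i:], max_error, offset + best_i))
```

Evaluating every candidate breakpoint at every level is what makes the naive Top-Down approach expensive relative to the other two families.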
Park et al. introduced a modification where they first perform a scan
over the entire dataset marking every peak and valley [Park et al. (1999)].
These extreme points are used to create an initial segmentation, and the Top-
Down algorithm is applied to each of the segments (in case the error on an
individual segment was still too high). They then use the segmentation to
support a special case of dynamic time warping. This modification worked
well on the smooth synthetic dataset it was demonstrated on. But on real
world data sets with any amount of noise, the approximation is greatly
overfragmented.
Lavrenko et al. use the Top-Down algorithm to support the concurrent
mining of text and time series [Lavrenko et al. (2000)]. They attempt to
discover the influence of news stories on financial markets. Their algorithm
contains some interesting modifications including a novel stopping criteria
based on the t-test.
Finally Smyth and Ge use the algorithm to produce a representation
that can support a Hidden Markov Model approach to both change point
detection and pattern matching [Ge and Smyth (2001)].
2.3. The Bottom-Up Algorithm
The Bottom-Up algorithm is the natural complement to the Top-Down
algorithm. The algorithm begins by creating the finest possible approxima-
tion of the time series, so that n/2 segments are used to approximate the n-
length time series. Next, the cost of merging each pair of adjacent segments
is calculated, and the algorithm begins to iteratively merge the lowest cost
pair until a stopping criteria is met. When the pair of adjacent segments i
and i + 1 are merged, the algorithm needs to perform some bookkeeping.
First, the cost of merging the new segment with its right neighbor must be
calculated. In addition, the cost of merging the i − 1 segments with its new
larger neighbor must be recalculated. The pseudocode for the algorithm is
shown in Table 4.
Two and three-dimensional analogues of this algorithm are common in
the field of computer graphics where they are called decimation methods
[Heckbert and Garland (1997)]. In data mining, the algorithm has been
used extensively by two of the current authors to support a variety of time
series data mining tasks [Keogh and Pazzani (1999), Keogh and Pazzani
(1998), Keogh and Smyth (1997)]. In medicine, the algorithm was used
by Hunter and McIntosh to provide the high level representation for their
medical pattern matching system [Hunter and McIntosh (1999)].
Table 4. The generic Bottom-Up algorithm.

Algorithm Seg_TS = Bottom_Up(T, max_error)
    for i = 1 : 2 : length(T)               // Create initial fine approximation.
        Seg_TS = concat(Seg_TS, create_segment(T[i: i + 1]));
    end;
    for i = 1 : length(Seg_TS) - 1          // Find merging costs.
        merge_cost(i) = calculate_error([merge(Seg_TS(i), Seg_TS(i + 1))]);
    end;
    while min(merge_cost) < max_error       // While not finished.
        p = argmin(merge_cost);             // Find index of ‘‘cheapest’’ pair to merge.
        Seg_TS(p) = merge(Seg_TS(p), Seg_TS(p + 1));   // Merge them.
        delete(Seg_TS(p + 1));              // Update records.
        merge_cost(p) = calculate_error(merge(Seg_TS(p), Seg_TS(p + 1)));
        merge_cost(p - 1) = calculate_error(merge(Seg_TS(p - 1), Seg_TS(p)));
    end;
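Table 4 translates almost line-for-line into Python. The sketch below is an illustration under stated assumptions (linear-regression segments, sum-of-squares error, and a linear scan for the cheapest pair in place of a heap):

```python
def sse_of_line(t):
    """Sum-of-squares residual of the best least-squares line through t."""
    n = len(t)
    mx, my = (n - 1) / 2, sum(t) / n
    denom = sum((i - mx) ** 2 for i in range(n))
    slope = 0.0 if denom == 0 else sum(
        (i - mx) * (y - my) for i, y in enumerate(t)) / denom
    return sum((y - (my + slope * (i - mx))) ** 2 for i, y in enumerate(t))

def bottom_up(t, max_error):
    # Finest possible approximation: one segment per pair of points.
    segs = [list(t[i:i + 2]) for i in range(0, len(t), 2)]
    # costs[i] = error of the line through segments i and i + 1 combined.
    costs = [sse_of_line(segs[i] + segs[i + 1]) for i in range(len(segs) - 1)]
    while costs and min(costs) < max_error:
        p = costs.index(min(costs))          # Index of the cheapest pair.
        segs[p] = segs[p] + segs.pop(p + 1)  # Merge segments p and p + 1.
        costs.pop(p)                         # Bookkeeping: drop the used cost,
        if p < len(segs) - 1:                # recost against the right neighbor,
            costs[p] = sse_of_line(segs[p] + segs[p + 1])
        if p > 0:                            # and against the left neighbor.
            costs[p - 1] = sse_of_line(segs[p - 1] + segs[p])
    return segs
```

The list bookkeeping mirrors the pseudocode's update of merge_cost(p) and merge_cost(p − 1); a production version would keep the costs in a heap to obtain the O(log n) updates analyzed in Section 2.4.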
2.4. Feature Comparison of the Major Algorithms
We have deliberately deferred the discussion of the running times of the
algorithms until now, when the reader’s intuition for the various approaches
is more developed. The running time for each approach is data dependent.
For that reason, we discuss both a worst-case time that gives an upper
bound and a best-case time that gives a lower bound for each approach.
We use the standard notation of Ω(f(n)) for a lower bound, O(f(n)) for
an upper bound, and θ(f(n)) for a function that is both a lower and upper
bound.
Definitions and Assumptions. The number of data points is n, the
number of segments we plan to create is K, and thus the average segment
length is L = n/K. The actual length of segments created by an algorithm
varies and we will refer to the lengths as Li.
All algorithms, except Top-Down, perform considerably worse if we allow
any of the Li to become very large (say n/4), so we assume that the algo-
rithms limit the maximum segment length to some multiple of the average length.
It is trivial to code the algorithms to enforce this, so the time analysis that
follows is exact when the algorithm includes this limit. Empirical results
show, however, that the segments generated (with no limit on length) are
tightly clustered around the average length, so this limit has little effect in
practice.
We assume that for each set S of points, we compute a best segment
and compute the error in θ(n) time. This reflects the way these algorithms
are coded in practice, which is to use a packaged algorithm or function to
do linear regression. We note, however, that we believe one can produce
asymptotically faster algorithms if one custom codes linear regression (or
other best fit algorithms) to reuse computed values so that the computation
is done in less than O(n) time in subsequent steps. We leave that as a topic
for future work. In what follows, all computations of best segment and error
are assumed to be θ(n).
Top-Down. The best time for Top-Down occurs if each split occurs at
the midpoint of the data. The first iteration computes, for each split point
i, the best line for points [1, i] and for points [i + 1, n]. This takes θ(n) for
each split point, or θ(n²) total for all split points. The next iteration finds
split points for [1, n/2] and for [n/2 + 1, n]. This gives a recurrence T(n) =
2T(n/2) + θ(n²), where we have T(2) = c, and this solves to T(n) = Ω(n²).
This is a lower bound because we assumed the data has the best possible
split points.
The worst time occurs if the computed split point is always at one side
(leaving just 2 points on one side), rather than the middle. The recurrence
is T(n) = T(n − 2) + θ(n²). We must stop after K iterations, giving a time
of O(n²K).
Sliding Windows. For this algorithm, we compute best segments for
larger and larger windows, going from 2 up to at most cL (by the assumption
we discussed above). The maximum time to compute a single segment is
Σ_{i=2}^{cL} θ(i) = θ(L²). The number of segments can be as few as n/cL = K/c
or as many as K. The time is thus θ(L²K) = θ(Ln). This is both a best
case and worst case bound.
Bottom-Up. The first iteration computes the segment through each
pair of points and the costs of merging adjacent segments. This is easily
seen to take O(n) time. In the following iterations, we look up the minimum-
error pair i and i + 1 to merge; merge the pair into a new segment Snew;
delete from a heap (keeping track of costs is best done with a heap) the
costs of merging segments i−1 and i and merging segments i+1 and i+2;
compute the costs of merging Snew with Si−1 and with Si+2; and insert
these costs into our heap of costs. The time to look up the best cost is θ(1)
and the time to add and delete costs from the heap is O(log n). (The time
to construct the heap is O(n).)
In the best case, the merged segments always have about equal length,
and the final segments have length L. The time to merge a set of length-2
segments into half as many segments (on the way to one length-L segment)
is θ(L), for the time to compute the best segment for every pair of merged
segments, not counting heap operations. Each iteration takes the same time;
repeating θ(log L) times yields segments of size L.
The number of length-L segments we produce is K, so the total
time is Ω(KL log L) = Ω(n log(n/K)). The heap operations may take as
much as O(n log n). For a lower bound we have proven just Ω(n log(n/K)).
In the worst case, the merges always involve a short and a long segment,
and the final segments are mostly of length cL. The time to compute the
cost of merging a length-2 segment with a length-i segment is θ(i), and the
time to reach a length-cL segment is Σ_{i=2}^{cL} θ(i) = θ(L²). There are at most
n/cL such segments to compute, so the time is n/cL × θ(L²) = O(Ln).
(Time for heap operations is inconsequential.) This complexity study is
summarized in Table 5.
In addition to the time complexity there are other features a practitioner
might consider when choosing an algorithm. First there is the question of
Table 5. A feature summary for the 3 major algorithms.

Algorithm         User can specify¹    Online    Complexity
Top-Down          E, ME, K             No        O(n²K)
Bottom-Up         E, ME, K             No        O(Ln)
Sliding Window    E                    Yes       O(Ln)

¹KEY: E → maximum error for a given segment, ME →
maximum error for the entire time series, K → number of
segments.
whether the algorithm is online or batch. Secondly, there is the question
of how the user can specify the quality of desired approximation. With
trivial modifications the Bottom-Up algorithm allows the user to specify
the desired value of K, the maximum error per segment, or total error
of the approximation. A (non-recursive) implementation of Top-Down can
also be made to support all three options. However Sliding Window only
allows the maximum error per segment to be specified.
3. Empirical Comparison of the Major
Segmentation Algorithms
In this section, we provide an extensive empirical comparison of the
three major algorithms. It is possible to create artificial datasets that allow
one of the algorithms to achieve zero error (by any measure), but force
the other two approaches to produce arbitrarily poor approximations. In
contrast, testing on purely random data forces all the algorithms to pro-
duce essentially the same results. To overcome the potential for biased
results, we tested the algorithms on a very diverse collection of datasets.
These datasets were chosen to represent the extremes along the fol-
lowing dimensions: stationary/non-stationary, noisy/smooth, cyclical/non-
cyclical, symmetric/asymmetric, etc. In addition, the datasets represent
the diverse areas in which data miners apply their algorithms, includ-
ing finance, medicine, manufacturing and science. Figure 3 illustrates the
10 datasets used in the experiments.
3.1. Experimental Methodology
For simplicity and brevity, we only include the linear regression versions
of the algorithms in our study. Since linear regression minimizes the sum
of squares error, it also minimizes the Euclidean distance (the Euclidean
Fig. 3. The 10 datasets used in the experiments. (i) Radio Waves. (ii) Exchange
Rates. (iii) Tickwise II. (iv) Tickwise I. (v) Water Level. (vi) Manufacturing. (vii) ECG.
(viii) Noisy Sine Cubed. (ix) Sine Cubed. (x) Space Shuttle.
distance is just the square root of the sum of squares). Euclidean dis-
tance, or some measure derived from it, is by far the most common metric
used in data mining of time series [Agrawal et al. (1993), Agrawal et al.
(1995), Chan and Fu (1999), Das et al. (1998), Keogh et al. (2000), Keogh
and Pazzani (1999), Keogh and Pazzani (1998), Keogh and Smyth (1997),
Qu et al. (1998), Wang and Wang (2000)]. The linear interpolation ver-
sions of the algorithms, by definition, will always have a greater sum of
squares error.
We immediately encounter a problem when attempting to compare the
algorithms. We cannot compare them for a fixed number of segments, since
Sliding Windows does not allow one to specify the number of segments.
Instead we give each of the algorithms a fixed max error and measure the
total error of the entire piecewise approximation.
The performance of the algorithms depends on the value of max error.
As max error goes to zero all the algorithms have the same performance,
since they would produce n/2 segments with no error. At the opposite end,
as max error becomes very large, the algorithms once again will all have
the same performance, since they all simply approximate T with a single
best-fit line. Instead, we must test the relative performance at some rea-
sonable value of max_error, a value that achieves a good trade-off between
compression and fidelity. Because this “reasonable value” is subjective and
dependent on the data mining application and the data itself, we did the fol-
lowing. We chose what we considered a “reasonable value” of max error for
each dataset, and then we bracketed it with 6 values separated by powers of
two. The lowest of these values tends to produce an over-fragmented approx-
imation, and the highest tends to produce a very coarse approximation. So
in general, the performance in the mid-range of the 6 values should be
considered most important. Figure 4 illustrates this idea.
(Figure panels, from finest to coarsest: max_error = E × 2^1 through E × 2^6, labeled as too fine an approximation, a “correct” approximation, and too coarse an approximation.)
Fig. 4. We are most interested in comparing the segmentation algorithms at the set-
ting of the user-defined threshold max_error that produces an intuitively correct level
of approximation. Since this setting is subjective, we chose a value for E such that
max_error = E × 2^i (i = 1 to 6) brackets the range of reasonable approximations.
Since we are only interested in the relative performance of the algo-
rithms, for each setting of max error on each data set, we normalized the
performance of the 3 algorithms by dividing by the error of the worst per-
forming approach.
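The normalization just described is a one-liner in code; this hypothetical helper divides each algorithm's total error by that of the worst performer on a given (dataset, max_error) setting:

```python
def normalize_scores(total_errors):
    """Divide each algorithm's total error by that of the worst performer."""
    worst = max(total_errors.values())
    return {name: err / worst for name, err in total_errors.items()}

# Hypothetical totals for one (dataset, max_error) setting:
scores = normalize_scores({"sliding_window": 8.0, "top_down": 5.0, "bottom_up": 4.0})
# The worst algorithm always normalizes to 1.0.
```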
3.2. Experimental Results
The experimental results are summarized in Figure 5. The most obvious
result is the generally poor quality of the Sliding Windows algorithm. With
a few exceptions, it is the worst performing algorithm, usually by a large
amount.
Comparing the results for Sine Cubed and Noisy Sine Cubed supports our con-
jecture that the noisier a dataset, the less difference one can expect between
algorithms. This suggests that one should exercise caution in attempting
to generalize the performance of an algorithm that has only been demon-
strated on a single noisy dataset [Qu et al. (1998), Wang and Wang (2000)].
Top-Down does occasionally beat Bottom-Up, but only by a small amount.
On the other hand, Bottom-Up often significantly outperforms Top-Down,
especially on the ECG, Manufacturing and Water Level datasets.
4. A New Approach
Given the noted shortcomings of the major segmentation algorithms, we
investigated alternative techniques. The main problem with the Sliding
Windows algorithm is its inability to look ahead, lacking the global view
of its offline (batch) counterparts. The Bottom-Up and the Top-Down
(Figure panels: Space Shuttle, Sine Cubed, Noisy Sine Cubed; ECG, Manufacturing, Water Level; Tickwise 1, Tickwise 2, Exchange Rate; Radio Waves. Each panel plots normalized error (0 to 1) against max_error = E × 2^1 through E × 2^6.)
Fig. 5. A comparison of the three major time series segmentation algorithms, on ten
diverse datasets, over a range of parameters. Each experimental result (i.e. a triplet of
histogram bars) is normalized by dividing by the performance of the worst algorithm on
that experiment.
approaches produce better results, but are offline and require scanning
the entire dataset. This is impractical, or may even be infeasible, in
a data mining context, where the data are on the order of terabytes or arrive
in continuous streams. We therefore introduce a novel approach that
captures the online nature of Sliding Windows while retaining the supe-
riority of Bottom-Up. We call our new algorithm SWAB (Sliding Window
and Bottom-up).
4.1. The SWAB Segmentation Algorithm
The SWAB algorithm keeps a buffer of size w. The buffer size should ini-
tially be chosen so that there is enough data to create about 5 or 6 segments.
Bottom-Up is applied to the data in the buffer and the leftmost segment
is reported. The data corresponding to the reported segment is removed
from the buffer and more datapoints are read in. The number of datapoints
read in depends on the structure of the incoming data. This process is per-
formed by the Best Line function, which is basically just classic Sliding
Windows. These points are incorporated into the buffer and Bottom-Up is
applied again. This process of applying Bottom-Up to the buffer, report-
ing the leftmost segment, and reading in the next “best fit” subsequence is
repeated as long as data arrives (potentially forever).
The intuition behind the algorithm is this. The Best Line function
finds data corresponding to a single segment using the (relatively poor)
Sliding Windows and gives it to the buffer. As the data moves through the
buffer the (relatively good) Bottom-Up algorithm is given a chance to refine
the segmentation, because it has a “semi-global” view of the data. By the
time the data is ejected from the buffer, the segmentation breakpoints are
usually the same as the ones the batch version of Bottom-Up would have
chosen. Table 6 shows the pseudo code for the algorithm.
Table 6. The SWAB (Sliding Window and Bottom-up) algorithm.

Algorithm Seg_TS = SWAB(max_error, seg_num)    // seg_num is a small integer, i.e. 5 or 6.
    read in w number of data points             // Enough to approximate
    lower_bound = w / 2;                        // seg_num of segments.
    upper_bound = 2 * w;
    while data at input
        T = Bottom_Up(w, max_error)             // Call the Bottom-Up algorithm.
        Seg_TS = CONCAT(Seg_TS, T(1));
        w = TAKEOUT(w, w');                     // Deletes the w' points in T(1) from w.
        if data at input                        // Add points from BEST_LINE() to w.
            w = CONCAT(w, BEST_LINE(max_error));
            {check upper and lower bound, adjust if necessary}
        else                                    // Flush approximated segments from buffer.
            Seg_TS = CONCAT(Seg_TS, (T - T(1)))
        end;
    end;

Function S = BEST_LINE(max_error)               // Returns S points to approximate
    while error ≤ max_error                     // next potential segment.
        read in one additional data point, d, into S
        S = CONCAT(S, d);
        error = approx_segment(S);
    end while;
    return S;
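The pseudocode above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: segment error is the least-squares line residual, `bottom_up` is a simplified heap-free version of Table 4, and the upper/lower-bound adjustment of the buffer is omitted for brevity:

```python
from itertools import islice

def line_error(t):
    """Least-squares line residual; segments of fewer than 3 points fit exactly."""
    n = len(t)
    if n < 3:
        return 0.0
    mx, my = (n - 1) / 2, sum(t) / n
    slope = sum((i - mx) * (y - my) for i, y in enumerate(t)) / \
        sum((i - mx) ** 2 for i in range(n))
    return sum((y - (my + slope * (i - mx))) ** 2 for i, y in enumerate(t))

def bottom_up(t, max_error):
    """Simplified (heap-free) Bottom-Up over the buffer, per Table 4."""
    segs = [list(t[i:i + 2]) for i in range(0, len(t), 2)]
    while len(segs) > 1:
        costs = [line_error(segs[i] + segs[i + 1]) for i in range(len(segs) - 1)]
        p = costs.index(min(costs))
        if costs[p] >= max_error:
            break
        segs[p] = segs[p] + segs.pop(p + 1)
    return segs

def best_line(stream, max_error):
    """Classic Sliding Windows: read points until one segment's error is exceeded."""
    s = []
    for d in stream:
        s.append(d)
        if line_error(s) > max_error:
            break
    return s

def swab(stream, max_error, buffer_size):
    """Yield segments online: Bottom-Up over a buffer, reporting the leftmost one."""
    stream = iter(stream)
    buf = list(islice(stream, buffer_size))
    while buf:
        segs = bottom_up(buf, max_error)
        yield segs[0]                  # Report (and flush) the leftmost segment.
        buf = buf[len(segs[0]):]
        buf.extend(best_line(stream, max_error))  # Refill via Sliding Windows.
```

A useful invariant of this sketch is that the concatenation of the yielded segments reproduces the input stream exactly, since every point enters the buffer once and leaves it in exactly one reported segment.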
Using the buffer allows us to gain a “semi-global” view of the dataset for
Bottom-Up. However, it is important to impose upper and lower bounds on
the size of the window. A buffer that is allowed to grow arbitrarily large
reverts our algorithm to pure Bottom-Up, while a buffer that is too small
deteriorates it to Sliding Windows, allowing excessive fragmentation to occur.
In our algorithm, we used an upper (and lower) bound of twice (and half)
the initial buffer size.
Our algorithm can be seen as operating on a continuum between the
two extremes of Sliding Windows and Bottom-Up. The surprising result
(demonstrated below) is that by allowing the buffer to contain just 5 or
6 times the data normally contained by a single segment, the algorithm
produces essentially the same results as Bottom-Up, yet is able to process
a never-ending stream of data. Our new algorithm requires only a small,
constant amount of memory, and the time complexity is a small constant
factor worse than that of the standard Bottom-Up algorithm.
4.2. Experimental Validation
We repeated the experiments in Section 3, this time comparing the new
algorithm with pure (batch) Bottom-Up and classic Sliding Windows. The
result, summarized in Figure 6, is that the new algorithm produces results
that are essentially identical to Bottom-Up. The reader may be surprised
that SWAB can sometimes be slightly better than Bottom-Up. This can
occur because SWAB explores a slightly larger search space. Every segment
in Bottom-Up must have an even number of datapoints, since it was created
by merging other segments that also had an even number of datapoints.
The only possible exception is the rightmost segment, which can have an
odd number of datapoints if the original time series had an odd length.
Since this happens multiple times for SWAB, it is effectively searching a
slightly larger search space.
5. Conclusions and Future Directions
We have presented the first extensive review and empirical comparison of time
series segmentation algorithms from a data mining perspective. We have
shown that the most popular approach, Sliding Windows, generally produces
very poor results, and that while the second most popular approach, Top-
Down, can produce reasonable results, it does not scale well. In contrast,
the least well known approach, Bottom-Up, produces excellent results and
scales linearly with the size of the dataset.
(Figure panels: Sine Cubed, Noisy Sine Cubed; ECG, Manufacturing, Water Level; Tickwise 1, Tickwise 2, Exchange Rate; Radio Waves, Space Shuttle. Each panel plots normalized error (0 to 1) against max_error = E × 2^1 through E × 2^6.)
Fig. 6. A comparison of the SWAB algorithm with pure (batch) Bottom-Up and classic
Sliding Windows, on ten diverse datasets, over a range of parameters. Each experimental
result (i.e. a triplet of histogram bars) is normalized by dividing by the performance of
the worst algorithm on that experiment.
In addition, we have introduced SWAB, a new online algorithm, which
scales linearly with the size of the dataset, requires only constant space and
produces high quality approximations of the data.
There are several directions in which this work could be expanded.
• The performance of Bottom-Up is particularly surprising given that it
explores a smaller space of representations. Because the initialization
phase of the algorithm begins with all line segments having length two,
all merged segments will also have even lengths. In contrast the two
other algorithms allow segments to have odd or even lengths. It would be
interesting to see if removing this limitation of Bottom-Up can improve
its performance further.
• For simplicity and brevity, we have assumed that the inner loop of the
SWAB algorithm simply invokes the Bottom-Up algorithm each time.
This clearly results in some computation redundancy. We believe we may
be able to reuse calculations from previous invocations of Bottom-Up,
thus achieving speedup.
Reproducible Results Statement: In the interests of competitive
scientific inquiry, all datasets and code used in this work are freely available
at the University of California Riverside, Time Series Data Mining Archive
(www.cs.ucr.edu/~eamonn/TSDMA/index.html).
References
1. Agrawal, R., Faloutsos, C., and Swami, A. (1993). Efficient Similarity Search
in Sequence Databases. Proceedings of the 4th Conference on Foundations of
Data Organization and Algorithms, pp. 69–84.
2. Agrawal, R., Lin, K.I., Sawhney, H.S., and Shim, K. (1995). Fast Similarity
Search in the Presence of Noise, Scaling, and Translation in Times-Series
Databases. Proceedings of 21th International Conference on Very Large Data
Bases, pp. 490–501.
3. Chan, K. and Fu, W. (1999). Efficient Time Series Matching by Wavelets.
Proceedings of the 15th IEEE International Conference on Data Engineering,
pp. 126–133.
4. Das, G., Lin, K. Mannila, H., Renganathan, G., and Smyth, P. (1998). Rule
Discovery from Time Series. Proceedings of the 3rd International Conference
of Knowledge Discovery and Data Mining, pp. 16–22.
5. Douglas, D.H. and Peucker, T.K. (1973). Algorithms for the Reduction of the
Number of Points Required to Represent a Digitized Line or its Caricature.
Canadian Cartographer, 10(2) December, pp. 112–122.
6. Duda, R.O. and Hart, P.E. (1973). Pattern Classification and Scene Analysis.
Wiley, New York.
7. Ge, X. and Smyth P. (2001). Segmental Semi-Markov Models for Endpoint
Detection in Plasma Etching. IEEE Transactions on Semiconductor Engi-
neering.
8. Heckbert, P.S. and Garland, M. (1997). Survey of Polygonal Surface Simpli-
fication Algorithms, Multiresolution Surface Modeling Course. Proceedings
of the 24th International Conference on Computer Graphics and Interactive
Techniques.
9. Hunter, J. and McIntosh, N. (1999). Knowledge-Based Event Detection in
Complex Time Series Data. Artificial Intelligence in Medicine, Springer,
pp. 271–280.
10. Ishijima, M., et al. (1983). Scan-Along Polygonal Approximation for Data
Compression of Electrocardiograms. IEEE Transactions on Biomedical Engi-
neering (BME), 30(11), 723–729.
11. Koski, A., Juhola, M., and Meriste, M. (1995). Syntactic Recognition of ECG
Signals By Attributed Finite Automata. Pattern Recognition, 28(12), 1927–
1940.
12. Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S. (2000). Dimen-
sionality Reduction for Fast Similarity Search in Large Time Series
Databases. Journal of Knowledge and Information Systems, 3(3), 263–286.
13. Keogh, E. and Pazzani, M. (1998). An Enhanced Representation of Time
Series which Allows Fast and Accurate Classification, Clustering and Rele-
vance Feedback. Proceedings of the 4th International Conference of Knowl-
edge Discovery and Data Mining, AAAI Press, pp. 239–241.
14. Keogh, E. and Pazzani, M. (1999). Relevance Feedback Retrieval of Time
Series Data. Proceedings of the 22th Annual International ACM-SIGIR Con-
ference on Research and Development in Information Retrieval, pp. 183–190.
15. Keogh, E. and Smyth, P. (1997). A Probabilistic Approach to Fast Pattern
Matching in Time Series Databases. Proceedings of the 3rd International Con-
ference of Knowledge Discovery and Data Mining, pp. 24–20.
16. Last, M., Klein, Y., and Kandel, A. (2001). Knowledge Discovery in Time
Series Databases. IEEE Transactions on Systems, Man, and Cybernetics,
31B(1), 160–169.
17. Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., and Allan, J.
(2000). Mining of Concurrent Text and Time Series. Proceedings of the 6th
International Conference on Knowledge Discovery and Data Mining, 37–44.
18. Li, C., Yu, P., and Castelli, V. (1998). MALM: A Framework for Mining
Sequence Database at Multiple Abstraction Levels. Proceedings of the 9th
International Conference on Information and Knowledge Management, pp.
267–272.
19. McKee, J.J., Evans, N.E., and Owens, F.J. (1994). Efficient Implementation of
the Fan/SAPA-2 Algorithm Using Fixed Point Arithmetic. Automedica, 16,
109–117.
20. Osaki, R., Shimada, M., and Uehara, K. (1999). Extraction of Primitive
Motion for Human Motion Recognition. Proceedings of the 2nd International
Conference on Discovery Science, pp. 351–352.
21. Park, S., Kim, S.W., and Chu, W.W. (2001). Segment-Based Approach for
Subsequence Searches in Sequence Databases. Proceedings of the 16th ACM
Symposium on Applied Computing, pp. 248–252.
22. Park, S., Lee, D., and Chu, W.W. (1999). Fast Retrieval of Similar Subse-
quences in Long Sequence Databases. Proceedings of the 3rd IEEE Knowledge
and Data Engineering Exchange Workshop.
23. Pavlidis, T. (1976). Waveform Segmentation Through Functional Approxi-
mation. IEEE Transactions on Computers, pp. 689–697.
24. Perng, C., Wang, H., Zhang, S., and Parker, S. (2000). Landmarks: A New
Model for Similarity-Based Pattern Querying in Time Series Databases. Pro-
ceedings of 16th International Conference on Data Engineering, pp. 33–45.
25. Qu, Y., Wang, C., and Wang, S. (1998). Supporting Fast Search in
Time Series for Movement Patterns in Multiples Scales, Proceedings of the
7th International Conference on Information and Knowledge Management,
pp. 251–258.
26. Ramer, U. (1972). An Iterative Procedure for the Polygonal Approximation
of Planar Curves. Computer Graphics and Image Processing, 1, 244–256.
27. Shatkay, H. (1995). Approximate Queries and Representations for Large
Data Sequences. Technical Report cs-95-03, Department of Computer Sci-
ence, Brown University.
28. Shatkay, H. and Zdonik, S. (1996). Approximate Queries and Representa-
tions for Large Data Sequences. Proceedings of the 12th IEEE International
Conference on Data Engineering, pp. 546–553.
29. Sugiura, N. and Ogden, R.T. (1994). Testing Change-Points with Linear
Trend. Communications in Statistics B: Simulation and Computation, 23,
287–322.
30. Vullings, H.J.L.M., Verhaegen, M.H.G., and Verbruggen, H.B. (1997). ECG
Segmentation Using Time-Warping. Proceedings of the 2nd International
Symposium on Intelligent Data Analysis, pp. 275–286.
31. Wang, C. and Wang, S. (2000). Supporting Content-Based Searches on Time
Series Via Approximation. Proceedings of the 12th International Conference
on Scientific and Statistical Database Management, pp. 69–81.
CHAPTER 2
A SURVEY OF RECENT METHODS FOR EFFICIENT
RETRIEVAL OF SIMILAR TIME SEQUENCES
Magnus Lie Hetland
Norwegian University of Science and Technology
Sem Sælands vei 7–9
NO-7491 Trondheim, Norway
E-mail: magnus@hetland.org
Time sequences occur in many applications, ranging from science and
technology to business and entertainment. In many of these applica-
tions, searching through large, unstructured databases based on sample
sequences is often desirable. Such similarity-based retrieval has attracted
a great deal of attention in recent years. Although several different
approaches have appeared, most are based on the common premise of
dimensionality reduction and spatial access methods. This chapter gives
an overview of recent research and shows how the methods fit into a
general context of signature extraction.
Keywords: Information retrieval; sequence databases; similarity search;
spatial indexing; time sequences.
1. Introduction
Time sequences arise in many applications—any applications that involve
storing sensor inputs, or sampling a value that changes over time. A problem
which has received an increasing amount of attention lately is the problem
of similarity retrieval in databases of time sequences, so-called “query by
example.” Some uses of this are [Agrawal et al. (1993)]:
• Identifying companies with similar patterns of growth.
• Determining products with similar selling patterns.
• Discovering stocks with similar movement in stock prices.
24 M. L. Hetland
• Finding out whether a musical score is similar to one of a set of copy-
righted scores.
• Finding portions of seismic waves that are not similar to spot geological
irregularities.
Applications range from medicine, through economy, to scientific disci-
plines such as meteorology and astrophysics [Faloutsos et al. (1994), Yi and
Faloutsos (2000)].
The running times of simple algorithms for comparing time sequences
are generally polynomial in the lengths of both sequences, typically linear or
quadratic. To find the correct offset of a query in a large database, a naive
sequential scan will require a number of such comparisons that is linear in
the length of the database. This means that, given a query of length m and
a database of length n, the search will have a time complexity of O(nm),
or even O(nm²) or worse. For large databases this is clearly unacceptable.
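To make the O(nm) figure concrete, here is a naive whole-database scan sketched in Python (using Euclidean distance, one of the basic measures covered in the Appendix); each of the n − m + 1 offsets pays an O(m) comparison:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def naive_scan(db, query, eps):
    """Check every offset: n - m + 1 windows, each an O(m) comparison."""
    m = len(query)
    return [i for i in range(len(db) - m + 1)
            if euclidean(db[i:i + m], query) <= eps]

hits = naive_scan([0, 1, 2, 3, 2, 1, 0, 1, 2, 3], [1, 2, 3], 0.5)
# Offsets 1 and 7 match the query exactly.
```

The indexing methods surveyed below exist precisely to avoid this exhaustive scan.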
Many methods are known for performing this sort of query in the domain
of strings over finite alphabets, but with time sequences there are a few extra
issues to deal with:
• The range of values is not generally finite, or even discrete.
• The sampling rate may not be constant.
• The presence of noise in various forms makes it necessary to support very
flexible similarity measures.
This chapter describes some of the recent advances that have been made
in this field; methods that allow for indexing of time sequences using flexible
similarity measures that are invariant under a wide range of transformations
and error sources.
The chapter is structured as follows: Section 2 gives a more formal
presentation of the problem of similarity-based retrieval and the so-called
dimensionality curse; Section 3 describes the general approach of signature
based retrieval, or shrink and search, as well as three specific methods using
this approach; Section 4 shows some other approaches, while Section 5
concludes the chapter. Finally, Appendix gives an overview of some basic
distance measures.1
1The term “distance” is used loosely in this paper. A distance measure is simply the
inverse of a similarity measure and is not required to obey the metric axioms.
1.1. Terminology and Notation
A time sequence x = ⟨x1 = (v1, t1), . . . , xn = (vn, tn)⟩ is an ordered col-
lection of elements xi, each consisting of a value vi and a timestamp ti.
Abusing the notation slightly, the value of xi may be referred to as xi.
For some retrieval methods, the values may be taken from a finite class
of values [Mannila and Ronkainen (1997)], or may have more than one
dimension [Lee et al. (2000)], but it is generally assumed that the values
are real numbers. This assumption is a requirement for most of the methods
described in this chapter.
The only requirement of the timestamps is that they be non-decreasing
(or, in some applications, strictly increasing) with respect to the sequence
indices:
ti ≤ tj ⇔ i ≤ j. (1)
In some methods, an additional assumption is that the elements are
equi-spaced: for every two consecutive elements xi and xi+1 we have

ti+1 − ti = ∆, (2)

where ∆ (the sampling interval of x) is a positive constant. If the actual
sampling rate is not important, ∆ may be normalized to 1, and t1 to 0. It
is also possible to resample the sequence to make the elements equi-spaced,
when required.
The length of a time sequence x is its cardinality, written as |x|. The
contiguous subsequence of x containing elements xi to xj (inclusive) is
written xi:j. A signature of a sequence x is some structure that somehow
represents x, yet is simpler than x. In the context of this chapter, such
a signature will always be a vector of fixed size k. (For a more thorough
discussion of signatures, see Section 3.) Such a signature is written x̃. For
a summary of the notation, see Table 1.
Table 1. Notation.

x       A sequence
x̃       A signature of x
xi      Element number i of x
xi:j    Elements i to j (inclusive) of x
|x|     The length of x
26 M. L. Hetland
2. The Problem
The problem of retrieving similar time sequences may be stated as follows:
Given a sequence q, a set of time sequences X, a (non-negative) distance
measure d, and a tolerance threshold ε, find the set R of sequences closer
to q than ε, or, more precisely:

R = {x ∈ X | d(q, x) ≤ ε}. (3)
Alternatively, one might wish to find the k nearest neighbours of q, which
amounts to setting ε so that |R| = k. The parameter ε is typically supplied
by the user, while the distance function d is domain-dependent. Several
distance measures will be described rather informally in this chapter. For
more formal definitions, see Appendix.
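As a sketch, Equation (3) and the k-nearest-neighbour variant can be written as plain linear scans (the helper names below are mine, not from the chapter):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length value sequences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def range_query(X, q, d, eps):
    """R = {x in X : d(q, x) <= eps}, i.e. Equation (3) as a linear scan."""
    return [x for x in X if d(q, x) <= eps]

def knn(X, q, d, k):
    """The k nearest neighbours of q: the result of shrinking eps until |R| = k."""
    return sorted(X, key=lambda x: d(q, x))[:k]
```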
Figure 1 illustrates the problem for Euclidean distance in two
dimensions. In this example, the vector x will be included in the result
set R, while y will not.
A useful variation of the problem is to find a set of subsequences of the
sequences in X. This, in the basic case, requires comparing q not only to
all elements of X, but to all possible subsequences.2
If a method retrieves a subset S of R, the wrongly dismissed sequences
in R − S are called false dismissals. Conversely, if S is a superset of R, the
sequences in S − R are called false alarms.
Fig. 1. Similarity retrieval.
2Except in the description of LCS in Appendix, subsequence means contiguous
subsequence, or segment.
2.1. Robust Distance Measures
The choice of distance measure is highly domain dependent, and in some
cases a simple Lp norm such as Euclidean distance may be sufficient.
However, in many cases, this may be too brittle [Keogh and Pazzani (1999b)]
since it does not tolerate such transformations as scaling, warping, or
translation along either axis. Many of the newer retrieval methods focus on
using more robust distance measures, which are invariant under such
transformations as time warping (see Appendix for details) without loss of
performance.
2.2. Good Indexing Methods
Faloutsos et al. (1994) list the following desirable properties for an indexing
method:
(i) It should be faster than a sequential scan.
(ii) It should incur little space overhead.
(iii) It should allow queries of various length.
(iv) It should allow insertions and deletions without rebuilding the index.
(v) It should be correct: No false dismissals must occur.
To achieve high performance, the number of false alarms should also be
low. Keogh et al. (2001b) add the following criteria to the list above:
(vi) It should be possible to build the index in reasonable time.
(vii) The index should preferably be able to handle more than one distance
measure.
2.3. Spatial Indices and the Dimensionality Curse
The general problem of similarity based retrieval is well known in the field of
information retrieval, and many indexing methods exist to process queries
efficiently [Baeza-Yates and Ribeiro-Neto (1999)]. However, certain
properties of time sequences make the standard methods unsuitable. The fact
that the value ranges of the sequences usually are continuous, and that
the elements may not be equi-spaced, makes it difficult to use standard
text-indexing techniques such as suffix-trees. One of the most promising
techniques is multidimensional indexing (R-trees [Guttman (1984)], for
instance), in which the objects in question are multidimensional vectors,
and similar objects can be retrieved in sublinear time. One requirement of
such spatial access methods is that the distance measure must be monotonic
in all dimensions, usually satisfied through the somewhat stricter
requirement of the triangle inequality (d(x, z) ≤ d(x, y) + d(y, z)).
One important problem that occurs when trying to index sequences with
spatial access methods is the so-called dimensionality curse: Spatial indices
typically work only when the number of dimensions is low [Chakrabarti
and Mehrotra (1999)]. This makes it unfeasible to code the entire sequence
directly as a vector in an indexed space.
The general solution to this problem is dimensionality reduction: to
condense the original sequences into signatures in a signature space of low
dimensionality, in a manner which, to some extent, preserves the distances
between them. One can then index the signature space.
3. Signature Based Similarity Search
A time sequence x of length n can be considered a vector or point in an
n-dimensional space. Techniques exist (spatial access methods, such as the
R-tree and variants [Chakrabarti and Mehrotra (1999), Wang and Perng
(2001), Sellis et al. (1987)]) for indexing such data. The problem is that
the performance of such methods degrades considerably even for relatively
low dimensionalities [Chakrabarti and Mehrotra (1999)]; the number of
dimensions that can be handled is usually several orders of magnitude lower
than the number of data points in a typical time sequence.
A general solution described by Faloutsos et al. (1994; 1997) is to extract
a low-dimensional signature from each sequence, and to index the signature
space. This shrink and search approach is illustrated in Figure 2.
Fig. 2. The signature based approach.
An important result given by Faloutsos et al. (1994) is the proof that in
order to guarantee completeness (no false dismissals), the distance function
used in the signature space must underestimate the true distance measure,
or:

dk(x̃, ỹ) ≤ d(x, y). (4)
This requirement is called the bounding lemma. Assuming that (4)
holds, an intuitive way of stating the resulting situation is: “if two
signatures are far apart, we know the corresponding [sequences] must also be
far apart” [Faloutsos et al. (1997)]. This, of course, means that there will be
no false dismissals. To minimise the number of false alarms, we want dk to
approximate d as closely as possible. The bounding lemma is illustrated in
Figure 3.
This general method of dimensionality reduction may be summed up as
follows [Keogh et al. (2001b)]:
1. Establish a distance measure d from a domain expert.
2. Design a dimensionality reduction technique to produce signatures of
length k, where k can be efficiently handled by a standard spatial access
method.
3. Produce a distance measure dk over the k-dimensional signature space,
and prove that it obeys the bounding condition (4).
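The three steps above suggest a filter-and-refine query procedure, sketched here with the signature and distance functions left as caller-supplied placeholders:

```python
def filter_and_refine(X, q, d, dk, signature, eps):
    """Signature-based range query.

    Because dk underestimates d (the bounding lemma), discarding a sequence
    whose signature distance exceeds eps can never cause a false dismissal;
    the refinement step then removes the false alarms that survive the filter.
    """
    q_sig = signature(q)
    candidates = [x for x in X if dk(q_sig, signature(x)) <= eps]  # filter step
    return [x for x in candidates if d(q, x) <= eps]               # refinement step
```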
In some applications, the requirement in (4) is relaxed, allowing for a
small number of false dismissals in exchange for increased performance.
Such methods are called approximate.
The dimensionality reduction may in itself be used to speed up the
sequential scan, and some methods (such as the piecewise linear
approximation of Keogh et al., which is described in Section 4.2) rely only
on this, without using any index structure.
Fig. 3. An intuitive view of the bounding lemma.
Methods exist for finding signatures of arbitrary objects, given the
distances between them [Faloutsos and Lin (1995), Wang et al. (1999)], but
in the following I will concentrate on methods that exploit the structure of
the time series to achieve good approximations.
3.1. A Simple Example
As an example of the signature based scheme, consider the two sequences
shown in Figure 4.
The sequences, x and y, are compared using the L1 measure (Manhattan
distance), which is simply the sum of the absolute distances between each
distance), which is simply the sum of the absolute distances between each
aligning pair of values. A simple signature in this scheme is the prefix of
length 2, as indicated by the shaded area in the figure. As shown in Figure 5,
these signatures may be interpreted as points in a two-dimensional plane,
which can be indexed with some standard spatial indexing method. It is
also clear that the signature distance will underestimate the real distance
between the sequences, since the remaining summands of the real distance
must all be positive.
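This observation is easy to check numerically; the values below are invented for illustration:

```python
def l1(x, y):
    """Manhattan (L1) distance between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))

x = [2.0, 3.5, 4.1, 3.0, 2.2]
y = [2.5, 3.0, 5.0, 2.0, 2.5]

full_distance = l1(x, y)               # the sum over all five aligned pairs
signature_distance = l1(x[:2], y[:2])  # only the first two summands

# The dropped summands are absolute values, hence non-negative:
assert signature_distance <= full_distance
```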
Fig. 4. Comparing two sequences.
Fig. 5. A simple signature distance.
Fig. 6. An example time sequence.
Although correct, this simple signature extraction technique is not
particularly precise. The signature extraction methods introduced in the
following sections take into account more information about the full
sequence shape, and therefore lead to fewer false alarms.
Figure 6 shows a time series containing measurements of atmospheric
pressure. In the following three sections, the methods described will be
applied to this sequence, and the resulting simplified sequence (recon-
structed from the extracted signature) will be shown superimposed on the
original.
3.2. Spectral Signatures
Some of the methods presented in this section are not very recent, but
introduce some of the main concepts used by newer approaches.
Agrawal et al. (1993) introduce a method called the F-index in which a
signature is extracted from the frequency domain of a sequence. Underlying
their approach are two key observations:
• Most real-world time sequences can be faithfully represented by their
strongest Fourier coefficients.
• Euclidean distance is preserved in the frequency domain (Parseval’s
Theorem [Shatkay (1995)]).
Based on this, they suggest performing the Discrete Fourier Transform
on each sequence, and using a vector consisting of the sequence’s k first
amplitude coefficients as its signature. Euclidean distance in the
signature space will then underestimate the real Euclidean distance between
the sequences, as required.
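A minimal sketch of such a spectral signature, using a plain orthonormal DFT. (I keep the complex coefficients rather than only their amplitudes, which makes the lower-bounding property immediate; the published scheme differs in such details.)

```python
import cmath
import math

def dft_signature(x, k):
    """First k coefficients of the orthonormal DFT of x (1/sqrt(n) scaling)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            / math.sqrt(n)
            for f in range(k)]

def signature_distance(sx, sy):
    """Euclidean distance between (complex) signature vectors."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(sx, sy)))
```

With the orthonormal scaling, Parseval’s Theorem makes the distance over all n coefficients equal to the time-domain Euclidean distance, so truncating to the first k coefficients can only drop non-negative terms, giving the required underestimate.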
Figure 7 shows an approximated time sequence, reconstructed from a
signature consisting of the original sequence’s ten first Fourier components.
Fig. 7. A sequence reconstructed from a spectral signature.

This basic method allows only for whole-sequence matching. In 1994,
Faloutsos et al. introduce the ST-index, an improvement on the F-index
that makes subsequence matching possible. The main steps of the approach
are as follows:
1. For each position in the database, extract a window of length w, and
create a spectral signature (a point) for it.
Each point will be close to the previous, because the contents of the
sliding window change slowly. The points for one sequence will therefore
constitute a trail in signature space.
2. Partition the trails into suitable (multidimensional) Minimal Bounding
Rectangles (MBRs), according to some heuristic.
3. Store the MBRs in a spatial index structure.
To search for subsequences similar to a query q of length w, simply
look up all MBRs that intersect a hypersphere with radius ε around the
signature point q̃. This is guaranteed not to produce any false dismissals,
because if a point is within a radius of ε of q̃, it cannot possibly be
contained in an MBR that does not intersect the hypersphere.
To search for sequences longer than w, split the query into w-length
segments, search for each of them, and intersect the result sets. Because
a sequence in the result set R cannot be closer to the full query sequence
than it is to any one of the window signatures, it has to be close to all of
them, that is, contained in all the result sets.
These two papers [Agrawal et al. (1993) and Faloutsos et al. (1994)]
are seminal; several newer approaches are based on them. For example,
Rafiei and Mendelzon (1997) show how the method can be made more
robust by allowing various transformations in the comparison, and Chan
and Fu (1999) show how the Discrete Wavelet Transform (DWT) can be
used instead of the Discrete Fourier Transform (DFT), and that the DWT
method is empirically superior. See Wu et al. (2000) for a comparison
between similarity search based on DFT and DWT.
3.3. Piecewise Constant Approximation
An approach independently introduced by Yi and Faloutsos (2000) and
Keogh et al. (2001b), Keogh and Pazzani (2000) is to divide each sequence
into k segments of equal length, and to use the average value of each
segment as a coordinate of a k-dimensional signature vector. Keogh et al. call
the method Piecewise Constant Approximation, or PCA. This deceptively
simple dimensionality reduction technique has several advantages [Keogh
et al. (2001b)]: The transform itself is faster than most other transforms,
it is easy to understand and implement, it supports more flexible distance
measures than Euclidean distance, and the index can be built in linear time.
Figure 8 shows an approximated time sequence, reconstructed from a
ten-dimensional PCA signature.
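A sketch of the PCA signature and a companion lower-bounding distance of the kind described, assuming for simplicity that the sequence length is divisible by k:

```python
def pca_signature(x, k):
    """Piecewise Constant Approximation: the mean of each of k equal-length
    segments of x (len(x) is assumed to be divisible by k)."""
    seg = len(x) // k
    return [sum(x[i * seg:(i + 1) * seg]) / seg for i in range(k)]

def pca_distance(sx, sy, n):
    """Distance in signature space, scaled by the segment length n/k.
    It underestimates Euclidean distance because, within each segment, the
    squared difference of means is at most the mean of the squared
    differences."""
    seg = n // len(sx)
    return (seg * sum((a - b) ** 2 for a, b in zip(sx, sy))) ** 0.5
```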
Yi and Faloutsos (2000) also show that this signature can be used with
arbitrary Lp norms without changing the index structure, which is
something no previous method [such as Agrawal et al. (1993; 1995), Faloutsos
et al. (1994; 1997), Rafiei and Mendelzon (1997), or Yi et al. (1998)] could
accomplish. This means that the distance measure may be specified by the
user. Preprocessing to make the index more robust in the face of such
transformations as offset translation, amplitude scaling, and time scaling
can also be performed.
Keogh et al. demonstrate that the representation can also be used with
the so-called weighted Euclidean distance, where each part of the sequence
has a different weight.
Empirically, the PCA methods seem promising: Yi and Faloutsos
demonstrate up to a ten times speedup over methods based on the discrete
wavelet transform. Keogh et al. do not achieve similar speedups, but point
to the fact that the structure allows for more flexible distance measures
than many of the competing methods.
Fig. 8. A sequence reconstructed from a PCA signature.

Keogh et al. (2001a) later propose an improved version of the PCA, the
so-called Adaptive Piecewise Constant Approximation, or APCA. This is
similar to the PCA, except that the segments need not be of equal length.
Thus regions with great fluctuations may be represented with several short
segments, while reasonably featureless regions may be represented with
fewer, long segments. The main contribution of this representation is that
it is a more effective compression than the PCA, while still representing the
original faithfully.
Two distance measures are developed for the APCA, one which is
guaranteed to underestimate Euclidean distance, and one which can be
calculated more efficiently, but which may generate some false dismissals. It
is also shown that this technique, like the PCA, can handle arbitrary Lp
norms. The empirical data suggest that the APCA outperforms both methods
based on the discrete Fourier transform, and methods based on the discrete
wavelet transform with a speedup of one to two orders of magnitude.
In a recent paper, Keogh (2002) develops a distance measure that is
a lower bound for dynamic time warping, and uses the PCA approach to
index it. The distance measure is based on the assumption that the allowed
warping is restricted, which is often the case in real applications. Under this
assumption, Keogh constructs two warped versions of the sequence to be
indexed: An upper and a lower limit. The PCA signatures of these limits are
then extracted, and together with Keogh’s distance measure form an exact
index (one with no false dismissals) with high precision. Keogh performs
extensive empirical experiments, and his method clearly outperforms any
other existing method for indexing time warping.
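The envelope idea can be sketched as follows; this follows the spirit of Keogh’s bound, with details such as the exact window handling simplified:

```python
def envelope(q, r):
    """Upper and lower envelopes of q under a warping window of width r:
    the running max/min over a window of +/- r positions."""
    n = len(q)
    upper = [max(q[max(0, i - r):i + r + 1]) for i in range(n)]
    lower = [min(q[max(0, i - r):i + r + 1]) for i in range(n)]
    return upper, lower

def lb_keogh(q, c, r):
    """Lower bound on the (window-restricted) time-warping distance between
    q and c: only the parts of c that fall outside q's envelope contribute."""
    upper, lower = envelope(q, r)
    total = 0.0
    for ci, u, l in zip(c, upper, lower):
        if ci > u:
            total += (ci - u) ** 2
        elif ci < l:
            total += (l - ci) ** 2
    return total ** 0.5
```

With r = 0 the envelope collapses to q itself and the bound degenerates to plain Euclidean distance; widening the window loosens (never raises) the bound.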
3.4. Landmark Methods
In 1997, Keogh and Smyth introduce a probabilistic method for sequence
retrieval, where the features extracted are characteristic parts of the
sequence, so-called feature shapes. Keogh (1997) uses a similar landmark
based technique. Both these methods also use the dimensionality reduction
technique of piecewise linear approximation (see Section 4.2) as a
preprocessing step. The methods are based on finding similar landmark
features (or shapes) in the target sequences, ignoring shifting and scaling
within given limits. The technique is shown to be significantly faster than
sequential scanning (about an order of magnitude), which may be accounted
for by the compression of the piecewise linear approximation. One of the
contributions of the method is that it is one of the first that allows some
longitudinal scaling.
Fig. 9. A landmark approximation.

A more recent paper by Perng et al. (2000) introduces a more general
landmark model. In its most general form, the model allows any point of
great importance to be identified as a landmark. The specific form used
in the paper defines an nth order landmark of a one-dimensional function
to be a point where the function’s nth derivative is zero. Thus, first-order
landmarks are extrema, second-order landmarks are inflection points, and so
forth. A smoothing technique is also introduced, which lets certain landmarks
be overshadowed by others. For instance, local extrema representing small
fluctuations may not be as important as a global maximum or minimum.
Figure 9 shows an approximated time sequence, reconstructed from a
twelve-dimensional landmark signature.
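As an illustration, first-order landmarks and a crude smoothing pass might be sketched like this (the amplitude-threshold smoothing rule is my own simplification, not the one from the paper):

```python
def first_order_landmarks(x):
    """Indices of local extrema: interior points where consecutive
    differences change sign. The endpoints are kept as landmarks too."""
    marks = [0]
    for i in range(1, len(x) - 1):
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            marks.append(i)
    marks.append(len(x) - 1)
    return marks

def smooth(x, marks, min_delta):
    """Drop landmarks whose amplitude change (relative to the last kept
    landmark) is below min_delta, so small fluctuations are overshadowed
    by larger features."""
    kept = [marks[0]]
    for i in marks[1:]:
        if abs(x[i] - x[kept[-1]]) >= min_delta:
            kept.append(i)
    return kept
```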
One of the main contributions of Perng et al. (2000) is to show that for
suitable selections of landmark features, the model is invariant with respect
to the following transformations:
• Shifting
• Uniform amplitude scaling
• Uniform time scaling
• Non-uniform time scaling (time warping)
• Non-uniform amplitude scaling
It is also possible to allow for several of these transformations at once,
by using the intersection of the features allowed for each of them. This
makes the method quite flexible and robust, although as the number of
transformations allowed increases, the number of features will decrease;
consequently, the index will be less precise.
A particularly simple landmark based method (which can be seen as a
special case of the general landmark method) is introduced by Kim et al.
(2001). They show that by extracting the minimum, maximum, and the
first and last elements of a sequence, one gets a (rather crude) signature
that is invariant to time warping. However, since time warping distance
does not obey the triangle inequality [Yi et al. (1998)], it cannot be used
directly. This problem is solved by developing a new distance measure that
underestimates the time warping distance while simultaneously satisfying
the triangle inequality. Note that this method does not achieve results
comparable to those of Keogh (2002).
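A sketch of this four-feature signature together with a max-of-differences distance of the kind described (the exact distance function of Kim et al. may differ in detail):

```python
def kim_signature(x):
    """The four-feature signature of Kim et al. (2001):
    (first element, last element, minimum, maximum)."""
    return (x[0], x[-1], min(x), max(x))

def kim_distance(sq, sx):
    """Maximum componentwise difference between two signatures. Each
    component individually lower-bounds the warping distance (with cost
    taken as the sum of absolute differences), so their maximum does too;
    being an L-infinity metric on feature vectors, it also satisfies the
    triangle inequality."""
    return max(abs(a - b) for a, b in zip(sq, sx))
```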
4. Other Approaches
Not all recent methods rely on spatial access methods. This section contains
a sampling of other approaches.
4.1. Using Suffix Trees to Avoid Redundant Computation
Baeza-Yates and Gonnet (1999) and Park et al. (2000) independently
introduce the idea of using suffix trees [Gusfield (1997)] to avoid duplicate
calculations when using dynamic programming to compare a query sequence
with other sequences in a database. Baeza-Yates and Gonnet use edit
distance (see Appendix for details), while Park et al. use time warping.
The basic idea of the approach is as follows: When comparing two
sequences x and y with dynamic programming, a subtask will be to compare
their prefixes x1:i and y1:j. If two other sequences are compared that have
identical prefixes to these (for instance, the query and another sequence
from the database), the same calculations will have to be performed again.
If a sequential search for subsequence matches is performed, the cost may
easily become prohibitive.
To avoid this, all the sequences in the database are indexed with a suffix
tree. A suffix tree stores all the suffixes of a sequence, with identical
prefixes stored only once. By performing a depth-first traversal of the
suffix tree one can access every suffix (which is equivalent to each possible
subsequence position) and backtrack to reuse the calculations that have
already been performed for the prefix that the current and the next
candidate subsequence share.
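The idea can be sketched over symbol sequences with an uncompressed trie, which shares prefixes the same way a suffix tree does (a real implementation would use a compressed suffix tree; the names here are mine):

```python
def build_suffix_trie(strings):
    """A simple (uncompressed) suffix trie: every substring of every string
    in the database corresponds to a path from the root."""
    root = {}
    for s in strings:
        for i in range(len(s)):
            node = root
            for ch in s[i:]:
                node = node.setdefault(ch, {})
    return root

def best_subsequence_distance(trie, q):
    """Minimum edit distance between q and any subsequence (substring) in
    the database. The dynamic-programming row for a shared prefix is
    computed once and reused for every branch below it."""
    best = [len(q)]                      # the empty subsequence costs |q|
    def visit(node, row):
        for ch, child in node.items():
            new = [row[0] + 1]           # one more row of the standard DP
            for j in range(1, len(q) + 1):
                cost = 0 if q[j - 1] == ch else 1
                new.append(min(new[j - 1] + 1, row[j] + 1, row[j - 1] + cost))
            best[0] = min(best[0], new[-1])
            visit(child, new)
    visit(trie, list(range(len(q) + 1)))
    return best[0]
```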
Baeza-Yates and Gonnet assume that the sequences are strings over
a finite alphabet; Park et al. avoid this assumption by classifying each
sequence element into one of a finite set of categories. Both methods achieve
subquadratic running times.
4.2. Data Reduction through Piecewise Linear
Approximation
Keogh et al. have introduced a dimensionality reduction technique using
piecewise linear approximation of the original sequence data [Keogh (1997),
Keogh and Pazzani (1998), Keogh and Pazzani (1999a), Keogh and Pazzani
(1999b), Keogh and Smyth (1997)]. This reduces the number of data
points by a compression factor typically in the range from 10 to 600 for
real data [Keogh (1997)], outperforming methods based on the Discrete
Fourier Transform by one to three orders of magnitude [Keogh and Pazzani
(1999b)]. This approximation is shown to be valid under several distance
measures, including dynamic time warping distance [Keogh and Pazzani
(1999b)]. An enhanced representation is introduced in [Keogh and Pazzani
(1998)], where every line segment in the approximation is augmented with
a weight representing its relative importance; for instance, a combined
sequence may be constructed representing a class of sequences, and some
line segments may be more representative of the class than others.
4.3. Search Space Pruning through Subsequence Hashing
Keogh and Pazzani (1999a) describe an indexing method based on hashing,
in addition to the piecewise linear approximation. An equi-spaced template
grid window is moved across the sequence, and for each position a hash key
is generated to decide into which bin the corresponding subsequence is put.
The hash key is simply a binary string, where 1 means that the sequence
is predominantly increasing in the corresponding part of the template grid,
while 0 means that it is decreasing. These bin keys may then be used during
a search, to prune away entire bins without examining their contents. To get
more benefit from the bin pruning, the bins are arranged in a best-first order.
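A sketch of such a key, using a crude end-point test per grid cell to decide “predominantly increasing” (the published method works on the piecewise linear approximation, so this is only illustrative):

```python
def hash_key(window, bits):
    """Binary hash key for a window: bit b is 1 if the subsequence is
    (by this crude end-point test) increasing over cell b of the template
    grid, and 0 if it is decreasing."""
    n = len(window)
    key = ""
    for b in range(bits):
        cell = window[b * n // bits:(b + 1) * n // bits]
        key += "1" if cell[-1] >= cell[0] else "0"
    return key
```

Windows that hash to different keys than the query land in different bins, so whole bins can be pruned without inspecting their contents.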
5. Conclusion
This chapter has sought to give an overview of recent advances in the field of
similarity based retrieval in time sequence databases. First, the problem of
similarity search and the desired properties of robust distance measures and
good indexing methods were outlined. Then, the general approach of
signature based similarity search was described. Following the general
description, three specific signature extraction approaches were discussed:
Spectral signatures, based on Fourier components (or wavelet components);
piecewise constant approximation, and the related method adaptive piecewise
constant approximation; and landmark methods, based on the extraction of
significant points in a sequence. Finally, some methods that are not based
on signature extraction were mentioned.
Although the field of time sequence indexing has received much
attention and is now a relatively mature field [Keogh et al. (2002)], there are still
areas where further research might be warranted. Two such areas are (1)
thorough empirical comparisons and (2) applications in data mining.
Exploring the Variety of Random
Documents with Different Content
Auf dem von A r i s t o t e l e s (Histor. animal. 8, 28) überlieferten
Sprichworte: ἀεὶ φέρει τι Λιβύη καινόν, immer bringt Afrika etwas
Neues beruht:
Quid novi ex Africa?
Was giebt es Neues aus Afrika?
(vrgl. A r i s t o t. de generat. animal. 2, 5, A n a x i l a s, Komödiendichter um 350
v. Chr. bei A t h e n. 14, p. 623 E., P l i n. Nat. hist. 8, 17: vulgare Graeciae
dictum: semper aliquid novi Africam afferre und N i c e p h o r u s G r e g o r a s [um
1350] Histor. Byzant., p. 805, 23, ed. Schopen).—
A r i s t o t e l e s (de anima 3, 4) sagt: ὥσπερ ἐν γραμματείῳ ᾧ
μηδὲν ὑπάρχει ἐντελεχείᾳ γεγραμμένον (wie auf einer Tafel, auf
der wirklich nichts geschrieben ist). Hierzu fügt Trendelenburg das
Wort A l e x a n d e r s a u s A p h r o d i s i a s (um 200 v. Chr.): ὁ
νοῦς ... ἐοικὼς πινακίδι ἀγράφῳ (die Vernunft, einer
unbeschriebenen Tafel gleichend), das P l u t a r c h Aussprüche d.
Philos. 4, 11 (χαρτίον, Blatt für Tafel setzend) den Stoikern
zuschrieb. Wir citieren lateinisch
Tabula rasa,
abgewischte Schreibtafel;
was nach Prantl (Gesch. d. Logik) zuerst bei Ä g i d i u s a
C o l u m n i s († 1316) vorkommt.
Tabellae rasae lesen wir zwar schon bei O v i d (Ars Amandi 1, 437) aber ohne
jene Beziehung auf Geistiges.—
A r i s t o t e l e s (Problemata 30, 1) fragt: Διὰ τί πάντες ὅσοι
περιττοὶ γεγόνασιν ἄνδρες, ἢ κατὰ φιλοσοφίαν, ἢ πολιτικὴν, ἢ
ποίησιν, ἢ τέχνας, φαίνονται μελαγχολικοὶ ὄντες ... Woher kommt
es, dass all' die Leute, die sich in der Philosophie, oder in der Politik,
oder in der Poesie, oder in den Künsten auszeichneten, offenbar
Melancholiker sind? Hieraus bildete Seneca (de tranquill, anim. 17,
10) den uns geläufigen Satz:
Nullum magnum ingenium sine mixtura dementiae
fuit.
Es hat keinen grossen Geist ohne eine Beimischung von
Wahnsinn gegeben.—
Im A r i s t o t e l e s (Oekonom. 1, 6) lesen wir: Καὶ τὸ τοῦ Πέρσου,
καὶ τὸ Λίβυος ἀπόφθεγμα εὖ ἂν ἔχοι· ὁ μὲν γὰρ ἐρωτηθεὶς τί μάλιστα
ἵππον πιαίνει,
ὁ τοῦ δεσπότου ὀφθαλμὸς
ἔφη· ὁ δὲ Λίβυος, ἐρωτηθεὶς ποία κόπρος ἀρίστη, τὰ τοῦ δεσπότου
ἴχνη, ἔφη. Sowohl des Persers, wie des Libyers Ausspruch ist gut,
denn Jener sagte auf die Frage, was ein Pferd am Besten mäste:
Das Auge des Herrn;
während der Libyer auf die Frage, welcher Dünger am Besten sei,
sagte: des Herrn Fussstapfen. C o l u m e l l a (4, 18) vermengt diese
Worte, indem er schreibt: oculos et vestigia domini res agro
saluberrimas, die Augen und Fussstapfen des Herrn seien die
heilsamsten Dinge für den Acker, und P l i n i u s (Nat. hist., 18, 2)
kürzt dies also: majores fertilissimum in agro
oculum domini
esse dixerunt.—Die Altvordern sagten, am fruchtbringendsten für
den Acker sei das Auge des Herrn.—
Im A r i s t o t e l e s (Analyt. prior. B. 18 p. 66 ed. Bekker) steht: Ὁ
δὲ ψευδὴς λόγος γίνεται παρὰ τὸ πρῶτον ψεῦδος, der falsche Satz
entspringt dem falschen Grundgedanken oder die falsche
Conclusion der falschen Prämisse. Hieraus stammt für
Grundirrtum
Das πρῶτον ψεῦδος,
das wir jedoch nach dem Sprachgebrauch, der ψεῦδος nicht als
Irrtum sondern als absichtliche Täuschung nimmt, oft als
Grundbetrug oder Urlüge aufzufassen und theologisch
anzuwenden geneigt sind.—
Theophrast (um 372-287 v. Chr.) pflegte (nach Diogen. Laërt. V.
2 n. 10, 40) zu sagen: πολυτελὲς ἀνάλωμα εἶναι τὸν χρόνον, Zeit
sei eine kostbare Ausgabe. Hieraus scheint hergeleitet:
Zeit ist Geld,
was wir auch englisch ausdrücken:
Time is money.
In Bacons Essayes (Of Dispatch 1620) heisst es: Time is the
measure of business, as money is of wares: and business is bought
at a deare hand, where there is small dispatch (Zeit ist der
Arbeitmesser, wie Geld der Waarenmesser ist: und Arbeit wird teuer,
wenn man nicht sehr eilt).—
Der Redner Pytheas (um 340 v. Chr.) sagte (nach Plutarch
Staatslehren 6 n. Demosthenes 8, sowie nach Aelian variae
hist. 7, 7) von den Reden des von ihm unaufhörlich angefeindeten
Demosthenes, dass sie nach Lampendochten röchen (ἐλλυχνίων
ὄζειν) und noch heute sagen wir
nach der Lampe riechen
von jeder litterarischen Arbeit, welche ohne Anmut der Form
nächtliches Studium verrät.—
Bei S t o b ä u s (Serm. LXVI, p. 419. Gesn.) finden wir des
Menander (342-290 v. Chr.):
Τὸ γαμεῖν, ἐάν τις τὴν ἀλήθειαν σκοπῇ,
Κακὸν μέν ἐστιν, ἀλλ' ἀναγκαῖον κακόν.
Heiraten ist, wenn man die Wahrheit prüft,
Ein Übel, aber ein
notwendiges Übel.
M a l u m n e c e s s a r i u m, die lat. Übersetzung, steht in des L a m p r i d i u s (4.
Jahrh. n. Chr.) Alexander Severus 46.—
P l u t a r c h überliefert uns in der Trostrede an Apollonius, dessen
Sohn gestorben war, (p. 119e
; cap. 34) den Vers des M e n a n dn:
Ὃν οἱ θεοὶ φιλοῦσιν ἀποθνήσκει νέος,
den P l a u t u s (Bacch. 4, 7, 18) also übersetzt:
quem di diligunt adolescens moritur
und der bei uns zu lauten pflegt:
Wen die Götter lieben, der stirbt jung.—
M e n a n d e r s Wort ἀνεῤῥίφθω κύβος (der Würfel falle!—Überl.
v. Athenäus XIII, p. 559 c.) citierte C ä s a r, als er 49 v. Chr. den
Rubicon überschritt, in griechischer Sprache, wie Plutarch
(Pompeius, 60 und Ausspr. v. Kön. u. Feldh.) ausdrücklich
hervorhebt. Sueton hingegen lässt ihn lateinisch sagen (Caesar
32):
Alea iacta est!
Der Würfel ist gefallen!
(Erasmus verbessert: Iacta esto alea! Der Würfel falle!) Huttens
Wahlspruch (s. Kap. III) Jacta est alea hat hier seine Quelle.—
Die 422. Gnome der Monostichen des M e n a n d e r
Ὁ μὴ δαρεὶς ἄνθρωπος οὐ παιδεύεται
Wer nicht geschunden wird, wird nicht erzogen
stellte G o e t h e als Motto vor den 1. Teil seiner Selbstbiographie.—
Eine Komödie M e n a n d e r s
Ἑαυτὸν τιμωρούμενος
kam auf uns durch des Te r e n z Komödie
Heautontimorumenos,
Der Selbstpeiniger.
Die nach D i o g e n e s L a ë r t i u s (VII, 1 n. 19, 23) von dem
Stoiker Zeno (geb. 340 v. Chr.) aufgestellte (von P o r p h y r i u s im
Leben des Pythagoras aber auf diesen zurückgeführte, in
P l u t a r c h s Schrift Die Menge der Freunde und in dem
P s e u d o - A r i s t o t e l i s c h e n Buch Magna Moralia II, 15
citierte) Definition des Freundes Ἄλλος ἐγώ wenden wir an in der
lateinischen und deutschen Form:
Alter ego,
Ein zweites Ich.
Bei C i c e r o findet sich me alterum ad. fam. 7, 5, 1; ad Attic. 3, 15, 4; 4, 1,
7; Alterum me ad fam. 2, 15, 4; verus amicus est tanquam alter idem de
amic. 21, 80; bei Ausonius alter ego praef. 2, 42 (4. Jahrh. n. Chr.). Der
griechische Romanschreiber E u s t a t h i u s [6. Jahrh.? 12. Jahrh.?] sagt dreist von
sich: Ein zweites Ich; denn also bezeichne ich den Freund. H e r c h e r Erotici
Graeci 2, p. 164, 25; vrgl. 165, 18. Späterhin nahm Alter ego die Bedeutung
eines Stellvertreters der souveränen Gewalt an.—
Am Schlusse jeder Beweisführung des Mathematikers Euklid (bl.
um 300 v. Chr.) heisst es:
ὅπερ ἔδει δεῖξαι,
quod erat demonstrandum,
was zu beweisen war.—
Des (um 270 v. Chr. bl.) Philosophen Bion Witz: Εὔκολον τὴν εἰς
Ἅιδου ὁδόν· καταμύοντας γοῦν κατιέναι, der Weg zum Hades ist
leicht; man kommt ja mit geschlossenen Augen hinab (s. Diog.
Laërt. IV, c. 7, n. 3, § 49) wird von uns in der kürzeren Form des
Vergil citiert (Aen. 6, 126):
Facilis descensus Averno,
Das Hinabsteigen in die Unterwelt ist leicht;
worauf dann folgt, dass das Wiederauftauchen daraus schwer sei.—
Philo Judaeus († 54 n. Chr.) sagt (de migr. Abrahami 15, p.
449, Mangey) von den ägyptischen Zauberern: ἀπατᾶν δοκοῦντες
ἀπατῶνται (sie glaubten zu betrügen und wurden betrogen).
Danach schreibt der gern citierende Apostel P a u l u s im 2. Briefe an
Timotheus 3, 13 auch von den Magiern Ägyptens: Mit den bösen
Menschen aber und verführerischen wird es je länger je ärger,
verführen und werden verführt (πλανῶντες καὶ πλανώμενοι).
Dann sagt P o r p h y r i u s in seines Lehrers Plotin Leben (16): οἳ—
ἐξηπάτων καὶ αὐτοὶ ἠπατημένοι (die betrogen und selbst betrogen
waren) und A u g u s t i n u s (Bekenntnisse 7, 2): deceptos illos et
deceptores, und G. E. L e s s i n g (Nathan 3, 7) verdeutschte in
der Parabel von den drei Ringen das Wort also:
Betrogene Betrüger.
(vrgl. M a r g a r e t e v o n N a v a r r a in dem 1543 erschienenen Heptameron
Novelle 1, 6, 15, 23, 25, 28, 45, 51, 62; C a r d a n u s († 1576) De subtilitate,
1663, III, 551; C e r v a n t e s Don Quijote 2, 33 (1615) u. s. w.; M o s e s
M e n d e l s s o h n (Ges. Schr., 1843, III, 115; Brief vom 9. 2. 1770 an Bonnet
über eine Sekte): Wollen wir sagen, dass alle ihre Zeugen Betrogene und
Betrüger sind? Eine komische Oper von Guilet et Gaveaux (1799) heisst Le
trompeur trompé.)—
Flavius Josephus (37 n. Chr.—nach 93) sagt in seiner Schrift
Gegen Apion (II, 16) von Moses im Gegensatze zu Minos: Ὁ δὲ
ἡμέτερος νομοθέτης εἰς μὲν τούτων οὐδοτιοῦν ἀπεῖδεν, ὡς δ' ἄν τις
εἴποι βιασάμενος τὸν λόγον,
θεοκρατίαν
ἀπέδειξε τὸ πολίτευμα, Θεῷ τὴν ἀρχὴν καὶ τὸ κράτος
ἀναθείς—Unser Gesetzgeber richtete jedoch auf Alles Dieses gar
nicht sein Augenmerk; er machte die Staatsverfassung zu einer
Theokratie
(Gottesherrschaft), wenn man sich so gewaltsam ausdrücken darf,
indem er Gott die obrigkeitliche Macht beilegte.—
Einen Spruch des Epiktet (geb. um 50 n. Chr.) teilt A u l u s
G e l l i u s 17, 19, 6 in der lateinischen Form mit:
Sustine et abstine,
ἀνέχου καὶ ἀπέχου,
Leide und meide.—
Plutarch (geb. um 50 n. Chr., † 120 n. Chr.) erzählt in seiner
Biographie des L. A e m i l i u s P a u l l u s (Kap. 5), dass dieser sich
aus unbekannten Gründen von seiner Gattin, Papiria, habe scheiden
lassen. Plutarch vermutet, dass der Scheidungsgrund ein ähnlicher
gewesen sei, wie derjenige eines gewissen Römers. Dieser habe sein
Weib fortgeschickt und alsdann auf die Fragen seiner Freunde: Ist
sie denn nicht sittsam? Nicht schön von Gestalt? Schenkte sie Dir
denn keine Kinder? ihnen seinen Schuh hingestreckt und gefragt:
Ist er nicht fein? Ist er nicht neu? Aber Niemand von Euch sieht, an
welcher Stelle mein Fuss gedrückt wird (οὐκ ἂν εἰδείη τις ὑμῶν, καθ'
ὅτι θλίβεται μέρος οὑμὸς πούς). Hierauf fusst die Stelle des
H i e r o n y m u s (adv. Jovin. 1, 48): Legimus quendam apud
Romanos nobilem, cum eum amici arguerent, quare uxorem
formosam et castam et divitem repudiasset, protendisse pedem et
dixisse eis: Et hic soccus, quem cernitis, videtur vobis novus et
elegans, sed nemo scit praeter me, u b i m e p r e m a t. Hier findet
sich zuerst das bekannte Bild unseres Sprachschatzes:
Nicht wissen und wissen, wo Einen der Schuh drückt.—
Durch Lucians (um 160 n. Chr.) Abhandlung wie man Geschichte
schreiben müsse wurde die thracische Stadt
Abdera
für immer als lächerlich gebrandmarkt; und sie wurde als solche in
Deutschland berühmt durch W i e l a n d s im teutschen Merkur
1774, 1. und 2. erschienene Geschichte der
Abderiten.—
Bei Sextus Empiricus (Ende des 2. Jahrh. n. Chr.; Adversus
mathematicos, 287; Imm. Bekker, Berl. 1842; S. 665) steht:
ὀψὲ θεῶν ἀλέουσι μύλοι, ἀλέουσι δὲ λεπτά.
Lange zwar mahlen die Mühlen der Götter, doch mahlen sie
Feinmehl. (Ähnlich in Orac. Sibyll. 8, 14. ed.
Friedlieb, Lpz. 1852.)
In Eiseleins Sprichwörtern wird das Wort ohne jeglichen Beleg auf
P l u t a r c h zurückgeführt. S e b a s t i a n F r a n c k (Sprichwörter,
1541, II, 119b) führt an: Sero molunt deorum molae, Gottes Mühl
stehet oft lang still und die Götter mahlen oder scheren einen
langsam, aber wohl, ferner einige Zeilen weiter unten Der Götter
Mühl machen langsam Mehl, aber wohl, und L o g a u (1654) III, 2,
24 macht daraus:
Gottes Mühlen mahlen langsam, mahlen aber trefflich
klein.
(Ob aus Langmut er sich säumet, bringt mit Schärf er alles
ein.)
Daraus dürfte die bekannte Redensart: Langsam, aber sicher
entstanden sein.—
Plotin († 270 n. Chr.) bereichert unsere Sprache um zwei
geflügelte Worte. Wir lesen bei ihm (Enn. I, 6 p. 57; Ausg. v.
Kirchhoff I, S. 12): οὐ γὰρ πώποτε εἶδεν ὀφθαλμὸς ἥλιον, ἡλιοειδὴς
μὴ γεγενημένος, οὐδὲ τὸ καλὸν ἂν ἴδοι ψυχὴ μὴ καλὴ γενομένη,
Nie hätte das Auge je die Sonne gesehen, wäre es nicht selbst
sonnenhafter Natur; und wenn die Seele nicht schön ist, kann sie
das Schöne nicht sehen. Hieraus stammt
Schöne Seele
und der G o e t h esche Vers (1823. Zahme Xenien. Bd. 3):
Wär' nicht das Auge sonnenhaft,
Die Sonne könnt' es nie erblicken.
Mit diesem Gedanken lehnte P l o t i n sich an P l a t o an, der in
seinem Staat p. 508 sagt: Das Gesicht ist nicht die Sonne . . .
aber das sonnenähnlichste . . . unter allen Werkzeugen der
Wahrnehmung, und der ebenda weiter unten Erkenntnis und
Wahrheit, wie Licht und Gesicht, für sonnenartig erklärt.—
Julianus Apostata (331-363 n. Chr.) meint (oratio VI ed. Ez.
Spanhemius, 1696, p. 184), es dürfe nicht Wunder nehmen, dass
wir zu der, gleich der Wahrheit, einen und einzigen Philosophie auf
den verschiedensten Wegen gelangen. Denn auch wenn Einer nach
Athen reisen wolle, so könne er dahin segeln oder gehen und zwar
könne er als Wanderer die Heerstrassen benutzen oder die
Fusssteige und Richtwege und als Schiffer könne er die Küsten
entlang fahren oder wie Nestor das Meer durchschneiden. Damals
galt noch Athen als Ziel der Gebildeten, später wurde es Rom. Es
führen viele Wege nach Athen liegt im obigen Satz und mochte sich
in das uns geläufige Wort verwandeln:
Es führen viele Wege nach Rom,
wofür jedoch sichere Belege noch zu suchen sind.—
Proclus (412-485 n. Chr.) nennt in seinem Commentar zu Platos
Timaeus (154c) den οὐρανός (Himmel) die
πέμπτη οὐσία
Quintessenz
(Das fünfte Seiende)
und auch in dem Leben des Aristoteles von A m m o n i u s
(Westermann, vitarum scriptores Graeci minores, 1845, p. 401)
wird die εʹ οὐσία erwähnt. Damit ist nach Aristoteles (De mundo,
Kap. 2) der Äther gemeint, der dort ein anderes Element als die
vier, ein göttliches, unvergängliches genannt wird. (Aristot.
Meteor. 1, 3; de coelo, 1, 3; de gen. an., 2, 3.) Proclus ist die
Quelle für das Wort. Viel später jedoch wurde der heut damit
verknüpfte Begriff des feinsten Extrakts, der innersten Wesenheit
oder des Kerns einer Sache in dies Wort hineingelegt.
R a i m u n d u s L u l l u s gab 1541 sein Buch De secretis naturae
sive Quinta essentia heraus, in dem er zu Anfang des zweiten Teiles
diese Quintessenz als Allheilmittel preist, und 1570 erschien
Leonhart T h u r n e y s s e r zum Thurns Quinta essentia, das ist die
höchste Subtilitet, Krafft und Wirkung . . . . der Medicina und
Alchemia . . . . In der Vorrede stellt er die Quinta Essentz Olea
neben den Stein der Weisen, den lapis philosophorum. Im 13.
Buch nennt er sich einen Schüler des Theophrastus P a r a c e l s u s,
der also der Vater des Schwindels mit der Quintessenz sein wird,
wie er so manchen anderen Schwindels Vater gewesen ist.—
XI.
Geflügelte Worte aus lateinischen
Schriftstellern.[63]
[63] Aus diesem Kapitel (15. Aufl.) ging A. O t t o's Werk hervor: Die
Sprichwörter und sprichwörtlichen Redensarten der Römer (Lpzg., Teubner,
1890), eine vortreffliche Arbeit, der dieses Buch manchen wertvollen Aufschluss
verdankte.
Jeder ist seines Glückes Schmied
ist nach der dem S a l l u s t zugeschriebenen Schrift de republica
ordinanda 1, 1, wo es heisst: quod in carminibus Appius ait,
fabrum esse suae quemque fortunae, auf A p p i u s Claudius
(Consul 307 v. Chr.) zurückzuführen. P l a u t u s (Trin. 2, 2, 84:
sapiens ipse fingit fortunam sibi) schreibt diese Fähigkeit nur dem
Weisen zu; während ein von Cornelius N e p o s (Atticus 11, 6)
mitgeteilter Jambus eines Unbekannten wiederum aussagt:
Sui cuique mores fingunt fortunam (hominibus).
Jedes Menschen Glück schmiedet ihm sein Charakter.—
Als Citatenquelle ist Plautus (um 254-184 v. Chr.) zu erwähnen mit:
Nomen atque omen,
Name und zugleich Vorbedeutung,
aus dem Persa, 4, 4, 74, und mit dem ebenda 4, 7, 19
vorkommenden, von Te r e n z im Phormio 3, 3, 8 angewendeten
Sapienti sat (est)!
Für den Verständigen genug!
(d. h. für ihn bedarf es keiner weiteren Erklärung).—
Oleum et operam perdidi
Öl und Mühe habe ich verschwendet
kommt in des P l a u t u s Poenulus 1, 2, 119 vor und wird dort von
einer Dirne gebraucht, die sich vergebens hat putzen und salben
lassen. C i c e r o überträgt es auf Gladiatoren (Ad familiares 7, 1);
dann wird damit auf das verschwendete Öl der Studierlampe
angespielt (Cicero Ad Atticum 13, 38; Iuvenal 7, 99).—
Allgemein bekannt ist auch des P l a u t u s Komödientitel
Miles gloriosus
Der ruhmredige Kriegsmann.
Das Original dieses Stückes war von einem uns unbekannten
griechischen Dichter und hiess Ἀλαζών (der Marktschreier,
Aufschneider, Gloriosus), wie P l a u t u s (2, 1, 8 u. 9) selbst
bezeugt.—
Summa summarum,
Alles in allem,
finden wir zuerst bei P l a u t u s (Truculentus 1, 1, 4).—
Im Trinummus (5, 2, 30) des P l a u t u s heisst es:
Tunica propior pallio.
Das Hemd ist mir näher als der Rock.—
Bei P l a u t u s (Stichus 5, 4, 52; Casina 2, 3, 32) kommt
Ohe iam satis!
Oh, schon genug!
vor, das sich auch bei Horaz (Sat. 1, 5, 12) und M a r t i a l (4, 91, 6
u. 9) findet.—
Ennius (239-169 v. Chr.) wird in C i c e r o s Laelius 17, 64 citiert
mit:
Amicus certus in re incerta cernitur,
Den sicheren Freund erkennt man in unsicherer Sache.—
Schon E u r i p i d e s (Hec. 1226) sagt ähnlich:
Ἐν τοῖς κακοῖς γὰρ οἱ ἀγαθοὶ σαφέστατοι φίλοι.
Denn in der Not sind gute Freund' am sichersten.—
In 1, 1, 99 der Andria des Terenz (um 185-159 v. Chr.) erzählt
Simo, wie er sich erst über des Sohnes Pamphilus Thränen beim
Begräbnis einer Nachbarin gefreut, dann aber der Verstorbenen
hübsche Schwester unter den Leidtragenden bemerkt habe . . . .
Das fiel mir gleich auf. Haha! Das ist's!
Hinc illae lacrumae!
Daher jene Thränen!
Dies Wort wird bereits von C i c e r o (pro Caelio, c. 25) und von
H o r a z (Epistel 1, 19, 41) citiert.—
Aus 1, 2, 23 der Andria des Te r e n z ist die Antwort des Davus:
Davus sum, non Oedipus,
Davus bin ich, nicht Ödipus,
d. h. ich verstehe dich nicht, denn ich kann nicht so geschickt
Rätsel lösen wie Ödipus.—
Aus der Andria 1, 3, 13:
Inceptio est amentium, haud amantium,
Ein Beginnen von Verdrehten ist's, nicht von Verliebten,
ist in den Gebrauch übergegangen:
Amantes, amentes,
Verliebt, verdreht,
was wohl zuerst in dem Titel des 1604 in 3. Auflage erschienenen
Lustspiels Amantes amentes von G a b r i e l R o l l e n h a g e n
vorkommt. Amens amansque (verdreht und verliebt) findet sich
übrigens schon bei P l a u t u s Merc. Prolog. 81.—
Aus der Andria 2, 1, 10 und 14 ist:
Tu si hic sis, aliter sentias,
Wärst du an meiner Stelle, du würdest anders denken;
Interim fit (eigentlich: fiet) aliquid;
Unterdessen wird sich schon irgend etwas ereignen;
(in des Plautus Mercator 2, 4, 24 heisst es: aliquid fiet).—
Aus 3, 3, 23 sind die Worte:
Amantium irae amoris integratio (est)
Der Liebenden Streit die Liebe erneut,
eine Verschönerung des Menandrischen ὀργὴ φιλούντων μικρὸν
ἰσχύει χρόνον, Nicht lange währt der Zorn der Liebenden (s.
Stobäus Serm. LXI, p. 386.11); aus 4, 1, 12:
proximus sum egomet mihi,
Jeder ist sich selbst der Nächste.—
Aus dem Eunuch (Prolog 41) des Te r e n z stammt:
Nullum est iam dictum, quod non sit dictum prius,
Es giebt kein Wort mehr, das nicht schon früher gesagt ist;
(s. Goethe: Wer kann was Dummes . . .)—
Aus 4, 5, 6 kommt uns das damals schon sprichwörtliche
Sine Cerere et Libero friget Venus
Ohne Ceres und Bacchus bleibt Venus kalt.
Bereits E u r i p i d e s sagte (Bacchae, 773):
οἴνου δὲ μηκέτ' ὄντος, οὐκ ἔστιν Κύπρις.
Wo's keinen Wein mehr giebt, giebt's keine Liebe.—
In des Te r e n z Heautontimorumenos (s. auch unter: Menander)
1, 1, 25 heisst es:
Homo sum; humani nihil a me alienum puto,
Mensch bin ich; nichts, was menschlich, acht' ich mir als
fremd.
Es liegt hier wohl zweifellos die Übersetzung eines, schon im
Menanderschen Original befindlich gewesenen Wortes vor.—
Aus des Te r e n z Adelphi 4, 1, 21 citieren wir den erschreckten
Ruf des Syrus, als er Ctesiphos Vater plötzlich erblickt, über den er
gerade mit jenem spricht:
Lupus in fabula!
(C i c e r o ad. Attic. 13, 33 wendet das Wort an, das schon bei
P l a u t u s Stich. 4, 1, 71 in der Form ecce tibi lupum in sermone
vorkommt.) Zu übersetzen wäre: Wenn man vom Wolf spricht, ist er
nicht weit; doch wollen andere Ausleger den Volksglauben der Alten
hineinziehen, dass man beim Anblick eines Wolfes verstummen
müsse (s. Voss z. Vergils Ecl. 9, 54 u. Meineke zu Theokrits Id. 14,
22), da ja auch die plötzliche Ankunft dessen, von dem wir reden,
uns verstummen mache.—
Adelphi 4, 7, 21-23 heisst es:
Ita vita est hominum, quasi, cum ludas tesseris;
Si illud, quod maxume opus est iactu, non cadit,
Illud quod cecidit forte, id arte ut corrigas.
So gleicht des Menschen Leben einem Würfelspiel:
Wenn just der Wurf, den man am meisten braucht, nicht fällt,
So korrigiert man, was der Zufall gab, durch Kunst.
Aus dieser Stelle stammt
corriger la fortune
das Glück verbessern, d. h. falsch spielen, was sich in
H a m i l t o n s 1713 erschienenen Mém. d. Grammont K. 2, in
P r é v o s t s Manon Lescaut (1743) 27, 1 und auch in L e s s i n g s
Minna von Barnhelm (1767) 4, 2 findet.
M o l i è r e (1663 L'École des Femmes 4, 8) hat corriger le hazard beim
Würfelspiel, aber durch bonne conduite. In R e g n a r d s Le Joueur (1696) 1,
10 weiss Toutabas, wenn's sein muss, par un peu d'artifice d'un sort injurieux
corriger la malice; und in G. F a r q u h a r s Sir Harry W i l d a i r (1701) Akt 3 z.
A. sagt Monsieur Marquis in seinem Kauderwelsch: Fortune give de Anglis Man
de Riches, but Nature give de France Man de Politique to correct unequal
Distribution.—
Duo cum faciunt idem, non est idem,
Wenn zwei dasselbe thun, so ist es nicht dasselbe,
ist eine Verkürzung der Stelle Adelphi 5, 3, 37:
Duo cum idem faciunt, . . .,
Hoc licet impune facere huic, illi non licet.
Wenn zwei dasselbe thun, . . . so darf der Eine
es ungestraft thun, der Andere nicht.—
Aus des Te r e n z Phormio 1, 2, 18 stammt:
Montes auri pollicens;
Berge Goldes (goldene Berge) versprechen(d).
Wenn G e o r g E b e r s (Ägypten in Bild und Wort S. 17) den Komödiendichter
M e n a n d e r aus Athen an seine Geliebte schreiben lässt: Ich habe von
Ptolomäus . . . Briefe . . ., in denen er mir mit königlicher Freigebigkeit g o l d e n e
B e r g e verspricht, so ist dies nur eine freie Übersetzung von τῆς γῆς ἀγαθά, die
Güter der Erde. In des P l a u t u s Miles gloriosus 4, 2, 73 kommen aber schon
argenti montes, Berge von Silber, vor und im Stichus 1, 1, 24-5 heisst es:
Neque ille sibi mereat Persarum montes, qui esse aurei perhibentur, Und er
möchte sich die Perserberge nicht erwerben, die von Gold sein sollen. Auch
V a r r o (bei Nonius p. 379) singt von diesen Perserbergen:
Non demunt animis curas ac religiones
Persarum montes, non atria divitis Crassi;
Weder die Berge der Perser, noch Hallen des prunkenden Crassus
Können die Herzen befreien von Angst und von nagenden Skrupeln;
während der Perserkönig im A r i s t o p h a n e s (Acharn. 81) nach
achtmonatlichem Sitzen auf goldenen Bergen (ἐπὶ χρυσῶν ὀρῶν) eine Befreiung
anderer Art fand. Es scheint, als deute unser Gudrunepos (vor 1200) mit seinem
(V. 493) und waere ein berc golt, den naeme ich niht dar umbe auf eine
gemeinsame indogermanische Quelle.—
Aus des Te r e n z Phormio 2, 2, 4 ist:
Tute hoc intristi; tibi omne est exedendum,
Du hast es eingerührt; Du musst es auch ganz ausessen;
aus 2, 4, 14:
Quot homines, tot sententiae,
So viel Leute, so viel Ansichten,
was schon C i c e r o (De fin. 1, 5, 15) anführt, (vrgl. unten: Horaz
Sat. 2, 1, 27.)—
Oderint, dum metuant,
Mögen sie hassen, wenn sie nur fürchten,
aus der Tragödie Atreus des Accius (170-104 v. Chr.), citierten
bereits C i c e r o (1. Philipp. 14, 34, pr. Sest. 48, de offic. 1, 28)
und S e n e c a (Üb. d. Zorn 1, 20, 4; Üb. d. Gnade 1, 12, 4 u. 2,
2, 2). Nach S u e t o n (Calig. 30) war es ein Lieblingswort des
Kaisers Caligula.—
Bei Lucilius († 103 v. Chr.) steht (ed. Lachmann, Berl. 1877, v. 2,
ebenso bei P e r s i u s 1, 2):
Quis leget haec?
Wer wird das (Zeug) lesen?—
Auch stammt nach M a c r o b i u s (Saturnalien, 6, 1, 35)
non omnia possumus omnes
wir können nicht Alle Alles
von L u c i l i u s her und wurde von F u r i u s A n t i a s citiert.
V e r g i l verwendete es Ecloge 8, 63. H o m e r mag des Gedankens
Vater sein, denn, dass e i n e m Menschen nicht alle Gaben verliehen
seien, spricht er öfters aus (s. Iliade 4, 320; 13, 729 u. Odyssee
8, 167).—
Varro (116-27 v. Chr.) De lingua latina VII, 32 (n. Otfr. Müllers
Ausg.) sagt: Sed canes, quod latratu signum dant, ut signa canunt,
canes appellatae. Dies ist spöttisch umgestaltet worden zu:
canis a non canendo
Hund wird canis genannt, weil er nicht singt (non canit) (s.
Quintilians lucus a non lucendo).—
Auch citieren wir das von G e l l i u s (1, 22, 4 u. 13, 11, 1) als Titel
einer V a r r onischen Schrift angeführte:
Nescis, quid vesper serus vehat.
Du weisst nicht, was der späte Abend bringt.—
Cicero (106-43 v. Chr.) nennt pro Roscio Amerino, 29 die
Mordgesellen, die zu Sullas Zeiten Gutsbesitzer ermordeten und
dann deren Güter betrügerisch an sich zu bringen und vorteilhaft zu
verschachern wussten:
sectores collorum et bonorum,
Halsabschneider und Güterschlächter.—
Im Anfange der 1. Rede in Catilinam finden wir das auch bei Livius
6, 18 und bei Sallust Catilina 20, 9 vorkommende, ungeduldige
Quousque tandem . . .?
Wie lange noch . . .?—
In Ciceros Catilina 1, 1 (vrgl. Martial IX, 71), in Verrem IV, 25, 56, sowie pro
rege Deiotaro 11, 31 und de domo sua 53, 137 steht:
O tempora! O mores!
O Zeiten! O Sitten!
Im Hofmeister (1774) von R. Lenz citiert es (5, 10) der Schulmeister
Wenzeslaus, und als Refrain von Geibels Lied vom Krokodil (1840) fand es die
weiteste Verbreitung.—
In C i c e r o s Catilina 2, 1 findet sich:
Abiit, excessit, evasit, erupit.
Er ging, er machte sich fort, er entschlüpfte, er entrann.—
Videant consules ne quid res publica detrimenti
capiat,
Die Konsuln mögen dafür sorgen, dass die Republik keinen
Schaden leidet
bildete, seit man vom 6. Jahrh. an die Diktatur nicht mehr in Rom
anwenden wollte, das sogenannte senatus-consultum ultimum,
welches die Konsulargewalt zu einer diktatorischen machte (s.
C i c e r o pr. Mil. 26, 70, in Catil. I, 2, 4, Phil. 5, 12, 34, Fam.
16, 11, 3; C ä s a r de bell. civ. 1, 5, 3; 1, 7, 4; Liv. 3, 4, S a l l u s t
Catil. 29, P l u t a r c h C. Gracch. 14 u. Cic. 15.)—
Aus C i c e r o s de fin. 5, 25, 74 stammt:
Consuetudo (quasi) altera natura,
Die Gewohnheit ist (gleichsam) eine zweite Natur;
G a l e n u s (De tuenda valetudine, cap. 1) bietet die heute übliche
Form: Consuetudo est altera natura. Schon in des A r i s t o t e l e s
Rhetorik, 1370a 6 (Bekker) heisst es: die Gewohnheit ist der
Natur gewissermassen ähnlich (τὸ εἰθισμένον ὥσπερ πεφυκὸς ἤδη
γίγνεται).—
In C i c e r o s Tuscul. 1, 17, 39 heisst es:
Errare . . malo cum Platone, . . quam cum istis vera
sentire,
Lieber will ich mit Plato irren, als mit denen (den
Pythagoreern) das Wahre denken.—
Di minorum gentium
(wörtlich: Götter aus den geringeren Geschlechtern) nennen wir
die untergeordnete Schicht einer Klasse Menschen mit Beziehung auf
das maiorum gentium di (d. h. die oberen zwölf Götter bei
C i c e r o Tusc. 1, 13, 29), Bezeichnungen, die daraus entsprangen,
dass Tarquinius ausser den von Romulus berufenen patres maiorum
gentium (Senatoren aus den hervorragenden Geschlechtern) auch
patres minorum gentium (Senatoren geringerer Herkunft) berief
(vrgl. Cicero d. rep. 2, 20; Liv. 1, 35, 6 und dazu das Patrici
minorum gentium bei Cic. Fam. 9, 21 und Liv. 1, 47, 7).—
Aus C i c e r o s I. Philippica, 5, 11 und zugleich aus De finibus 4,
9, 22, (vrgl. Livius 23, 16 im Anfang, wo es in nicht übertragener
Bedeutung steht) stammt die für eine den Staat bedrohende Gefahr
gebräuchlich gewordene Wendung:
Hannibal ad (nicht: ante) portas.
Hannibal (ist) an den Thoren.
Diese Redensart, wie die Erinnerung an Catilina und an das aus
L i v i u s (XXI, 7: dum ea Romani parant consultantque, iam
Saguntum summa vi oppugnabatur) geschöpfte Wort:
Dum Roma deliberat, Saguntum perit,
Während Rom beratschlagt, geht Sagunt zu Grunde,
(auch in der Form:
Roma deliberante Saguntum perit
citiert) wurden von G o u p i l d e P r é f e l n in einer Sitzung der
konstituierenden Versammlung von 1789 zu dem unrichtigen Citate
vermischt:
Catilina est aux portes, et l'on délibère.
Er stichelte damit auf M i r a b e a u, der diesem Worte dadurch erst
recht Bahn verschaffte, dass er es in seiner berühmten Rede zur
Abwendung des Bankerotts wiederholte und variirte.—
In C i c e r o s II. Philippica 14, 35, pro Milone 12, 32 und pro
Roscio Amerino 30, 84 und 31, 86 wird das uns geläufige
cui bono?
(Wozu?)
(A quoi bon?)
eigentlich: Wem zum Nutzen? ausdrücklich als ein Wort des L.
Cassius bezeichnet. Aus der zuletzt angeführten Stelle ersehen wir,
dass L. Cassius, ein Mann von äusserster Strenge, bei den
Untersuchungen über Mord den Richtern einschärfte,
nachzuforschen, cui bono, wem zum Nutzen das Ableben des
Ermordeten war.—
Cicero spricht in seiner Rede pro Roscio Amer. 16, 47: Homines
notos sumere odiosum est, cum et illud incertum sit, velintne hi sese
nominari (angesehene Leute nennen, ist eine heikle Sache, da es
auch zweifelhaft ist, ob sie selbst genannt werden wollen). Daher
sagen wir, wenn es gescheidter ist, keine Namen zu nennen:
Nomina sunt odiosa,
Namen sind verpönt.—
Aus C i c e r o s Rede pro Milone 4, 10 ist bekannt:
Silent leges inter arma.
Im Waffenlärm schweigen die Gesetze.
L u c a n u s ahmt diese Worte (Pharsalia I, 277) also nach: Leges
bello siluere coactae.—
Die altrömische Formel des Richters, der nicht entscheiden kann, ob
Schuld oder Unschuld vorliegt, das
Non liquet
citieren wir aus Cicero pro Cluentio 28, 76 (vrgl. Gellius 14, 2. g. E.
und das liquet bei Cicero Caecin. 10; Quintilian Instit. 3, 6, 12):
Deinde homines sapientes, et ex vetere illa disciplina iudiciorum, qui
neque absolvere hominem nocentissimum possent, neque eum, de
quo esset orta suspicio, pecunia oppugnatum, re illa incognita, primo
condemnare vellent, n o n l i q u e r e dixerunt. Darauf gaben
einsichtige Männer von der alten Schule der Geschwornengerichte,
die weder solchen Verbrecher freisprechen konnten, noch ihn, gegen
Den, wie man munkelte, mit Bestechung der Richter vorgegangen
war, v o r Untersuchung dieser Sache im ersten Termin verurteilen
wollten, folgenden Spruch ab: e s i s t n i c h t a u f g e k l ä r t.—
Weil C i c e r o seine Reden gegen Antonius im Vergleich mit den
gewaltigen Reden des D e m o s t h e n e s gegen Philipp von
Macedonien Philippische nannte, so nennt man noch heute jede
Donnerrede eine
Philippika.—
Der Titel der C i c e r onischen Rede de domo sua ist in der älteren
Lesart
pro domo
für das eigene Haus
zum allgemeinen Ausdruck für jede Thätigkeit geworden, die auf
Erhaltung der eigenen Habe abzielt, und wir nennen danach eine der
Selbstverteidigung oder dem eigenen Vorteil dienende Rede eine
oratio pro domo.—
Aus C i c e r o s (De harusp. respons. 20, 43) Redewendung:
resistentem, longius, quam voluit, popularis aura provexit, Die
Volksgunst trieb den Widerstrebenden weiter, als er wollte, stammt
das später von Vergil, Horaz, Livius und Quintilian ähnlich
angewandte Wort:
aura popularis,
Hauch der Volksgunst.—
Suum cuique
(Jedem das Seine)
finden wir bei C i c e r o de offic. 1, 5; de natur. deor. 3, 15, 38;
de leg. 1, 6, 19; (vrgl. Ta c i t u s: Annalen, 4, 35, P l i n i u s:
Natur. hist. 14, 6, 8 und den ähnlichen Gedanken bei T h e o g n i s
332 u. 546).
De finibus 5, 23, 67 sagt C i c e r o: Iustitia in suo cuique tribuendo cernitur,
Die Gerechtigkeit erkennt man daran, dass sie Jedem das Seine zuerteilt; und
suum cuique tribuere ist eine Rechtsregel U l p i a n s (Corp. iur. civ. Digest. I,
1 de iustitia et iure § 10); daher es in S h a k e s p e a r e s Andronicus 1, 2
heisst: Suum cuique spricht des Römers Recht. Friedrich I. von Preussen wählte
das Suum cuique zur Inschrift vieler Medaillen und Münzen und zum Motto des
am 17. Januar 1701 gestifteten Ordens vom schwarzen Adler, und seitdem blieb es
Preussens Wahlspruch.—
Das von C i c e r o de offic. 1, 10, 33 als abgedroschenes
Sprichwort citierte
Summum ius, summa iniuria
Das höchste Recht (ist) das höchste Unrecht
scheint eine spätere Fassung des Sprichwortes in des Te r e n z
Heautontimorumenos 4, 5 zu sein:
Dicunt: ius summum saepe summa est malitia.
Man pflegt zu sagen: Das höchste Recht ist oft die höchste
Bosheit.
L u t h e r 21, 254 schreibt: Wie der Heide Terentius sagt: 'Das strengest Recht ist
das allergrossest Unrecht'. (23, 295 führt Luther das Wort auf S c i p i o zurück.)—
Aus C i c e r o s de offic. 1, 16, 52, wo es sich um allgemeine
Gefälligkeiten gegen Jedermann handelt, wie z. B. dass wir es Jedem
gestatten müssen, sich an unserem Feuer das seinige anzuzünden,
citieren rauchende Gelehrte, um Feuer bittend:
Ab igne ignem.
Vom Feuer Feuer.—
De offic. 1, 22, 77 enthält den von C i c e r o selbst verfertigten
Vers:
Cedant arma togae, concedat laurea laudi,
Es mögen die Waffen der Toga, d. h. dem Friedensgewande,
nachstehen, der Lorbeer der löblichen That,
worüber er sich in der Rede in Pisonem 29 und 30 eines Weiteren
auslässt, während er nur cedant arma togae in der 2. Philippica 8
schreibt.—
Aus de offic. 1, 31, 110 kennen wir das schon hier von C i c e r o als
Sprichwort citierte, in ad familiares 3, 1 und 12, 25 wieder
vorkommende und von H o r a z in der Kunst zu dichten, 385,
angewendete
Invita Minerva;
Wider den Willen der Minerva;
aus de offic. 3, 1, 3:
ex malis eligere minima;
von zwei Übeln das kleinere wählen;
minima de malis war nach 3, 29, 105 sprichwörtlich.—
Aus C i c e r o s de offic. 3, 33, 117 (sed aqua haeret, ut aiunt) und
aus ad Quintum fratrem 2, 8 (in hac causa mihi aqua haeret)
stammt:
Hic haeret aqua,
Hier stockt es.—
Aus C i c e r o de legibus 3, 3, 8 citieren viele:
(his) salus populi suprema lex (esto),
Für diese (nämlich für die Regierenden) sei das Wohl des
Volkes das vornehmste Gebot.—
In de finibus 2, 32, 105 führt C i c e r o als Sprichwort an:
Iucundi acti labores;
Angenehm (sind) die gethanen Arbeiten;
und er fügt hinzu, auch E u r i p i d e s sage nicht übel:
Suavis laborum est praeteritorum memoria, was in dessen
Andromeda (nach Stobaeus: Florib. 29, 57) also lautete: Ἀλλ'
ἡδύ τοι σωθέντα μεμνῆσθαι πόνων.—
Aus C i c e r o s de natur. deor. 3, 40 citieren wir:
Pro aris et focis (certamen);
(Kampf) um Altar und häuslichen Herd.—
In pro Milone 29, 79 sagt C i c e r o: Liberae sunt nostrae
cogitationes (Frei sind unsere Gedanken), und L. 18 der Digesten
48, 19 heisst es aus U l p i a n s lib. III ad Edictum: Cogitationis
poenam nemo patitur (Für seinen Gedanken wird niemand
bestraft). Das ist umgewandelt worden zu dem sprichwörtlichen:
Gedanken sind zollfrei,
was sich wohl zuerst bei L u t h e r (Von weltlicher Oberkeit, wie man
ihr Gehorsam schuldig sei. 1523) findet.—
Aus C i c e r o s pro Sestio cap. 45 stammt:
Otium cum dignitate,
Musse mit Würde,
oder, wie dort steht: cum dignitate otium. Der Sinn ist: behagliche
Ruhe, verbunden mit einer angesehenen Stellung. Auch im Anfange
der Schrift de oratore ist es zu finden und in Ciceros Briefen ad.
famil. 1, 9, 21 wird es als ein häufig von ihm angewendetes Wort
erwähnt.—
In diesen Briefen C i c e r o s ad famil. 5, 12 steht:
Epistola non erubescit,
Ein Brief errötet nicht,
häufig umgestellt in:
Literae non erubescunt,
auch in:
Charta non erubescit.—
Imperium et libertas[64]
Herrschaft und Freiheit
stammt aus C i c e r o s 4. Rede gegen Catilina, IX, 19, wo er dem
Senat zuruft: Bedenket, wie in einer Nacht die so mühsam
befestigte Herrschaft (quantis laboribus fundatum i m p e r i u m) und
die so trefflich begründete Freiheit (quanta virtute stabilitam
l i b e r t a t e m) fast zu Grunde ging! Die Rede schliesst mit der
Forderung, dass der Senat über die Herrschaft und die Freiheit
Italiens (de i m p e r i o, de l i b e r t a t e Italiae) die Entscheidung
treffen möge.—
[64] L o r d B e a c o n s f i e l d (Disraeli) sagte in einer Rede beim Lord-Mayors-
Mahl am 10. Nov. 1879: Einer der grössten Römer wurde nach seiner Politik
gefragt. Er antwortete: imperium et libertas. Die Nationalzeitung vom 28. Nov.
1879 (Morgen-Ausg.) teilte mit, dass auf ihre Anfrage bei dem Lord die Antwort
erfolgt sei, die Quelle der citierten Worte fände sich im 1. Buche von B a c o n s
Advancement of Learning. (Ausg. Spedding, Ellis und Heath, vol. III, p. 303.)
Bacon übersetzt daselbst das in des Ta c i t u s Agricola 3 vorkommende
principatum ac libertatem, wofür er imperium et libertatem schreibt, mit:
government and liberty. Dass ein nach seiner Politik gefragter grosser Römer
diese Aussage gethan habe, ist also ein Irrtum.
Ut sementem feceris, ita metes
Wie du gesäet, so wirst du ernten,
dies Wort des M. Pinarius Rufus steht bei C i c e r o de oratore, 2,
65, 261. Ihm mochte des A r i s t o t e l e s Satz (Rhetor. 3, 3)
vorschweben: σὺ δὲ ταῦτα αἰσχρῶς μὲν ἔσπειρας, κακῶς δὲ
ἐθέρισας, was du hier böse gesäet, das hast du schlimm geerntet.
(vrgl. in der Vulgata Hiob 4, 8: et seminant dolores et metunt eos,
nach Luther: Die da Mühe pflügten und Unglück säeten, ernteten
sie auch ein. Galater 6, 8: Quae enim seminaverit homo, haec et
metet, nach Luther Gal. 6, 7: Denn was der Mensch säet, das wird
er ernten, dann Sprüche Sal. 22, 8; 2. Cor. 9, 6 und Gefl. Worte a.
d. Bibel Hosea 8, 7.)—
Aus einigen Hexametern des Julius Cäsar (100-44 v. Chr.) über
Terenz, die in dessen Biographie von S u e t o n (p. 294, 35, ed. Roth)
enthalten sind, hat man vermittelst eines falsch gesetzten Kommas
die Bezeichnung
vis comica
Kraft der Komik
herausgelesen. Die betreffenden Verse heissen:
Lenibus atque utinam scriptis adiuncta foret vis,
Comica ut aequato virtus polleret honore
Cum Graecis;
Wenn sich doch Kraft dir zu deinem gefälligen Dichten
gesellte,
Dass dein Wort in der Komik die nämliche Geltung erreiche,
Wie sie die Griechen besitzen!
Es ist in ihnen daher von einer virtus comica, nicht aber von einer
vis comica die Rede. (Klein. Schrift, in latein. u. deutscher
Sprache von Fr. Aug. W o l f, herausg. von G. Bernhardy, II, p. 728).
—
Aus Lucretius (98-55 v. Chr.) Über die Natur ist 1, 102:
Tantum religio potuit suadere malorum.
Zu so verderblicher That vermochte der Glaube zu raten.—
Aus 1, 149; 1, 205; 2, 287 wird citiert:
De nihilo nihil,
Aus Nichts wird Nichts,
was P e r s i u s (Satiren 3, 84) wiederholt. L u c r e t i u s hatte seine
Ansicht aus E p i k u r entlehnt, der (nach Diog. Laërtius 10, n. 24,
38) an die Spitze seiner Physik den Grundsatz stellte: οὐδὲν γίνεται
ἐκ τοῦ μὴ ὄντος, Nichts wird aus dem Nichtseienden. Vor Epikur
hatte schon M e l i s s u s gesagt, dass aus Nichtseiendem nichts
werden kann (Ü b e r w e g Geschichte der Philosophie des
Altertums, 1, S. 63), wie auch E m p e d o k l e s die Ansicht
bekämpft, dass Etwas, was vorher nicht war, entstehen könne
(ebenda 1, S. 66). A r i s t o t e l e s (Physik 1, 4) sagt,
A n a x a g o r a s habe die übliche Ansicht der Philosophen für wahr
gehalten, dass aus dem Nichtseienden Nichts entstünde (οὐ
γινομένου οὐδενὸς ἐκ τοῦ μὴ ὄντος). In M a r k A u r e l s (121-180
n. Chr.) Selbstbetrachtungen 4, 4 heisst es: denn von Nichts
kommt Nichts, so wenig als Etwas in das Nichts übergeht.—
Aus 2, 1 und 2 ist berühmt:
Suave, mari magno, turbantibus aequora ventis,
E terra magnum alterius spectare laborem.
Bei der gewaltigsten See, bei Wogen aufwühlenden Winden
Anderer grosses Bemüh'n vom Land aus seh'n, ist behaglich.
—
From Sallust's (86-35 BC) Jugurtha 10 comes:
concordia parvae res crescunt, discordia maximae
dilabuntur.
Through concord small things grow; through discord the greatest
fall to pieces.—
From the 187th maxim of Publilius Syrus (fl. c. 50 BC):
Heredis fletus sub persona risus est,
The weeping of an heir is laughter behind a mask,
or from the so-called Varronian sentences (12): sic flet heres, ut
puella nupta viro; utriusque fletus non apparens risus, An heir weeps as a
newly wedded bride does; the weeping of both is concealed laughter (cf. also Horace, Sat. 2, 5, 100-104),
the phrase
lachende Erben (laughing heirs)
appears to derive. As early as 1622 a "Lacherbengeld" (laughing-heirs' fee) occurs in Baden (cf. Rau, Grundsätze der Finanzwissenschaft, 5th edition 1864, § 237, p. 371, note a), and Friedrich von Logau writes (Salomons von Golau Deutscher
Sinn-Getichte Drey Tausend, Breslau, published by Caspar
Klossmann, 1654, though issued without a year; second supplement to the third thousand, "arrived while printing was under way",
nos. 78 and 79):
Laughing Heirs.
When heirs of wealthy people let their eyes grow teary,
such people's tears are only tears born of their laughter.
* * *
The Romans hired women to do their weeping for money;
with many an heir, is it not much the same story?
Then it is said in Otho's Evangelischer Krankentrost (1664), p. 1034: Rejoice, dear heart; mourn, black hat, is the word among laughing heirs.—
The 245th maxim of Publilius Syrus:
Inopi beneficium bis dat qui dat celeriter,
He gives the poor man a double benefit who gives quickly,
is shortened to:
Bis dat qui cito dat,
He gives twice who gives promptly.—
Vergil (70-19 BC) offers, in Eclogues 1, 6, the words of the comfortably reclining shepherd
Tityrus, sometimes used as a house inscription:
Deus nobis haec otia fecit,
A god has granted us this leisure.
Ecl. 2, 1:
Formosum pastor Corydon ardebat Alexin,
Corydon, the shepherd, burned for the fair form of Alexis,
is known chiefly through the garbled translation:
"Der Pastor Corydon briet einen wunderschönen Hering"
(Pastor Corydon fried a beautiful herring),
which Christian Weise mentions in his preface, dated 27 Sept. 1692,
to Zincgref's Apophthegmata (Frankfurt and
Leipzig, 1693).
In Ecl. 2, 65 Corydon says of his love:
Trahit sua quemque voluptas.
Each is drawn on by his own delight.
In Ecl. 3, 93 Damoetas warns the boys picking flowers and strawberries:
Latet anguis in herba,
A snake lies hidden in the grass
(cf. Georgica 4, 457-459).—In Ecl. 3, 104 Damoetas challenges
Menalcas to tell him in what region the sky is only three
fathoms wide, adding that if you can answer this,
eris mihi magnus Apollo,
then you will be to me as great as Apollo.
Hence questions whose answer one does not expect are customarily
accompanied by this saying.—
Ecl. 3, 108 runs:
Non nostrum tantas componere lites,
It is not our office to settle such disputes;
Ecl. 3, 111:
Claudite iam rivos, pueri; sat prata biberunt.
Close the channels now, lads! The meadows have drunk
their fill.
Data Mining In Time Series Databases Mark Last Abraham Kandel

  • 6. DATA MINING IN TIME SERIES DATABASES
  • 7. SERIES IN MACHINE PERCEPTION AND ARTIFICIAL INTELLIGENCE*
Editors: H. Bunke (Univ. Bern, Switzerland), P. S. P. Wang (Northeastern Univ., USA)
Vol. 43: Agent Engineering (Eds. Jiming Liu, Ning Zhong, Yuan Y. Tang and Patrick S. P. Wang)
Vol. 44: Multispectral Image Processing and Pattern Recognition (Eds. J. Shen, P. S. P. Wang and T. Zhang)
Vol. 45: Hidden Markov Models: Applications in Computer Vision (Eds. H. Bunke and T. Caelli)
Vol. 46: Syntactic Pattern Recognition for Seismic Oil Exploration (K. Y. Huang)
Vol. 47: Hybrid Methods in Pattern Recognition (Eds. H. Bunke and A. Kandel)
Vol. 48: Multimodal Interface for Human-Machine Communications (Eds. P. C. Yuen, Y. Y. Tang and P. S. P. Wang)
Vol. 49: Neural Networks and Systolic Array Design (Eds. D. Zhang and S. K. Pal)
Vol. 50: Empirical Evaluation Methods in Computer Vision (Eds. H. I. Christensen and P. J. Phillips)
Vol. 51: Automatic Diatom Identification (Eds. H. du Buf and M. M. Bayer)
Vol. 52: Advances in Image Processing and Understanding: A Festschrift for Thomas S. Huang (Eds. A. C. Bovik, C. W. Chen and D. Goldgof)
Vol. 53: Soft Computing Approach to Pattern Recognition and Image Processing (Eds. A. Ghosh and S. K. Pal)
Vol. 54: Fundamentals of Robotics: Linking Perception to Action (M. Xie)
Vol. 55: Web Document Analysis: Challenges and Opportunities (Eds. A. Antonacopoulos and J. Hu)
Vol. 56: Artificial Intelligence Methods in Software Testing (Eds. M. Last, A. Kandel and H. Bunke)
Vol. 57: Data Mining in Time Series Databases (Eds. M. Last, A. Kandel and H. Bunke)
Vol. 58: Computational Web Intelligence: Intelligent Technology for Web Applications (Eds. Y. Zhang, A. Kandel, T. Y. Lin and Y. Yao)
Vol. 59: Fuzzy Neural Network Theory and Application (P. Liu and H. Li)
*For the complete list of titles in this series, please write to the Publisher.
  • 8. Series in Machine Perception and Artificial Intelligence, Vol. 57
DATA MINING IN TIME SERIES DATABASES
Editors
Mark Last (Ben-Gurion University of the Negev, Israel)
Abraham Kandel (Tel-Aviv University, Israel, and University of South Florida, Tampa, USA)
Horst Bunke (University of Bern, Switzerland)
World Scientific
New Jersey, London, Singapore, Beijing, Shanghai, Hong Kong, Taipei, Chennai
  • 9. British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher. ISBN 981-238-290-9 Typeset by Stallion Press Email: enquiries@stallionpress.com All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher. Copyright © 2004 by World Scientific Publishing Co. Pte. Ltd. Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE Printed in Singapore by World Scientific Printers (S) Pte Ltd DATA MINING IN TIME SERIES DATABASES Series in Machine Perception and Artificial Intelligence (Vol. 57)
  • 10. Dedicated to The Honorable Congressman C. W. Bill Young House of Representatives For his vision and continuous support in creating the National Institute for Systems Test and Productivity at the Computer Science and Engineering Department, University of South Florida
  • 12. Preface
Traditional data mining methods are designed to deal with "static" databases, i.e. databases where the ordering of records (or other database objects) has nothing to do with the patterns of interest. Though the assumption of order irrelevance may be sufficiently accurate in some applications, there are certainly many other cases, where sequential information, such as a time-stamp associated with every record, can significantly enhance our knowledge about the mined data. One example is a series of stock values: a specific closing price recorded yesterday has a completely different meaning than the same value a year ago. Since most of today's databases already include temporal data in the form of "date created", "date modified", and other time-related fields, the only problem is how to exploit this valuable information to our benefit. In other words, the question we are currently facing is: How to mine time series data?
The purpose of this volume is to present some recent advances in pre-processing, mining, and interpretation of temporal data that is stored by modern information systems. Adding the time dimension to a database produces a Time Series Database (TSDB) and introduces new aspects and challenges to the tasks of data mining and knowledge discovery. These new challenges include: finding the most efficient representation of time series data, measuring similarity of time series, detecting change points in time series, and time series classification and clustering. Some of these problems have been treated in the past by experts in time series analysis. However, statistical methods of time series analysis are focused on sequences of values representing a single numeric variable (e.g., price of a specific stock). In a real-world database, a time-stamped record may include several numerical and nominal attributes, which may depend not only on the time dimension but also on each other.
To make the data mining task even more complicated, the objects in a time series may represent some complex graph structures rather than vectors of feature-values.
  • 13. Our book covers the state-of-the-art research in several areas of time series data mining. Specific problems challenged by the authors of this volume are as follows.
Representation of Time Series. Efficient and effective representation of time series is a key to successful discovery of time-related patterns. The most frequently used representation of single-variable time series is piecewise linear approximation, where the original points are reduced to a set of straight lines ("segments"). Chapter 1 by Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani provides an extensive and comparative overview of existing techniques for time series segmentation. In the view of shortcomings of existing approaches, the same chapter introduces an improved segmentation algorithm called SWAB (Sliding Window and Bottom-up).
Indexing and Retrieval of Time Series. Since each time series is characterized by a large, potentially unlimited number of points, finding two identical time series for any phenomenon is hopeless. Thus, researchers have been looking for sets of similar data sequences that differ only slightly from each other. The problem of retrieving similar series arises in many areas such as marketing and stock data analysis, meteorological studies, and medical diagnosis. An overview of current methods for efficient retrieval of time series is presented in Chapter 2 by Magnus Lie Hetland. Chapter 3 (by Eugene Fink and Kevin B. Pratt) presents a new method for fast compression and indexing of time series. A robust similarity measure for retrieval of noisy time series is described and evaluated by Michail Vlachos, Dimitrios Gunopulos, and Gautam Das in Chapter 4.
Change Detection in Time Series. The problem of change point detection in a sequence of values has been studied in the past, especially in the context of time series segmentation (see above).
However, the nature of real-world time series may be much more complex, involving multivariate and even graph data. Chapter 5 (by Gil Zeira, Oded Maimon, Mark Last, and Lior Rokach) covers the problem of change detection in a classification model induced by a data mining algorithm from time series data. A change detection procedure for detecting abnormal events in time series of graphs is presented by Horst Bunke and Miro Kraetzl in Chapter 6. The procedure is applied to abnormal event detection in a computer network. Classification of Time Series. Rather than partitioning a time series into segments, one can see each time series, or any other sequence of data points, as a single object. Classification and clustering of such complex
  • 14. "objects" may be particularly beneficial for the areas of process control, intrusion detection, and character recognition. In Chapter 7, Carlos J. Alonso González and Juan J. Rodríguez Diez present a new method for early classification of multivariate time series. Their method is capable of learning from series of variable length and of providing a classification when only part of the series is presented to the classifier. A novel concept of representing time series by median strings (see Chapter 8, by Xiaoyi Jiang, Horst Bunke, and Janos Csirik) opens new opportunities for applying classification and clustering methods of data mining to sequential data.
As indicated above, the area of mining time series databases still includes many unexplored and insufficiently explored issues. Specific suggestions for future research can be found in individual chapters. In general, we believe that interesting and useful results can be obtained by applying the methods described in this book to real-world sets of sequential data.
Acknowledgments
The preparation of this volume was partially supported by the National Institute for Systems Test and Productivity at the University of South Florida under U.S. Space and Naval Warfare Systems Command grant number N00039-01-1-2248. We also would like to acknowledge the generous support and cooperation of: Ben-Gurion University of the Negev, Department of Information Systems Engineering; University of South Florida, Department of Computer Science and Engineering; Tel-Aviv University, College of Engineering; The Fulbright Foundation; The US-Israel Educational Foundation.
January 2004
Mark Last
Abraham Kandel
Horst Bunke
  • 16. Contents
Preface . . . . . . . vii
Chapter 1 Segmenting Time Series: A Survey and Novel Approach . . . . . . . 1
E. Keogh, S. Chu, D. Hart and M. Pazzani
Chapter 2 A Survey of Recent Methods for Efficient Retrieval of Similar Time Sequences . . . . . . . 23
M. L. Hetland
Chapter 3 Indexing of Compressed Time Series . . . . . . . 43
E. Fink and K. B. Pratt
Chapter 4 Indexing Time-Series under Conditions of Noise . . . . . . . 67
M. Vlachos, D. Gunopulos and G. Das
Chapter 5 Change Detection in Classification Models Induced from Time Series Data . . . . . . . 101
G. Zeira, O. Maimon, M. Last and L. Rokach
Chapter 6 Classification and Detection of Abnormal Events in Time Series of Graphs . . . . . . . 127
H. Bunke and M. Kraetzl
Chapter 7 Boosting Interval-Based Literals: Variable Length and Early Classification . . . . . . . 149
C. J. Alonso González and J. J. Rodríguez Diez
Chapter 8 Median Strings: A Review . . . . . . . 173
X. Jiang, H. Bunke and J. Csirik
  • 18. CHAPTER 1
SEGMENTING TIME SERIES: A SURVEY AND NOVEL APPROACH
Eamonn Keogh
Computer Science & Engineering Department, University of California, Riverside, Riverside, California 92521, USA
E-mail: eamonn@cs.ucr.edu
Selina Chu, David Hart, and Michael Pazzani
Department of Information and Computer Science, University of California, Irvine, California 92697, USA
E-mail: {selina, dhart, pazzani}@ics.uci.edu
In recent years, there has been an explosion of interest in mining time series databases. As with most computer science problems, representation of the data is the key to efficient and effective solutions. One of the most commonly used representations is piecewise linear approximation. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain this representation, with several algorithms having been independently rediscovered several times. In this chapter, we undertake the first extensive review and empirical comparison of all proposed techniques. We show that all these algorithms have fatal flaws from a data mining perspective. We introduce a novel algorithm that we empirically show to be superior to all others in the literature.
Keywords: Time series; data mining; piecewise linear approximation; segmentation; regression.
1. Introduction
In recent years, there has been an explosion of interest in mining time series databases. As with most computer science problems, representation of the data is the key to efficient and effective solutions. Several high level
  • 19. Fig. 1. Two time series and their piecewise linear representation. (a) Space Shuttle Telemetry. (b) Electrocardiogram (ECG).
representations of time series have been proposed, including Fourier Transforms [Agrawal et al. (1993), Keogh et al. (2000)], Wavelets [Chan and Fu (1999)], Symbolic Mappings [Agrawal et al. (1995), Das et al. (1998), Perng et al. (2000)] and Piecewise Linear Representation (PLR). In this work, we confine our attention to PLR, perhaps the most frequently used representation [Ge and Smyth (2001), Last et al. (2001), Hunter and McIntosh (1999), Koski et al. (1995), Keogh and Pazzani (1998), Keogh and Pazzani (1999), Keogh and Smyth (1997), Lavrenko et al. (2000), Li et al. (1998), Osaki et al. (1999), Park et al. (2001), Park et al. (1999), Qu et al. (1998), Shatkay (1995), Shatkay and Zdonik (1996), Vullings et al. (1997), Wang and Wang (2000)].
Intuitively, Piecewise Linear Representation refers to the approximation of a time series T, of length n, with K straight lines (hereafter known as segments). Figure 1 contains two examples. Because K is typically much smaller than n, this representation makes the storage, transmission and computation of the data more efficient. Specifically, in the context of data mining, the piecewise linear representation has been used to:
• Support fast exact similarity search [Keogh et al. (2000)].
• Support novel distance measures for time series, including "fuzzy queries" [Shatkay (1995), Shatkay and Zdonik (1996)], weighted queries [Keogh and Pazzani (1998)], multiresolution queries [Wang and Wang (2000), Li et al. (1998)], dynamic time warping [Park et al. (1999)] and relevance feedback [Keogh and Pazzani (1999)].
• Support concurrent mining of text and time series [Lavrenko et al. (2000)].
• Support novel clustering and classification algorithms [Keogh and Pazzani (1998)].
• Support change point detection [Sugiura and Ogden (1994), Ge and Smyth (2001)].
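To make the representation concrete: a K-segment PLR can be stored as a handful of (left index, right index, slope, intercept) tuples, which is where the storage and transmission savings come from when K is much smaller than n. A minimal sketch (our own illustration, not from the chapter; `plr` and `reconstruct` are hypothetical names), using linear interpolation between segment endpoints:

```python
def plr(series, breakpoints):
    """Build a piecewise linear representation from given breakpoints.

    Each subsequence T[a:b] is replaced by the straight line joining
    (a, series[a]) and (b, series[b]) -- linear interpolation.
    Returns a list of (a, b, slope, intercept) tuples.
    """
    segments = []
    for a, b in zip(breakpoints[:-1], breakpoints[1:]):
        slope = (series[b] - series[a]) / (b - a)
        segments.append((a, b, slope, series[a] - slope * a))
    return segments

def reconstruct(segments, n):
    """Evaluate the approximation at every index 0..n-1."""
    approx = [0.0] * n
    for a, b, slope, intercept in segments:
        for i in range(a, b + 1):
            approx[i] = slope * i + intercept
    return approx

# A 9-point series reduced to K = 2 segments (breakpoints 0, 4, 8):
T = [0, 1, 2, 3, 4, 3, 2, 1, 0]
segs = plr(T, [0, 4, 8])
print(segs)                        # [(0, 4, 1.0, 0.0), (4, 8, -1.0, 8.0)]
print(reconstruct(segs, len(T)))   # exactly recovers this piecewise-linear T
```

Here 9 stored values shrink to 2 tuples; for a series that is only approximately piecewise linear, `reconstruct` would return an approximation rather than the exact values.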
  • 20. Surprisingly, in spite of the ubiquity of this representation, with the exception of [Shatkay (1995)], there has been little attempt to understand and compare the algorithms that produce it. Indeed, there does not even appear to be a consensus on what to call such an algorithm. For clarity, we will refer to these types of algorithm, which input a time series and return a piecewise linear representation, as segmentation algorithms.
The segmentation problem can be framed in several ways.
• Given a time series T, produce the best representation using only K segments.
• Given a time series T, produce the best representation such that the maximum error for any segment does not exceed some user-specified threshold, max error.
• Given a time series T, produce the best representation such that the combined error of all segments is less than some user-specified threshold, total max error.
As we shall see in later sections, not all algorithms can support all these specifications. Segmentation algorithms can also be classified as batch or online. This is an important distinction because many data mining problems are inherently dynamic [Vullings et al. (1997), Koski et al. (1995)].
Data mining researchers, who needed to produce a piecewise linear approximation, have typically either independently rediscovered an algorithm or used an approach suggested in related literature. For example, from the fields of cartography or computer graphics [Douglas and Peucker (1973), Heckbert and Garland (1997), Ramer (1972)].
In this chapter, we review the three major segmentation approaches in the literature and provide an extensive empirical evaluation on a very heterogeneous collection of datasets from finance, medicine, manufacturing and science.
The major result of these experiments is that the only online algorithm in the literature produces very poor approximations of the data, and that the only algorithm that consistently produces high quality results and scales linearly in the size of the data is a batch algorithm. These results motivated us to introduce a new algorithm that scales linearly in the size of the data set, is online, and produces high quality approximations.
The rest of the chapter is organized as follows. In Section 2, we provide an extensive review of the algorithms in the literature. We explain the basic approaches, and the various modifications and extensions by data miners. In Section 3, we provide a detailed empirical comparison of all the algorithms.
  • 21. We will show that the most popular algorithms used by data miners can in fact produce very poor approximations of the data. The results will be used to motivate the need for a new algorithm that we will introduce and validate in Section 4. Section 5 offers conclusions and directions for future work.
2. Background and Related Work
In this section, we describe the three major approaches to time series segmentation in detail. Almost all the algorithms have 2 and 3 dimensional analogues, which ironically seem to be better understood. A discussion of the higher dimensional cases is beyond the scope of this chapter. We refer the interested reader to [Heckbert and Garland (1997)], which contains an excellent survey.
Although appearing under different names and with slightly different implementation details, most time series segmentation algorithms can be grouped into one of the following three categories:
• Sliding Windows: A segment is grown until it exceeds some error bound. The process repeats with the next data point not included in the newly approximated segment.
• Top-Down: The time series is recursively partitioned until some stopping criteria is met.
• Bottom-Up: Starting from the finest possible approximation, segments are merged until some stopping criteria is met.
Table 1 contains the notation used in this chapter.
Table 1. Notation.
T: A time series in the form t1, t2, . . . , tn
T[a : b]: The subsection of T from a to b, ta, ta+1, . . . , tb
Seg TS: A piecewise linear approximation of a time series of length n with K segments. Individual segments can be addressed with Seg TS(i).
create segment(T): A function that takes in a time series and returns a linear segment approximation of it.
calculate error(T): A function that takes in a time series and returns the approximation error of the linear segment approximation of it.
Given that we are going to approximate a time series with straight lines, there are at least two ways we can find the approximating line.
  • 22. • Linear Interpolation: Here the approximating line for the subsequence T[a : b] is simply the line connecting ta and tb. This can be obtained in constant time.
• Linear Regression: Here the approximating line for the subsequence T[a : b] is taken to be the best fitting line in the least squares sense [Shatkay (1995)]. This can be obtained in time linear in the length of segment.
The two techniques are illustrated in Figure 2. Linear interpolation tends to closely align the endpoint of consecutive segments, giving the piecewise approximation a "smooth" look. In contrast, piecewise linear regression can produce a very disjointed look on some datasets. The aesthetic superiority of linear interpolation, together with its low computational complexity, has made it the technique of choice in computer graphic applications [Heckbert and Garland (1997)]. However, the quality of the approximating line, in terms of Euclidean distance, is generally inferior to the regression approach.
In this chapter, we deliberately keep our descriptions of algorithms at a high level, so that either technique can be imagined as the approximation technique. In particular, the pseudocode function create segment(T) can be imagined as using interpolation, regression or any other technique.
All segmentation algorithms also need some method to evaluate the quality of fit for a potential segment. A measure commonly used in conjunction with linear regression is the sum of squares, or the residual error. This is calculated by taking all the vertical differences between the best-fit line and the actual data points, squaring them and then summing them together. Another commonly used measure of goodness of fit is the distance between the best fit line and the data point furthest away in the vertical direction
Fig. 2. Two 10-segment approximations of electrocardiogram data. (Left: Linear Interpolation; right: Linear Regression.)
The approximation created using linear interpolation has a smooth aesthetically appealing appearance because all the endpoints of the segments are aligned. Linear regression, in contrast, produces a slightly disjointed appearance but a tighter approximation in terms of residual error.
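The contrast between the two fitting techniques can be reproduced numerically. In the sketch below (our illustration; `interp_fit`, `regress_fit` and `residual_error` are hypothetical helper names), both techniques fit the same subsequence, and the regression line is never worse in the residual-error sense, since it minimizes that error by construction:

```python
def interp_fit(ys):
    """Linear interpolation: line through the first and last point, O(1)."""
    slope = (ys[-1] - ys[0]) / (len(ys) - 1)
    return slope, ys[0]  # slope, intercept (x measured from 0)

def regress_fit(ys):
    """Least-squares regression line, O(n) in the segment length."""
    n = len(ys)
    mx = (n - 1) / 2.0
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in zip(range(n), ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def residual_error(ys, fit):
    """Sum of squared vertical distances from the fitted line."""
    slope, intercept = fit(ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(ys))

noisy = [0.0, 1.2, 1.9, 3.3, 3.8, 5.1, 6.0]
# Regression is never worse than interpolation in the least-squares sense:
print(residual_error(noisy, interp_fit) >= residual_error(noisy, regress_fit))  # True
```

Interpolation, on the other hand, needs only the two endpoints, which is why it is the constant-time choice favored in graphics applications.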
  • 23. (i.e. the L∞ norm between the line and the data). As before, we have kept our descriptions of the algorithms general enough to encompass any error measure. In particular, the pseudocode function calculate error(T) can be imagined as using any sum of squares, furthest point, or any other measure.
2.1. The Sliding Window Algorithm
The Sliding Window algorithm works by anchoring the left point of a potential segment at the first data point of a time series, then attempting to approximate the data to the right with increasingly longer segments. At some point i, the error for the potential segment is greater than the user-specified threshold, so the subsequence from the anchor to i − 1 is transformed into a segment. The anchor is moved to location i, and the process repeats until the entire time series has been transformed into a piecewise linear approximation. The pseudocode for the algorithm is shown in Table 2.
The Sliding Window algorithm is attractive because of its great simplicity, intuitiveness and particularly the fact that it is an online algorithm. Several variations and optimizations of the basic algorithm have been proposed. Koski et al. noted that on ECG data it is possible to speed up the algorithm by incrementing the variable i by "leaps of length k" instead of 1. For k = 15 (at 400 Hz), the algorithm is 15 times faster with little effect on the output accuracy [Koski et al. (1995)].
Depending on the error measure used, there may be other optimizations possible. Vullings et al. noted that since the residual error is monotonically non-decreasing with the addition of more data points, one does not have to test every value of i from 2 to the final chosen value [Vullings et al. (1997)]. They suggest initially setting i to s, where s is the mean length of the previous segments. If the guess was pessimistic (the measured error
Table 2. The generic Sliding Window algorithm.
Algorithm Seg TS = Sliding Window(T, max error)
    anchor = 1;
    while not finished segmenting time series
        i = 2;
        while calculate error(T[anchor: anchor + i]) < max error
            i = i + 1;
        end;
        Seg TS = concat(Seg TS, create segment(T[anchor: anchor + (i - 1)]));
        anchor = anchor + i;
    end;
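A runnable transliteration of Table 2 may help (a minimal sketch, our own: the chapter leaves `create segment` and `calculate error` pluggable, and here we choose a least-squares fit with residual error, and return segments as index pairs rather than fitted lines):

```python
def calculate_error(ys):
    """Residual sum of squares of the least-squares line through the window."""
    n = len(ys)
    if n < 3:
        return 0.0  # one or two points are always fit exactly
    mx, my = (n - 1) / 2.0, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    slope = sum((x - mx) * (ys[x] - my) for x in range(n)) / sxx
    intercept = my - slope * mx
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(ys))

def sliding_window(T, max_error):
    """Table 2, roughly: grow a segment from the anchor until the error
    bound is exceeded, emit it, move the anchor, repeat (online).
    Segments are returned as (first_index, last_index) pairs; a final
    lone point, if any, is left unsegmented in this sketch."""
    segments, anchor = [], 0
    while anchor < len(T) - 1:
        i = 2
        while (anchor + i <= len(T)
               and calculate_error(T[anchor:anchor + i]) < max_error):
            i += 1
        segments.append((anchor, anchor + i - 2))  # last window that still fit
        anchor += i - 1
    return segments

T = [0, 1, 2, 3, 10, 11, 12, 13]  # two clean linear runs with a level jump
print(sliding_window(T, max_error=0.1))  # [(0, 3), (4, 7)]
```

Because this residual error is monotonically non-decreasing in the window length, the inner linear scan over i could be replaced by a binary search, in line with the observation the chapter makes about the Vullings et al. optimization.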
  • 24. is still less than max error) then the algorithm continues to increment i as in the classic algorithm. Otherwise they begin to decrement i until the measured error is less than max error. This optimization can greatly speed up the algorithm if the mean length of segments is large in relation to the standard deviation of their length. The monotonically non-decreasing property of residual error also allows binary search for the length of the segment. Surprisingly, no one we are aware of has suggested this.
The Sliding Window algorithm can give pathologically poor results under some circumstances, particularly if the time series in question contains abrupt level changes. Most researchers have not reported this [Qu et al. (1998), Wang and Wang (2000)], perhaps because they tested the algorithm on stock market data, and its relative performance is best on noisy data. Shatkay (1995), in contrast, does notice the problem and gives elegant examples and explanations [Shatkay (1995)]. They consider three variants of the basic algorithm, each designed to be robust to a certain case, but they underline the difficulty of producing a single variant of the algorithm that is robust to arbitrary data sources.
Park et al. (2001) suggested modifying the algorithm to create "monotonically changing" segments [Park et al. (2001)]. That is, all segments consist of data points of the form of t1 ≤ t2 ≤ · · · ≤ tn or t1 ≥ t2 ≥ · · · ≥ tn. This modification worked well on the smooth synthetic dataset it was demonstrated on. But on real world datasets with any amount of noise, the approximation is greatly overfragmented.
Variations on the Sliding Window algorithm are particularly popular with the medical community (where it is known as FAN or SAPA), since patient monitoring is inherently an online task [Ishijima et al. (1983), Koski et al. (1995), McKee et al. (1994), Vullings et al. (1997)].
2.2.
The Top-Down Algorithm

The Top-Down algorithm works by considering every possible partitioning of the time series and splitting it at the best location. Both subsections are then tested to see if their approximation error is below some user-specified threshold. If not, the algorithm recursively continues to split the subsequences until all the segments have approximation errors below the threshold. The pseudocode for the algorithm is shown in Table 3. Variations on the Top-Down algorithm (including the 2-dimensional case) were independently introduced in several fields in the early 1970s. In cartography, it is known as the Douglas-Peucker algorithm [Douglas and
E. Keogh, S. Chu, D. Hart and M. Pazzani

Table 3. The generic Top-Down algorithm.

Algorithm Seg TS = Top Down(T, max error)
    best so far = inf;
    for i = 2 to length(T) - 2               // Find the best splitting point.
        improvement in approximation = improvement splitting here(T, i);
        if improvement in approximation < best so far
            breakpoint = i;
            best so far = improvement in approximation;
        end;
    end;
    // Recursively split the left segment if necessary.
    if calculate error(T[1: breakpoint]) > max error
        Seg TS = Top Down(T[1: breakpoint]);
    end;
    // Recursively split the right segment if necessary.
    if calculate error(T[breakpoint + 1: length(T)]) > max error
        Seg TS = Top Down(T[breakpoint + 1: length(T)]);
    end;

Peucker (1973)]; in image processing, it is known as Ramer's algorithm [Ramer (1972)]. Most researchers in the machine learning/data mining community are introduced to the algorithm in the classic textbook by Duda and Hart, which calls it "Iterative End-Points Fits" [Duda and Hart (1973)]. In the data mining community, the algorithm has been used by [Li et al. (1998)] to support a framework for mining sequence databases at multiple abstraction levels. Shatkay and Zdonik use it (after considering alternatives such as Sliding Windows) to support approximate queries in time series databases [Shatkay and Zdonik (1996)]. Park et al. introduced a modification where they first perform a scan over the entire dataset, marking every peak and valley [Park et al. (1999)]. These extreme points are used to create an initial segmentation, and the Top-Down algorithm is applied to each of the segments (in case the error on an individual segment was still too high). They then use the segmentation to support a special case of dynamic time warping. This modification worked well on the smooth synthetic dataset on which it was demonstrated.
But on real-world datasets with any amount of noise, the approximation is greatly over-fragmented. Lavrenko et al. use the Top-Down algorithm to support the concurrent mining of text and time series [Lavrenko et al. (2000)]. They attempt to discover the influence of news stories on financial markets. Their algorithm contains some interesting modifications, including a novel stopping criterion based on the t-test.
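A compact Python sketch of the Top-Down recursion, again with a least-squares residual as the error measure. The helper `seg_error` and the list-of-lists return convention are our own, and this naive version recomputes errors from scratch at every candidate split rather than reusing partial sums.

```python
def seg_error(ys):
    """Sum-of-squares residual of the least-squares line through ys."""
    n = len(ys)
    mx = (n - 1) / 2.0
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - (my + slope * (x - mx))) ** 2 for x, y in enumerate(ys))

def top_down(T, max_error):
    """Recursively split T at the point that minimizes total residual error."""
    if seg_error(T) <= max_error or len(T) <= 3:
        return [list(T)]
    # Try every split that leaves at least 2 points on each side.
    best_i = min(range(2, len(T) - 1),
                 key=lambda i: seg_error(T[:i]) + seg_error(T[i:]))
    return top_down(T[:best_i], max_error) + top_down(T[best_i:], max_error)

# Example: two perfect lines separated by a level shift split cleanly.
# top_down([0, 1, 2, 3, 10, 11, 12, 13], 0.1) -> [[0, 1, 2, 3], [10, 11, 12, 13]]
```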
Finally, Smyth and Ge use the algorithm to produce a representation that can support a Hidden Markov Model approach to both change point detection and pattern matching [Ge and Smyth (2001)].

2.3. The Bottom-Up Algorithm

The Bottom-Up algorithm is the natural complement to the Top-Down algorithm. The algorithm begins by creating the finest possible approximation of the time series, so that n/2 segments are used to approximate the n-length time series. Next, the cost of merging each pair of adjacent segments is calculated, and the algorithm iteratively merges the lowest-cost pair until a stopping criterion is met. When the pair of adjacent segments i and i + 1 are merged, the algorithm needs to perform some bookkeeping. First, the cost of merging the new segment with its right neighbor must be calculated. In addition, the cost of merging segment i − 1 with its new, larger neighbor must be recalculated. The pseudocode for the algorithm is shown in Table 4. Two- and three-dimensional analogues of this algorithm are common in the field of computer graphics, where they are called decimation methods [Heckbert and Garland (1997)]. In data mining, the algorithm has been used extensively by two of the current authors to support a variety of time series data mining tasks [Keogh and Pazzani (1999), Keogh and Pazzani (1998), Keogh and Smyth (1997)]. In medicine, the algorithm was used by Hunter and McIntosh to provide the high-level representation for their medical pattern matching system [Hunter and McIntosh (1999)].

Table 4. The generic Bottom-Up algorithm.

Algorithm Seg TS = Bottom Up(T, max error)
    for i = 1 : 2 : length(T)                // Create initial fine approximation.
        Seg TS = concat(Seg TS, create segment(T[i: i + 1]));
    end;
    for i = 1 : length(Seg TS) - 1           // Find merging costs.
        merge cost(i) = calculate error([merge(Seg TS(i), Seg TS(i + 1))]);
    end;
    while min(merge cost) < max error        // While not finished.
        p = min(merge cost);                 // Find "cheapest" pair to merge.
        Seg TS(p) = merge(Seg TS(p), Seg TS(p + 1));    // Merge them.
        delete(Seg TS(p + 1));               // Update records.
        merge cost(p) = calculate error(merge(Seg TS(p), Seg TS(p + 1)));
        merge cost(p - 1) = calculate error(merge(Seg TS(p - 1), Seg TS(p)));
    end;
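The following is a straightforward, deliberately unoptimized Python sketch of Bottom-Up. Unlike the heap-based bookkeeping described later in the chapter, it simply recomputes all merge costs on every pass, which costs O(n) per merge rather than O(log n), but it produces the same segmentations. The names `seg_error` and `bottom_up` are our own.

```python
def seg_error(ys):
    """Sum-of-squares residual of the least-squares line through ys."""
    n = len(ys)
    mx = (n - 1) / 2.0
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - (my + slope * (x - mx))) ** 2 for x, y in enumerate(ys))

def bottom_up(T, max_error):
    """Start from length-2 segments; greedily merge the cheapest adjacent pair."""
    segs = [list(T[i:i + 2]) for i in range(0, len(T), 2)]
    while len(segs) > 1:
        costs = [seg_error(segs[i] + segs[i + 1]) for i in range(len(segs) - 1)]
        p = min(range(len(costs)), key=costs.__getitem__)   # cheapest pair
        if costs[p] > max_error:
            break                                           # stopping criterion
        segs[p] = segs[p] + segs.pop(p + 1)                 # merge p and p + 1
    return segs

# Example: two perfect lines separated by a level shift.
# bottom_up([0, 1, 2, 3, 10, 11, 12, 13], 0.1) -> [[0, 1, 2, 3], [10, 11, 12, 13]]
```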
2.4. Feature Comparison of the Major Algorithms

We have deliberately deferred the discussion of the running times of the algorithms until now, when the reader's intuition for the various approaches is more developed. The running time for each approach is data dependent. For that reason, we discuss both a worst-case time that gives an upper bound and a best-case time that gives a lower bound for each approach. We use the standard notation of Ω(f(n)) for a lower bound, O(f(n)) for an upper bound, and θ(f(n)) for a function that is both a lower and an upper bound.

Definitions and Assumptions. The number of data points is n, the number of segments we plan to create is K, and thus the average segment length is L = n/K. The actual length of segments created by an algorithm varies, and we will refer to these lengths as Li. All algorithms, except Top-Down, perform considerably worse if we allow any of the Li to become very large (say n/4), so we assume that the algorithms limit the maximum segment length to some multiple of the average length L. It is trivial to code the algorithms to enforce this, so the time analysis that follows is exact when the algorithm includes this limit. Empirical results show, however, that the segments generated (with no limit on length) are tightly clustered around the average length, so this limit has little effect in practice. We assume that for each set S of points, we compute a best segment and its error in θ(n) time. This reflects the way these algorithms are coded in practice, which is to use a packaged algorithm or function for linear regression. We note, however, that we believe one can produce asymptotically faster algorithms if one custom codes linear regression (or other best-fit algorithms) to reuse computed values so that the computation is done in less than O(n) time in subsequent steps. We leave that as a topic for future work.
In what follows, all computations of best segment and error are assumed to be θ(n).

Top-Down. The best time for Top-Down occurs if each split occurs at the midpoint of the data. The first iteration computes, for each split point i, the best line for points [1, i] and for points [i + 1, n]. This takes θ(n) for each split point, or θ(n²) total over all split points. The next iteration finds split points for [1, n/2] and for [n/2 + 1, n]. This gives a recurrence T(n) = 2T(n/2) + θ(n²), with T(2) = c, which solves to T(n) = Ω(n²). This is a lower bound because we assumed the data has the best possible split points.
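Unrolling the recurrence with a standard recursion-tree argument makes the solution explicit (level k of the tree has 2^k subproblems of size n/2^k):

```latex
T(n) = 2\,T(n/2) + \Theta(n^2)
     = \sum_{k=0}^{\log_2 n - 1} 2^k\, \Theta\!\bigl((n/2^k)^2\bigr)
     = \Theta(n^2) \sum_{k=0}^{\log_2 n - 1} 2^{-k}
     = \Theta(n^2),
```

since the geometric series is bounded by 2. The Ω(n²) quoted in the text then follows as the lower bound for the algorithm under the best-split assumption.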
The worst time occurs if the computed split point is always at one side (leaving just 2 points on one side), rather than at the middle. The recurrence is T(n) = T(n − 2) + θ(n²). We must stop after K iterations, giving a time of O(n²K).

Sliding Windows. For this algorithm, we compute best segments for larger and larger windows, going from 2 up to at most cL (by the assumption discussed above). The maximum time to compute a single segment is Σ(i=2 to cL) θ(i) = θ(L²). The number of segments can be as few as n/cL = K/c or as many as K. The time is thus θ(L²K) or θ(Ln). This is both a best-case and a worst-case bound.

Bottom-Up. The first iteration computes the segment through each pair of points and the costs of merging adjacent segments. This is easily seen to take O(n) time. In the following iterations, we look up the minimum-error pair i and i + 1 to merge; merge the pair into a new segment Snew; delete from a heap (keeping track of costs is best done with a heap) the costs of merging segments i − 1 and i and of merging segments i + 1 and i + 2; compute the costs of merging Snew with Si−1 and with Si+2; and insert these costs into our heap of costs. The time to look up the best cost is θ(1) and the time to add and delete costs from the heap is O(log n). (The time to construct the heap is O(n).) In the best case, the merged segments always have about equal length, and the final segments have length L. The time to merge a set of length-2 segments, which will end up as one length-L segment, into half as many segments is θ(L) (the time to compute the best segment for every pair of merged segments), not counting heap operations. Each iteration takes the same time, and repeating this θ(log L) times gives a segment of size L. The number of length-L segments we produce is K, so the total time is Ω(KL log L) = Ω(n log(n/K)). The heap operations may take as much as O(n log n).
For a lower bound, we have proven just Ω(n log(n/K)). In the worst case, the merges always involve a short and a long segment, and the final segments are mostly of length cL. The time to compute the cost of merging a length-2 segment with a length-i segment is θ(i), and the time to reach a length-cL segment is Σ(i=2 to cL) θ(i) = θ(L²). There are at most n/cL such segments to compute, so the time is n/cL × θ(L²) = O(Ln). (Time for heap operations is inconsequential.) This complexity study is summarized in Table 5. In addition to the time complexity, there are other features a practitioner might consider when choosing an algorithm. First, there is the question of
whether the algorithm is online or batch. Secondly, there is the question of how the user can specify the quality of the desired approximation. With trivial modifications, the Bottom-Up algorithm allows the user to specify the desired value of K, the maximum error per segment, or the total error of the approximation. A (non-recursive) implementation of Top-Down can also be made to support all three options. However, Sliding Window only allows the maximum error per segment to be specified.

Table 5. A feature summary for the 3 major algorithms.

Algorithm        User can specify¹   Online   Complexity
Top-Down         E, ME, K            No       O(n²K)
Bottom-Up        E, ME, K            No       O(Ln)
Sliding Window   E                   Yes      O(Ln)

¹KEY: E → maximum error for a given segment, ME → maximum error for the entire time series, K → number of segments.

3. Empirical Comparison of the Major Segmentation Algorithms

In this section, we provide an extensive empirical comparison of the three major algorithms. It is possible to create artificial datasets that allow one of the algorithms to achieve zero error (by any measure) but force the other two approaches to produce arbitrarily poor approximations. In contrast, testing on purely random data forces all the algorithms to produce essentially the same results. To overcome the potential for biased results, we tested the algorithms on a very diverse collection of datasets. These datasets were chosen to represent the extremes along the following dimensions: stationary/non-stationary, noisy/smooth, cyclical/non-cyclical, symmetric/asymmetric, etc. In addition, the datasets represent the diverse areas in which data miners apply their algorithms, including finance, medicine, manufacturing and science. Figure 3 illustrates the 10 datasets used in the experiments.

3.1. Experimental Methodology

For simplicity and brevity, we only include the linear regression versions of the algorithms in our study.
Since linear regression minimizes the sum of squares error, it also minimizes the Euclidean distance (the Euclidean
distance is just the square root of the sum of squares). Euclidean distance, or some measure derived from it, is by far the most common metric used in data mining of time series [Agrawal et al. (1993), Agrawal et al. (1995), Chan and Fu (1999), Das et al. (1998), Keogh et al. (2000), Keogh and Pazzani (1999), Keogh and Pazzani (1998), Keogh and Smyth (1997), Qu et al. (1998), Wang and Wang (2000)]. The linear interpolation versions of the algorithms, by definition, will always have a greater sum-of-squares error. We immediately encounter a problem when attempting to compare the algorithms. We cannot compare them for a fixed number of segments, since Sliding Windows does not allow one to specify the number of segments. Instead, we give each of the algorithms a fixed max error and measure the total error of the entire piecewise approximation. The performance of the algorithms depends on the value of max error. As max error goes to zero, all the algorithms have the same performance, since they would produce n/2 segments with no error. At the opposite end, as max error becomes very large, the algorithms once again all have the same performance, since they all simply approximate T with a single best-fit line. Instead, we must test the relative performance for some reasonable value of max error, a value that achieves a good trade-off between compression and fidelity. Because this "reasonable value" is subjective and dependent on the data mining application and the data itself, we did the following.

Fig. 3. The 10 datasets used in the experiments. (i) Radio Waves. (ii) Exchange Rates. (iii) Tickwise II. (iv) Tickwise I. (v) Water Level. (vi) Manufacturing. (vii) ECG. (viii) Noisy Sine Cubed. (ix) Sine Cubed. (x) Space Shuttle.
We chose what we considered a "reasonable value" of max error for each dataset, and then we bracketed it with 6 values separated by powers of two. The lowest of these values tends to produce an over-fragmented approximation, and the highest tends to produce a very coarse approximation. So, in general, the performance in the mid-range of the 6 values should be considered most important. Figure 4 illustrates this idea.
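The bracketing scheme is easy to reproduce. In this small sketch, E is a purely hypothetical base value for one dataset (the chapter chose E per dataset by judgment):

```python
# Hypothetical base value E for one dataset; the experiments then use the six
# thresholds max_error = E * 2**i for i = 1..6.
E = 1.0
thresholds = [E * 2 ** i for i in range(1, 7)]
# The lowest thresholds over-fragment, the highest give a very coarse fit,
# so the mid-range results matter most.
```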
Fig. 4. We are most interested in comparing the segmentation algorithms at the setting of the user-defined threshold max error that produces an intuitively correct level of approximation. Since this setting is subjective, we chose a value E such that max error = E × 2^i (i = 1 to 6) brackets the range of reasonable approximations, from too fine, through "correct", to too coarse.

Since we are only interested in the relative performance of the algorithms, for each setting of max error on each dataset, we normalized the performance of the 3 algorithms by dividing by the error of the worst-performing approach.

3.2. Experimental Results

The experimental results are summarized in Figure 5. The most obvious result is the generally poor quality of the Sliding Windows algorithm. With a few exceptions, it is the worst-performing algorithm, usually by a large amount. Comparing the results for Sine Cubed and Noisy Sine Cubed supports our conjecture that the noisier a dataset, the less difference one can expect between algorithms. This suggests that one should exercise caution in attempting to generalize the performance of an algorithm that has only been demonstrated on a single noisy dataset [Qu et al. (1998), Wang and Wang (2000)]. Top-Down does occasionally beat Bottom-Up, but only by a small amount. On the other hand, Bottom-Up often significantly outperforms Top-Down, especially on the ECG, Manufacturing and Water Level datasets.

4. A New Approach

Given the noted shortcomings of the major segmentation algorithms, we investigated alternative techniques. The main problem with the Sliding Windows algorithm is its inability to look ahead, lacking the global view of its offline (batch) counterparts. The Bottom-Up and the Top-Down
approaches produce better results, but are offline and require scanning the entire dataset. This is impractical, or may even be infeasible, in a data-mining context, where the data are on the order of terabytes or arrive in continuous streams. We therefore introduce a novel approach in which we capture the online nature of Sliding Windows and yet retain the superiority of Bottom-Up. We call our new algorithm SWAB (Sliding Window and Bottom-up).

Fig. 5. A comparison of the three major time series segmentation algorithms, on ten diverse datasets, over a range of parameters. Each experimental result (i.e. a triplet of histogram bars) is normalized by dividing by the performance of the worst algorithm on that experiment.

4.1. The SWAB Segmentation Algorithm

The SWAB algorithm keeps a buffer of size w. The buffer size should initially be chosen so that there is enough data to create about 5 or 6 segments.
Bottom-Up is applied to the data in the buffer and the leftmost segment is reported. The data corresponding to the reported segment is removed from the buffer and more data points are read in. The number of data points read in depends on the structure of the incoming data. This process is performed by the Best Line function, which is basically just classic Sliding Windows. These points are incorporated into the buffer and Bottom-Up is applied again. This process of applying Bottom-Up to the buffer, reporting the leftmost segment, and reading in the next "best fit" subsequence is repeated as long as data arrives (potentially forever). The intuition behind the algorithm is this. The Best Line function finds data corresponding to a single segment using the (relatively poor) Sliding Windows and gives it to the buffer. As the data moves through the buffer, the (relatively good) Bottom-Up algorithm is given a chance to refine the segmentation, because it has a "semi-global" view of the data. By the time the data is ejected from the buffer, the segmentation breakpoints are usually the same as the ones the batch version of Bottom-Up would have chosen. Table 6 shows the pseudocode for the algorithm.

Table 6. The SWAB (Sliding Window and Bottom-up) algorithm.

Algorithm Seg TS = SWAB(max error, seg num)   // seg num is a small integer, i.e. 5 or 6.
    read in w number of data points           // Enough to approximate seg num segments.
    lower bound = w / 2;
    upper bound = 2 * w;
    while data at input
        T = Bottom Up(w, max error)           // Call the Bottom-Up algorithm.
        Seg TS = CONCAT(Seg TS, T(1));
        w = TAKEOUT(w, w′);                   // Deletes the w′ points in T(1) from w.
        if data at input                      // Add points from BEST LINE() to w.
            w = CONCAT(w, BEST LINE(max error));
            {check upper and lower bound, adjust if necessary}
        else                                  // Flush approximated segments from buffer.
            Seg TS = CONCAT(Seg TS, (T - T(1)))
        end;
    end;

Function S = BEST LINE(max error)             // Returns S points to approximate.
    while error ≤ max error                   // Next potential segment.
        read in one additional data point, d, into S
        S = CONCAT(S, d);
        error = approx segment(S);
    end while;
    return S;
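A minimal Python sketch of SWAB under the same least-squares error measure used above. For brevity it omits the upper/lower-bound adjustment of the buffer and simply flushes the whole buffer when it reduces to a single segment; all names (`seg_error`, `bottom_up`, `best_line`, `swab`) are our own, not the paper's.

```python
def seg_error(ys):
    """Sum-of-squares residual of the least-squares line through ys."""
    n = len(ys)
    mx = (n - 1) / 2.0
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - (my + slope * (x - mx))) ** 2 for x, y in enumerate(ys))

def bottom_up(T, max_error):
    """Greedily merge adjacent segments while the cheapest merge still fits."""
    segs = [list(T[i:i + 2]) for i in range(0, len(T), 2)]
    while len(segs) > 1:
        costs = [seg_error(segs[i] + segs[i + 1]) for i in range(len(segs) - 1)]
        p = min(range(len(costs)), key=costs.__getitem__)
        if costs[p] > max_error:
            break
        segs[p] = segs[p] + segs.pop(p + 1)
    return segs

def best_line(stream, max_error):
    """Sliding-Window-style read-ahead: take points until the fit degrades."""
    cand = []
    for d in stream:
        cand.append(d)
        if len(cand) >= 2 and seg_error(cand) > max_error:
            break
    return cand

def swab(stream, max_error, w):
    """Online segmentation: Bottom-Up over a w-point buffer, emitting leftmost."""
    stream = iter(stream)
    buf = [d for _, d in zip(range(w), stream)]   # initial fill
    out = []
    while buf:
        segs = bottom_up(buf, max_error)
        out.append(segs[0])                       # report the leftmost segment
        buf = buf[len(segs[0]):]                  # drop it from the buffer
        buf.extend(best_line(stream, max_error))  # refill; empty once stream ends
    return out
```

Because the buffer holds several segments' worth of data, the breakpoints emitted match what batch Bottom-Up would usually choose, while memory stays constant.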
Using the buffer allows us to gain a "semi-global" view of the data set for Bottom-Up. However, it is important to impose upper and lower bounds on the size of the window. A buffer that is allowed to grow arbitrarily large will revert our algorithm to pure Bottom-Up, while a small buffer will deteriorate it to Sliding Windows, and excessive fragmentation may occur. In our algorithm, we used an upper (and lower) bound of twice (and half) the initial buffer size. Our algorithm can be seen as operating on a continuum between the two extremes of Sliding Windows and Bottom-Up. The surprising result (demonstrated below) is that by allowing the buffer to contain just 5 or 6 times the data normally contained in a single segment, the algorithm produces essentially the same results as Bottom-Up, yet is able to process a never-ending stream of data. Our new algorithm requires only a small, constant amount of memory, and the time complexity is a small constant factor worse than that of the standard Bottom-Up algorithm.

4.2. Experimental Validation

We repeated the experiments in Section 3, this time comparing the new algorithm with pure (batch) Bottom-Up and classic Sliding Windows. The result, summarized in Figure 6, is that the new algorithm produces results that are essentially identical to Bottom-Up. The reader may be surprised that SWAB can sometimes be slightly better than Bottom-Up. The reason this can occur is that SWAB explores a slightly larger search space. Every segment in Bottom-Up must have an even number of datapoints, since it was created by merging other segments that also had an even number of datapoints. The only possible exception is the rightmost segment, which can have an odd number of datapoints if the original time series had an odd length. Since this happens multiple times for SWAB, it is effectively searching a slightly larger search space.

5.
Conclusions and Future Directions

We have seen the first extensive review and empirical comparison of time series segmentation algorithms from a data mining perspective. We have shown that the most popular approach, Sliding Windows, generally produces very poor results, and that while the second most popular approach, Top-Down, can produce reasonable results, it does not scale well. In contrast, the least well known approach, Bottom-Up, produces excellent results and scales linearly with the size of the dataset.
Fig. 6. A comparison of the SWAB algorithm with pure (batch) Bottom-Up and classic Sliding Windows, on ten diverse datasets, over a range of parameters. Each experimental result (i.e. a triplet of histogram bars) is normalized by dividing by the performance of the worst algorithm on that experiment.

In addition, we have introduced SWAB, a new online algorithm, which scales linearly with the size of the dataset, requires only constant space and produces high-quality approximations of the data. There are several directions in which this work could be expanded.

• The performance of Bottom-Up is particularly surprising given that it explores a smaller space of representations. Because the initialization phase of the algorithm begins with all line segments having length two, all merged segments will also have even lengths. In contrast, the two other algorithms allow segments to have odd or even lengths. It would be
interesting to see if removing this limitation of Bottom-Up can improve its performance further.
• For simplicity and brevity, we have assumed that the inner loop of the SWAB algorithm simply invokes the Bottom-Up algorithm each time. This clearly results in some computational redundancy. We believe we may be able to reuse calculations from previous invocations of Bottom-Up, thus achieving speedup.

Reproducible Results Statement: In the interests of competitive scientific inquiry, all datasets and code used in this work are freely available at the University of California Riverside, Time Series Data Mining Archive {www.cs.ucr.edu/∼eamonn/TSDMA/index.html}.

References

1. Agrawal, R., Faloutsos, C., and Swami, A. (1993). Efficient Similarity Search in Sequence Databases. Proceedings of the 4th Conference on Foundations of Data Organization and Algorithms, pp. 69–84.
2. Agrawal, R., Lin, K.I., Sawhney, H.S., and Shim, K. (1995). Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. Proceedings of the 21st International Conference on Very Large Data Bases, pp. 490–501.
3. Chan, K. and Fu, W. (1999). Efficient Time Series Matching by Wavelets. Proceedings of the 15th IEEE International Conference on Data Engineering, pp. 126–133.
4. Das, G., Lin, K., Mannila, H., Renganathan, G., and Smyth, P. (1998). Rule Discovery from Time Series. Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, pp. 16–22.
5. Douglas, D.H. and Peucker, T.K. (1973). Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or its Caricature. Canadian Cartographer, 10(2), December, pp. 112–122.
6. Duda, R.O. and Hart, P.E. (1973). Pattern Classification and Scene Analysis. Wiley, New York.
7. Ge, X. and Smyth, P. (2001). Segmental Semi-Markov Models for Endpoint Detection in Plasma Etching.
IEEE Transactions on Semiconductor Engineering.
8. Heckbert, P.S. and Garland, M. (1997). Survey of Polygonal Surface Simplification Algorithms, Multiresolution Surface Modeling Course. Proceedings of the 24th International Conference on Computer Graphics and Interactive Techniques.
9. Hunter, J. and McIntosh, N. (1999). Knowledge-Based Event Detection in Complex Time Series Data. Artificial Intelligence in Medicine, Springer, pp. 271–280.
10. Ishijima, M. et al. (1983). Scan-Along Polygonal Approximation for Data Compression of Electrocardiograms. IEEE Transactions on Biomedical Engineering (BME), 30(11), 723–729.
11. Koski, A., Juhola, M., and Meriste, M. (1995). Syntactic Recognition of ECG Signals by Attributed Finite Automata. Pattern Recognition, 28(12), 1927–1940.
12. Keogh, E., Chakrabarti, K., Pazzani, M., and Mehrotra, S. (2000). Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. Journal of Knowledge and Information Systems, 3(3), 263–286.
13. Keogh, E. and Pazzani, M. (1998). An Enhanced Representation of Time Series which Allows Fast and Accurate Classification, Clustering and Relevance Feedback. Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, AAAI Press, pp. 239–241.
14. Keogh, E. and Pazzani, M. (1999). Relevance Feedback Retrieval of Time Series Data. Proceedings of the 22nd Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pp. 183–190.
15. Keogh, E. and Smyth, P. (1997). A Probabilistic Approach to Fast Pattern Matching in Time Series Databases. Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, pp. 24–20.
16. Last, M., Klein, Y., and Kandel, A. (2001). Knowledge Discovery in Time Series Databases. IEEE Transactions on Systems, Man, and Cybernetics, 31B(1), 160–169.
17. Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., and Allan, J. (2000). Mining of Concurrent Text and Time Series. Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining, pp. 37–44.
18. Li, C., Yu, P., and Castelli, V. (1998). MALM: A Framework for Mining Sequence Database at Multiple Abstraction Levels. Proceedings of the 9th International Conference on Information and Knowledge Management, pp. 267–272.
19. McKee, J.J., Evans, N.E., and Owens, F.J. (1994).
Efficient Implementation of the Fan/SAPA-2 Algorithm Using Fixed Point Arithmetic. Automedica, 16, 109–117.
20. Osaki, R., Shimada, M., and Uehara, K. (1999). Extraction of Primitive Motion for Human Motion Recognition. Proceedings of the 2nd International Conference on Discovery Science, pp. 351–352.
21. Park, S., Kim, S.W., and Chu, W.W. (2001). Segment-Based Approach for Subsequence Searches in Sequence Databases. Proceedings of the 16th ACM Symposium on Applied Computing, pp. 248–252.
22. Park, S., Lee, D., and Chu, W.W. (1999). Fast Retrieval of Similar Subsequences in Long Sequence Databases. Proceedings of the 3rd IEEE Knowledge and Data Engineering Exchange Workshop.
23. Pavlidis, T. (1976). Waveform Segmentation Through Functional Approximation. IEEE Transactions on Computers, pp. 689–697.
24. Perng, C., Wang, H., Zhang, S., and Parker, S. (2000). Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases. Proceedings of the 16th International Conference on Data Engineering, pp. 33–45.
25. Qu, Y., Wang, C., and Wang, S. (1998). Supporting Fast Search in Time Series for Movement Patterns in Multiple Scales. Proceedings of the 7th International Conference on Information and Knowledge Management, pp. 251–258.
26. Ramer, U. (1972). An Iterative Procedure for the Polygonal Approximation of Planar Curves. Computer Graphics and Image Processing, 1, 244–256.
27. Shatkay, H. (1995). Approximate Queries and Representations for Large Data Sequences. Technical Report cs-95-03, Department of Computer Science, Brown University.
28. Shatkay, H. and Zdonik, S. (1996). Approximate Queries and Representations for Large Data Sequences. Proceedings of the 12th IEEE International Conference on Data Engineering, pp. 546–553.
29. Sugiura, N. and Ogden, R.T. (1994). Testing Change-Points with Linear Trend. Communications in Statistics B: Simulation and Computation, 23, 287–322.
30. Vullings, H.J.L.M., Verhaegen, M.H.G., and Verbruggen, H.B. (1997). ECG Segmentation Using Time-Warping. Proceedings of the 2nd International Symposium on Intelligent Data Analysis, pp. 275–286.
31. Wang, C. and Wang, S. (2000). Supporting Content-Based Searches on Time Series Via Approximation. Proceedings of the 12th International Conference on Scientific and Statistical Database Management, pp. 69–81.
CHAPTER 2

A SURVEY OF RECENT METHODS FOR EFFICIENT RETRIEVAL OF SIMILAR TIME SEQUENCES

Magnus Lie Hetland
Norwegian University of Science and Technology
Sem Sælands vei 7–9
NO-7491 Trondheim, Norway
E-mail: magnus@hetland.org

Time sequences occur in many applications, ranging from science and technology to business and entertainment. In many of these applications, searching through large, unstructured databases based on sample sequences is often desirable. Such similarity-based retrieval has attracted a great deal of attention in recent years. Although several different approaches have appeared, most are based on the common premise of dimensionality reduction and spatial access methods. This chapter gives an overview of recent research and shows how the methods fit into a general context of signature extraction.

Keywords: Information retrieval; sequence databases; similarity search; spatial indexing; time sequences.

1. Introduction

Time sequences arise in many applications: any application that involves storing sensor inputs, or sampling a value that changes over time. A problem which has received an increasing amount of attention lately is the problem of similarity retrieval in databases of time sequences, so-called "query by example." Some uses of this are [Agrawal et al. (1993)]:

• Identifying companies with similar patterns of growth.
• Determining products with similar selling patterns.
• Discovering stocks with similar movement in stock prices.
• Finding out whether a musical score is similar to one of a set of copyrighted scores.
• Finding portions of seismic waves that are not similar, to spot geological irregularities.

Applications range from medicine, through economy, to scientific disciplines such as meteorology and astrophysics [Faloutsos et al. (1994), Yi and Faloutsos (2000)].

The running times of simple algorithms for comparing time sequences are generally polynomial in the length of both sequences, typically linear or quadratic. To find the correct offset of a query in a large database, a naive sequential scan will require a number of such comparisons that is linear in the length of the database. This means that, given a query of length m and a database of length n, the search will have a time complexity of O(nm), or even O(nm²) or worse. For large databases this is clearly unacceptable.

Many methods are known for performing this sort of query in the domain of strings over finite alphabets, but with time sequences there are a few extra issues to deal with:

• The range of values is not generally finite, or even discrete.
• The sampling rate may not be constant.
• The presence of noise in various forms makes it necessary to support very flexible similarity measures.

This chapter describes some of the recent advances that have been made in this field: methods that allow for indexing of time sequences using flexible similarity measures that are invariant under a wide range of transformations and error sources.

The chapter is structured as follows: Section 2 gives a more formal presentation of the problem of similarity-based retrieval and the so-called dimensionality curse; Section 3 describes the general approach of signature based retrieval, or "shrink and search," as well as three specific methods using this approach; Section 4 shows some other approaches, while Section 5 concludes the chapter.
Finally, the appendix gives an overview of some basic distance measures.¹

¹ The term "distance" is used loosely in this chapter. A distance measure is simply the inverse of a similarity measure and is not required to obey the metric axioms.
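Several of the flexible similarity measures referred to above (and defined in the appendix) are computed by dynamic programming. As a concrete illustration of the "typically linear or quadratic" comparison cost mentioned in the introduction, here is a minimal dynamic time warping sketch; this is not code from the chapter, and the function name is my own.

```python
def dtw(x, y):
    """Dynamic time warping distance between two value sequences.

    A straightforward O(nm) dynamic program: d[i][j] is the DTW
    distance between the prefixes x[:i] and y[:j].
    """
    n, m = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # stretch y
                                 d[i][j - 1],      # stretch x
                                 d[i - 1][j - 1])  # advance both
    return d[n][m]
```

Because the warping path may repeat elements, a sequence and a locally stretched copy of it have distance zero, which is exactly the kind of invariance a plain Euclidean comparison lacks.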
1.1. Terminology and Notation

A time sequence x = x1 = (v1, t1), ..., xn = (vn, tn) is an ordered collection of elements xi, each consisting of a value vi and a timestamp ti. Abusing the notation slightly, the value of xi may be referred to as xi. For some retrieval methods, the values may be taken from a finite class of values [Mannila and Ronkainen (1997)], or may have more than one dimension [Lee et al. (2000)], but it is generally assumed that the values are real numbers. This assumption is a requirement for most of the methods described in this chapter.

The only requirement of the timestamps is that they be non-decreasing (or, in some applications, strictly increasing) with respect to the sequence indices:

i ≤ j ⇒ ti ≤ tj. (1)

In some methods, an additional assumption is that the elements are equi-spaced: for every two consecutive elements xi and xi+1 we have

ti+1 − ti = ∆, (2)

where ∆ (the sampling rate of x) is a (positive) constant. If the actual sampling rate is not important, ∆ may be normalized to 1, and t1 to 0. It is also possible to resample the sequence to make the elements equi-spaced, when required.

The length of a time sequence x is its cardinality, written as |x|. The contiguous subsequence of x containing elements xi to xj (inclusive) is written xi:j.

A signature of a sequence x is some structure that somehow represents x, yet is simpler than x. In the context of this chapter, such a signature will always be a vector of fixed size k. (For a more thorough discussion of signatures, see Section 3.) Such a signature is written x̃.

For a summary of the notation, see Table 1.

Table 1. Notation.

  x      A sequence
  x̃      A signature of x
  xi     Element number i of x
  xi:j   Elements i to j (inclusive) of x
  |x|    The length of x
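The notation above can be rendered directly in code; a minimal sketch (the helper names are mine, not the chapter's):

```python
# A time sequence is a list of (value, timestamp) elements (v_i, t_i).

def subsequence(x, i, j):
    """The contiguous subsequence x_{i:j}, 1-indexed and inclusive as in the text."""
    return x[i - 1:j]

def is_equispaced(x, tol=1e-9):
    """Check the equi-spacing assumption t_{i+1} - t_i = Delta for all i."""
    if len(x) < 2:
        return True
    delta = x[1][1] - x[0][1]
    return all(abs((b[1] - a[1]) - delta) <= tol for a, b in zip(x, x[1:]))
```

The length |x| is then simply `len(x)`, and equi-spaced sequences can be stored as bare value lists with ∆ normalized to 1.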
2. The Problem

The problem of retrieving similar time sequences may be stated as follows: Given a sequence q, a set of time sequences X, a (non-negative) distance measure d, and a tolerance threshold ε, find the set R of sequences closer to q than ε, or, more precisely:

R = {x ∈ X | d(q, x) ≤ ε}. (3)

Alternatively, one might wish to find the k nearest neighbours of q, which amounts to setting ε so that |R| = k. The parameter ε is typically supplied by the user, while the distance function d is domain-dependent. Several distance measures will be described rather informally in this chapter. For more formal definitions, see the appendix.

Figure 1 illustrates the problem for Euclidean distance in two dimensions. In this example, the vector x will be included in the result set R, while y will not.

Fig. 1. Similarity retrieval.

A useful variation of the problem is to find a set of subsequences of the sequences in X. This, in the basic case, requires comparing q not only to all elements of X, but to all possible subsequences.²

If a method retrieves a subset S of R, the wrongly dismissed sequences in R − S are called false dismissals. Conversely, if S is a superset of R, the sequences in S − R are called false alarms.

² Except in the description of LCS in the appendix, subsequence means contiguous subsequence, or segment.
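The query in (3) translates directly into a sequential scan, which is the baseline every index must beat. A minimal sketch, assuming Python; `euclidean` merely stands in for the domain-dependent measure d:

```python
import math

def euclidean(q, x):
    """Euclidean distance between two equal-length value sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, x)))

def range_query(q, X, d, eps):
    """R = {x in X : d(q, x) <= eps}, by brute-force sequential scan."""
    return [x for x in X if d(q, x) <= eps]

def knn_query(q, X, d, k):
    """The k nearest neighbours of q: equivalent to choosing eps so that |R| = k."""
    return sorted(X, key=lambda x: d(q, x))[:k]
```

By construction this scan produces neither false dismissals nor false alarms; its problem is cost, since it evaluates d against every sequence (or every subsequence position) in the database.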
2.1. Robust Distance Measures

The choice of distance measure is highly domain dependent, and in some cases a simple Lp norm such as Euclidean distance may be sufficient. However, in many cases this may be too brittle [Keogh and Pazzani (1999b)], since it does not tolerate such transformations as scaling, warping, or translation along either axis. Many of the newer retrieval methods focus on using more robust distance measures, which are invariant under such transformations as time warping (see the appendix for details), without loss of performance.

2.2. Good Indexing Methods

Faloutsos et al. (1994) list the following desirable properties for an indexing method:

(i) It should be faster than a sequential scan.
(ii) It should incur little space overhead.
(iii) It should allow queries of various length.
(iv) It should allow insertions and deletions without rebuilding the index.
(v) It should be correct: no false dismissals must occur.

To achieve high performance, the number of false alarms should also be low. Keogh et al. (2001b) add the following criteria to the list above:

(vi) It should be possible to build the index in reasonable time.
(vii) The index should preferably be able to handle more than one distance measure.

2.3. Spatial Indices and the Dimensionality Curse

The general problem of similarity based retrieval is well known in the field of information retrieval, and many indexing methods exist to process queries efficiently [Baeza-Yates and Ribeiro-Neto (1999)]. However, certain properties of time sequences make the standard methods unsuitable. The fact that the value ranges of the sequences usually are continuous, and that the elements may not be equi-spaced, makes it difficult to use standard text-indexing techniques such as suffix trees.
One of the most promising techniques is multidimensional indexing (R-trees [Guttman (1984)], for instance), in which the objects in question are multidimensional vectors, and similar objects can be retrieved in sublinear time. One requirement of such spatial access methods is that the distance measure must be monotonic
in all dimensions, usually satisfied through the somewhat stricter requirement of the triangle inequality (d(x, z) ≤ d(x, y) + d(y, z)).

One important problem that occurs when trying to index sequences with spatial access methods is the so-called dimensionality curse: spatial indices typically work only when the number of dimensions is low [Chakrabarti and Mehrotra (1999)]. This makes it infeasible to code the entire sequence directly as a vector in an indexed space. The general solution to this problem is dimensionality reduction: to condense the original sequences into signatures in a signature space of low dimensionality, in a manner which, to some extent, preserves the distances between them. One can then index the signature space.

3. Signature Based Similarity Search

A time sequence x of length n can be considered a vector or point in an n-dimensional space. Techniques exist (spatial access methods, such as the R-tree and variants [Chakrabarti and Mehrotra (1999), Wang and Perng (2001), Sellis et al. (1987)]) for indexing such data. The problem is that the performance of such methods degrades considerably even for relatively low dimensionalities [Chakrabarti and Mehrotra (1999)]; the number of dimensions that can be handled is usually several orders of magnitude lower than the number of data points in a typical time sequence. A general solution described by Faloutsos et al. (1994; 1997) is to extract a low-dimensional signature from each sequence, and to index the signature space. This "shrink and search" approach is illustrated in Figure 2.

Fig. 2. The signature based approach.
An important result given by Faloutsos et al. (1994) is the proof that in order to guarantee completeness (no false dismissals), the distance function used in the signature space must underestimate the true distance measure, or:

dk(x̃, ỹ) ≤ d(x, y). (4)

This requirement is called the bounding lemma. Assuming that (4) holds, an intuitive way of stating the resulting situation is: "if two signatures are far apart, we know the corresponding [sequences] must also be far apart" [Faloutsos et al. (1997)]. This, of course, means that there will be no false dismissals. To minimise the number of false alarms, we want dk to approximate d as closely as possible. The bounding lemma is illustrated in Figure 3.

Fig. 3. An intuitive view of the bounding lemma.

This general method of dimensionality reduction may be summed up as follows [Keogh et al. (2001b)]:

1. Establish a distance measure d from a domain expert.
2. Design a dimensionality reduction technique to produce signatures of length k, where k can be efficiently handled by a standard spatial access method.
3. Produce a distance measure dk over the k-dimensional signature space, and prove that it obeys the bounding condition (4).

In some applications, the requirement in (4) is relaxed, allowing for a small number of false dismissals in exchange for increased performance. Such methods are called approximate.

The dimensionality reduction may in itself be used to speed up the sequential scan, and some methods (such as the piecewise linear approximation of Keogh et al., which is described in Section 4.2) rely only on this, without using any index structure.
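The three-step scheme and the bounding lemma combine into a filter-and-refine search. A sketch, assuming Python; the names are mine, and the spatial-index lookup is replaced here by a linear filter over signatures to keep the example self-contained:

```python
def filter_and_refine(q, X, d, d_k, signature, eps):
    """Signature-based range search.

    Requires d_k(signature(q), signature(x)) <= d(q, x) for all x
    (the bounding lemma): the filtering step then causes no false
    dismissals, and refining with the true distance d removes the
    remaining false alarms.
    """
    q_sig = signature(q)
    candidates = [x for x in X if d_k(q_sig, signature(x)) <= eps]  # filter: no false dismissals
    return [x for x in candidates if d(q, x) <= eps]                # refine: drop false alarms
```

The payoff comes from evaluating the cheap `d_k` (ideally via a spatial index) far more often than the expensive `d`, while the result set is provably identical to that of a full scan.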
Methods exist for finding signatures of arbitrary objects, given the distances between them [Faloutsos and Lin (1995), Wang et al. (1999)], but in the following I will concentrate on methods that exploit the structure of the time series to achieve good approximations.

3.1. A Simple Example

As an example of the signature based scheme, consider the two sequences shown in Figure 4. The sequences, x and y, are compared using the L1 measure (Manhattan distance), which is simply the sum of the absolute distances between each aligning pair of values. A simple signature in this scheme is the prefix of length 2, as indicated by the shaded area in the figure.

Fig. 4. Comparing two sequences.

As shown in Figure 5, these signatures may be interpreted as points in a two-dimensional plane, which can be indexed with some standard spatial indexing method. It is also clear that the signature distance will underestimate the real distance between the sequences, since the remaining summands of the real distance must all be non-negative.

Fig. 5. A simple signature distance.
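The prefix signature of Figures 4 and 5 can be written out directly; a minimal sketch, assuming Python:

```python
def l1(x, y):
    """Manhattan (L1) distance between equal-length value sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def prefix_signature(x, k=2):
    """The length-k prefix used as the signature in this example."""
    return x[:k]

def signature_distance(x, y, k=2):
    """L1 distance in signature space; underestimates l1(x, y) because
    the omitted summands |x_i - y_i| for i > k are all non-negative."""
    return l1(prefix_signature(x, k), prefix_signature(y, k))
```

So `signature_distance(x, y) <= l1(x, y)` holds for any pair of sequences, which is exactly the bounding condition (4).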
Although correct, this simple signature extraction technique is not particularly precise. The signature extraction methods introduced in the following sections take into account more information about the full sequence shape, and therefore lead to fewer false alarms.

Figure 6 shows a time series containing measurements of atmospheric pressure. In the following three sections, the methods described will be applied to this sequence, and the resulting simplified sequence (reconstructed from the extracted signature) will be shown superimposed on the original.

Fig. 6. An example time sequence.

3.2. Spectral Signatures

Some of the methods presented in this section are not very recent, but introduce some of the main concepts used by newer approaches.

Agrawal et al. (1993) introduce a method called the F-index, in which a signature is extracted from the frequency domain of a sequence. Underlying their approach are two key observations:

• Most real-world time sequences can be faithfully represented by their strongest Fourier coefficients.
• Euclidean distance is preserved in the frequency domain (Parseval's theorem [Shatkay (1995)]).

Based on this, they suggest performing the Discrete Fourier Transform on each sequence, and using a vector consisting of the sequence's k first amplitude coefficients as its signature. Euclidean distance in the signature space will then underestimate the real Euclidean distance between the sequences, as required. Figure 7 shows an approximated time sequence, reconstructed from a signature consisting of the original sequence's ten first Fourier components.

This basic method allows only for whole-sequence matching. In 1994, Faloutsos et al. introduce the ST-index, an improvement on the F-index
that makes subsequence matching possible.

Fig. 7. A sequence reconstructed from a spectral signature.

The main steps of the approach are as follows:

1. For each position in the database, extract a window of length w, and create a spectral signature (a point) for it. Each point will be close to the previous, because the contents of the sliding window change slowly. The points for one sequence will therefore constitute a trail in signature space.
2. Partition the trails into suitable (multidimensional) Minimal Bounding Rectangles (MBRs), according to some heuristic.
3. Store the MBRs in a spatial index structure.

To search for subsequences similar to a query q of length w, simply look up all MBRs that intersect a hypersphere with radius ε around the signature point q̃. This is guaranteed not to produce any false dismissals, because if a point is within a radius of ε of q̃, it cannot possibly be contained in an MBR that does not intersect the hypersphere.

To search for sequences longer than w, split the query into w-length segments, search for each of them, and intersect the result sets. Because a sequence in the result set R cannot be closer to the full query sequence than it is to any one of the window signatures, it has to be close to all of them, that is, contained in all the result sets.

These two papers [Agrawal et al. (1993) and Faloutsos et al. (1994)] are seminal; several newer approaches are based on them. For example, Rafiei and Mendelzon (1997) show how the method can be made more robust by allowing various transformations in the comparison, and Chan and Fu (1999) show how the Discrete Wavelet Transform (DWT) can be used instead of the Discrete Fourier Transform (DFT), and that the DWT method is empirically superior. See Wu et al. (2000) for a comparison between similarity search based on DFT and DWT.
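A minimal F-index-style signature extraction can be sketched as follows (assuming Python; a real implementation would use an FFT rather than this direct O(nk) evaluation). With the unitary 1/√n scaling, Parseval's theorem makes Euclidean distance on the full coefficient vector equal to Euclidean distance on the sequences, so keeping only the first k amplitudes underestimates it:

```python
import cmath
import math

def spectral_signature(x, k):
    """First k DFT amplitude coefficients of x, unitarily scaled (1/sqrt(n))."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * f * t / n)
                    for t in range(n))) / math.sqrt(n)
            for f in range(k)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
```

The bound holds for two reasons: truncating to k coefficients drops non-negative squared terms, and replacing complex coefficients by their amplitudes can only shrink distances further (reverse triangle inequality).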
3.3. Piecewise Constant Approximation

An approach independently introduced by Yi and Faloutsos (2000) and Keogh et al. (2001b), Keogh and Pazzani (2000) is to divide each sequence into k segments of equal length, and to use the average value of each segment as a coordinate of a k-dimensional signature vector. Keogh et al. call the method Piecewise Constant Approximation, or PCA. This deceptively simple dimensionality reduction technique has several advantages [Keogh et al. (2001b)]: the transform itself is faster than most other transforms, it is easy to understand and implement, it supports more flexible distance measures than Euclidean distance, and the index can be built in linear time. Figure 8 shows an approximated time sequence, reconstructed from a ten-dimensional PCA signature.

Yi and Faloutsos (2000) also show that this signature can be used with arbitrary Lp norms without changing the index structure, which is something no previous method [such as Agrawal et al. (1993; 1995), Faloutsos et al. (1994; 1997), Rafiei and Mendelzon (1997), or Yi et al. (1998)] could accomplish. This means that the distance measure may be specified by the user. Preprocessing to make the index more robust in the face of such transformations as offset translation, amplitude scaling, and time scaling can also be performed. Keogh et al. demonstrate that the representation can also be used with the so-called weighted Euclidean distance, where each part of the sequence has a different weight.

Empirically, the PCA methods seem promising: Yi and Faloutsos demonstrate up to a ten times speedup over methods based on the discrete wavelet transform. Keogh et al. do not achieve similar speedups, but point to the fact that the structure allows for more flexible distance measures than many of the competing methods.

Keogh et al.
(2001a) later propose an improved version of the PCA, the so-called Adaptive Piecewise Constant Approximation, or APCA.

Fig. 8. A sequence reconstructed from a PCA signature.

This is
similar to the PCA, except that the segments need not be of equal length. Thus regions with great fluctuations may be represented with several short segments, while reasonably featureless regions may be represented with fewer, long segments. The main contribution of this representation is that it is a more effective compression than the PCA, while still representing the original faithfully.

Two distance measures are developed for the APCA: one which is guaranteed to underestimate Euclidean distance, and one which can be calculated more efficiently, but which may generate some false dismissals. It is also shown that this technique, like the PCA, can handle arbitrary Lp norms. The empirical data suggest that the APCA outperforms both methods based on the discrete Fourier transform and methods based on the discrete wavelet transform, with a speedup of one to two orders of magnitude.

In a recent paper, Keogh (2002) develops a distance measure that is a lower bound for dynamic time warping, and uses the PCA approach to index it. The distance measure is based on the assumption that the allowed warping is restricted, which is often the case in real applications. Under this assumption, Keogh constructs two warped versions of the sequence to be indexed: an upper and a lower limit. The PCA signatures of these limits are then extracted, and together with Keogh's distance measure they form an exact index (one with no false dismissals) with high precision. Keogh performs extensive empirical experiments, and his method clearly outperforms any other existing method for indexing time warping.

3.4. Landmark Methods

In 1997, Keogh and Smyth introduce a probabilistic method for sequence retrieval, where the features extracted are characteristic parts of the sequence, so-called feature shapes. Keogh (1997) uses a similar landmark based technique.
Both these methods also use the dimensionality reduction technique of piecewise linear approximation (see Section 4.2) as a preprocessing step. The methods are based on finding similar landmark features (or shapes) in the target sequences, ignoring shifting and scaling within given limits. The technique is shown to be significantly faster than sequential scanning (about an order of magnitude), which may be accounted for by the compression of the piecewise linear approximation. One of the contributions of the method is that it is one of the first that allows some longitudinal scaling.

A more recent paper by Perng et al. (2000) introduces a more general landmark model. In its most general form, the model allows any point of
great importance to be identified as a landmark. The specific form used in the paper defines an nth order landmark of a one-dimensional function to be a point where the function's nth derivative is zero. Thus, first-order landmarks are extrema, second-order landmarks are inflection points, and so forth. A smoothing technique is also introduced, which lets certain landmarks be overshadowed by others. For instance, local extrema representing small fluctuations may not be as important as a global maximum or minimum.

Figure 9 shows an approximated time sequence, reconstructed from a twelve-dimensional landmark signature.

Fig. 9. A landmark approximation.

One of the main contributions of Perng et al. (2000) is to show that for suitable selections of landmark features, the model is invariant with respect to the following transformations:

• Shifting
• Uniform amplitude scaling
• Uniform time scaling
• Non-uniform time scaling (time warping)
• Non-uniform amplitude scaling

It is also possible to allow for several of these transformations at once, by using the intersection of the features allowed for each of them. This makes the method quite flexible and robust, although as the number of transformations allowed increases, the number of features will decrease; consequently, the index will be less precise.

A particularly simple landmark based method (which can be seen as a special case of the general landmark method) is introduced by Kim et al. (2001). They show that by extracting the minimum, maximum, and the first and last elements of a sequence, one gets a (rather crude) signature that is invariant to time warping. However, since time warping distance does not obey the triangle inequality [Yi et al. (1998)], it cannot be used directly. This problem is solved by developing a new distance measure that underestimates the time warping distance while simultaneously satisfying
the triangle inequality. Note that this method does not achieve results comparable to those of Keogh (2002).

4. Other Approaches

Not all recent methods rely on spatial access methods. This section contains a sampling of other approaches.

4.1. Using Suffix Trees to Avoid Redundant Computation

Baeza-Yates and Gonnet (1999) and Park et al. (2000) independently introduce the idea of using suffix trees [Gusfield (1997)] to avoid duplicate calculations when using dynamic programming to compare a query sequence with other sequences in a database. Baeza-Yates and Gonnet use edit distance (see the appendix for details), while Park et al. use time warping.

The basic idea of the approach is as follows: when comparing two sequences x and y with dynamic programming, a subtask will be to compare their prefixes x1:i and y1:j. If two other sequences are compared that have prefixes identical to these (for instance, the query and another sequence from the database), the same calculations will have to be performed again. If a sequential search for subsequence matches is performed, the cost may easily become prohibitive.

To avoid this, all the sequences in the database are indexed with a suffix tree. A suffix tree stores all the suffixes of a sequence, with identical prefixes stored only once. By performing a depth-first traversal of the suffix tree one can access every suffix (which is equivalent to each possible subsequence position) and backtrack to reuse the calculations that have already been performed for the prefix that the current and the next candidate subsequence share.

Baeza-Yates and Gonnet assume that the sequences are strings over a finite alphabet; Park et al. avoid this assumption by classifying each sequence element into one of a finite set of categories. Both methods achieve subquadratic running times.

4.2. Data Reduction through Piecewise Linear Approximation

Keogh et al.
have introduced a dimensionality reduction technique using piecewise linear approximation of the original sequence data [Keogh (1997), Keogh and Pazzani (1998), Keogh and Pazzani (1999a), Keogh and Pazzani (1999b), Keogh and Smyth (1997)]. This reduces the number of data
points by a compression factor typically in the range from 10 to 600 for real data [Keogh (1997)], outperforming methods based on the Discrete Fourier Transform by one to three orders of magnitude [Keogh and Pazzani (1999b)]. This approximation is shown to be valid under several distance measures, including dynamic time warping distance [Keogh and Pazzani (1999b)].

An enhanced representation is introduced in [Keogh and Pazzani (1998)], where every line segment in the approximation is augmented with a weight representing its relative importance; for instance, a combined sequence may be constructed representing a class of sequences, and some line segments may be more representative of the class than others.

4.3. Search Space Pruning through Subsequence Hashing

Keogh and Pazzani (1999a) describe an indexing method based on hashing, in addition to the piecewise linear approximation. An equi-spaced template grid window is moved across the sequence, and for each position a hash key is generated to decide into which bin the corresponding subsequence is put. The hash key is simply a binary string, where 1 means that the sequence is predominantly increasing in the corresponding part of the template grid, while 0 means that it is decreasing. These bin keys may then be used during a search, to prune away entire bins without examining their contents. To get more benefit from the bin pruning, the bins are arranged in a best-first order.

5. Conclusion

This chapter has sought to give an overview of recent advances in the field of similarity based retrieval in time sequence databases. First, the problem of similarity search and the desired properties of robust distance measures and good indexing methods were outlined. Then, the general approach of signature based similarity search was described.
Following the general description, three specific signature extraction approaches were discussed: spectral signatures, based on Fourier components (or wavelet components); piecewise constant approximation, and the related method adaptive piecewise constant approximation; and landmark methods, based on the extraction of significant points in a sequence. Finally, some methods that are not based on signature extraction were mentioned.

Although the field of time sequence indexing has received much attention and is now a relatively mature field [Keogh et al. (2002)], there are still areas where further research might be warranted. Two such areas are (1) thorough empirical comparisons and (2) applications in data mining.
  • 55. Exploring the Variety of Random Documents with Different Content
  • 56. Auf dem von A r i s t o t e l e s (Histor. animal. 8, 28) überlieferten Sprichworte: ἀεὶ φέρει τι Λιβύη καινόν, immer bringt Afrika etwas Neues beruht: Quid novi ex Africa? Was giebt es Neues aus Afrika? (vrgl. A r i s t o t. de generat. animal. 2, 5, A n a x i l a s, Komödiendichter um 350 v. Chr. bei A t h e n. 14, p. 623 E., P l i n. Nat. hist. 8, 17: vulgare Graeciae dictum: semper aliquid novi Africam afferre und N i c e p h o r u s G r e g o r a s [um 1350] Histor. Byzant., p. 805, 23, ed. Schopen).— A r i s t o t e l e s (de anima 3, 4) sagt: ὥσπερ ἐν γραμματείῳ ᾧ μηδὲν ὑπάρχει ἐντελεχείᾳ γεγραμμένον (wie auf einer Tafel, auf der wirklich nichts geschrieben ist). Hierzu fügt Trendelenburg das Wort A l e x a n d e r s a u s A p h r o d i s i a s (um 200 v. Chr.): ὁ νοῦς ... ἐοικὼς πινακίδι ἀγράφῳ (die Vernunft, einer unbeschriebenen Tafel gleichend), das P l u t a r c h Aussprüche d. Philos. 4, 11 (χαρτίον, Blatt für Tafel setzend) den Stoikern zuschrieb. Wir citieren lateinisch Tabula rasa, abgewischte Schreibtafel; was nach Prantl (Gesch. d. Logik) zuerst bei Ä g i d i u s a C o l u m n i s († 1316) vorkommt. Tabellae rasae lesen wir zwar schon bei O v i d (Ars Amandi 1, 437) aber ohne jene Beziehung auf Geistiges.— A r i s t o t e l e s (Problemata 30, 1) fragt: Διὰ τί πάντες ὅσοι περιττοὶ γεγόνασιν ἄνδρες, ἢ κατὰ φιλοσοφίαν, ἢ πολιτικὴν, ἢ ποίησιν, ἢ τέχνας, φαίνονται μελαγχολικοὶ ὄντες ... Woher kommt es, dass all' die Leute, die sich in der Philosophie, oder in der Politik, oder in der Poesie, oder in den Künsten auszeichneten, offenbar Melancholiker sind? Hieraus bildete Seneca (de tranquill, anim. 17, 10) den uns geläufigen Satz: Nullum magnum ingenium sine mixtura dementiae fuit.
  • 57. Es hat keinen grossen Geist ohne eine Beimischung von Wahnsinn gegeben.— Im A r i s t o t e l e s (Oekonom. 1, 6) lesen wir: Καὶ τὸ τοῦ Πέρσου, καὶ τὸ Λίβυος ἀπόφθεγμα εὖ ἂν ἔχοι· ὁ μὲν γὰρ ἐρωτηθεὶς τί μάλιστα ἵππον πιαίνει, ὁ τοῦ δεσπότου ὀφθαλμὸς ἔφη· ὁ δὲ Λίβυος, ἐρωτηθεὶς ποία κόπρος ἀρίστη, τὰ τοῦ δεσπότου ἴχνη, ἔφη. Sowohl des Persers, wie des Libyers Ausspruch ist gut, denn Jener sagte auf die Frage, was ein Pferd am Besten mäste: Das Auge des Herrn; während der Libyer auf die Frage, welcher Dünger am Besten sei, sagte: des Herrn Fussstapfen. C o l u m e l l a (4, 18) vermengt diese Worte, indem er schreibt: oculos et vestigia domini res agro saluberrimas, die Augen und Fussstapfen des Herrn seien die heilsamsten Dinge für den Acker, und P l i n i u s (Nat. hist., 18, 2) kürzt dies also: majores fertilissimum in agro oculum domini esse dixerunt.—Die Altvordern sagten, am fruchtbringendsten für den Acker sei das Auge des Herrn.— Im A r i s t o t e l e s (Analyt. prior. B. 18 p. 66 ed. Bekker) steht: Ὁ δὲ ψευδὴς λόγος γίνεται παρὰ τὸ πρῶτον ψεῦδος, der falsche Satz entspringt dem falschen Grundgedanken oder die falsche Conclusion der falschen Prämisse. Hieraus stammt für Grundirrtum Das πρῶτον ψεῦδος, das wir jedoch nach dem Sprachgebrauch, der ψεῦδος nicht als Irrtum sondern als absichtliche Täuschung nimmt, oft als Grundbetrug oder Urlüge aufzufassen und theologisch anzuwenden geneigt sind.—
  • 58. Theophrast (um 372-287 v. Chr.) pflegte (nach Diogen. Laërt. V. 2 n. 10, 40) zu sagen: πολυτελὲς ἀνάλωμα εἶναι τὸν χρόνον, Zeit sei eine kostbare Ausgabe. Hieraus scheint hergeleitet: Zeit ist Geld, was wir auch englisch ausdrücken: Time is money. In Bacons Essayes (Of Dispatch 1620) heisst es: Time is the measure of business, as money is of wares: and business is bought at a deare hand, where there is small dispatch (Zeit ist der Arbeitmesser, wie Geld der Waarenmesser ist: und Arbeit wird teuer, wenn man nicht sehr eilt).— Der Redner Pytheas (um 340 v. Chr.) sagte (nach Plutarch Staatslehren 6 n. Demosthenes 8, sowie nach Aelian variae hist. 7, 7) von den Reden des von ihm unaufhörlich angefeindeten Demosthenes, dass sie nach Lampendochten röchen (ἐλλυχνίων ὄζειν) und noch heute sagen wir nach der Lampe riechen von jeder litterarischen Arbeit, welche ohne Anmut der Form nächtliches Studium verrät.— Bei S t o b ä u s (Serm. LXVI, p. 419. Gesn.) finden wir des Menander (342-290 v. Chr.): Τὸ γαμεῖν, ἐάν τις τὴν ἀλήθειαν σκοπῇ, Κακὸν μέν ἐστιν, ἀλλ' ἀναγκαῖον κακόν. Heiraten ist, wenn man die Wahrheit prüft, Ein Übel, aber ein notwendiges Übel.
  • 59. M a l u m n e c e s s a r i u m, die lat. Übersetzung, steht in des L a m p r i d i u s (4. Jahrh. n. Chr.) Alexander Severus 46.— P l u t a r c h überliefert uns in der Trostrede an Apollonius, dessen Sohn gestorben war, (p. 119e ; cap. 34) den Vers des M e n a n dn: Ὃν οἱ θεοὶ φιλοῦσιν ἀποθνήσκει νέος, den P l a u t u s (Bacch. 4, 7, 18) also übersetzt: quem di diligunt adolescens moritur und der bei uns zu lauten pflegt: Wen die Götter lieben, der stirbt jung.— M e n a n d e r s Wort ἀνεῤῥίφθω κύβος (der Würfel falle!—Überl. v. Athenäus XIII, p. 559 c.) citierte C ä s a r, als er 49 v. Chr. den Rubicon überschritt, in griechischer Sprache, wie Plutarch (Pompeius, 60 und Ausspr. v. Kön. u. Feldh.) ausdrücklich hervorhebt. Sueton hingegen lässt ihn lateinisch sagen (Caesar 32):
  • 60. Alea iacta est! Der Würfel ist gefallen! (Erasmus verbessert: Iacta esto alea! Der Würfel falle!) Huttens Wahlspruch (s. Kap. III) Jacta est alea hat hier seine Quelle.— Die 422. Gnome der Monostichen des M e n a n d e r Ὁ μὴ δαρεὶς ἄνθρωπος οὐ παιδεύεται Wer nicht geschunden wird, wird nicht erzogen stellte G o e t h e als Motto vor den 1. Teil seiner Selbstbiographie.— Eine Komödie M e n a n d e r s Ἑαυτὸν τιμωρούμενος kam auf uns durch des Te r e n z Komödie Heautontimorumenos, Der Selbstpeiniger. Die nach D i o g e n e s L a ë r t i u s (VII, 1 n. 19, 23) von dem Stoiker Zeno (geb. 340 v. Chr.) aufgestellte (von P o r p h y r i u s im Leben des Pythagoras aber auf diesen zurückgeführte, in P l u t a r c h s Schrift Die Menge der Freunde und in dem P s e u d o - A r i s t o t e l i s c h e n Buch Magna Moralia II, 15 citierte) Definition des Freundes Ἄλλος ἐγώ wenden wir an in der lateinischen und deutschen Form: Alter ego, Ein zweites Ich. Bei C i c e r o findet sich me alterum ad. fam. 7, 5, 1; ad Attic. 3, 15, 4; 4, 1, 7; Alterum me ad fam. 2, 15, 4; verus amicus est tanquam alter idem de amic. 21, 80; bei Ausonius alter ego praef. 2, 42 (4. Jahrh. n. Chr.). Der griechische Romanschreiber E u s t a t h i u s [6. Jahrh.? 12. Jahrh.?] sagt dreist von sich: Ein zweites Ich; denn also bezeichne ich den Freund. H e r c h e r Erotici
  • 61. Graeci 2, p. 164, 25; vrgl. 165, 18. Späterhin nahm Alter ego die Bedeutung eines Stellvertreters der souveränen Gewalt an.— Am Schlusse jeder Beweisführung des Mathematikers Euklid (bl. um 300 v. Chr.) heisst es: ὅπερ ἔδει δεῖξαι, quod erat demonstrandum, was zu beweisen war.— Des (um 270 v. Chr. bl.) Philosophen Bion Witz: Εὔκολον τὴν εἰς Ἅιδου ὁδόν· καταμύοντας γοῦν κατιέναι, der Weg zum Hades ist leicht; man kommt ja mit geschlossenen Augen hinab (s. Diog. Laërt. IV, c. 7, n. 3, § 49) wird von uns in der kürzeren Form des Vergil citiert (Aen. 6, 126): Facilis descensus Averno, Das Hinabsteigen in die Unterwelt ist leicht; worauf dann folgt, dass das Wiederauftauchen daraus schwer sei.— Philo Judaeus († 54 n. Chr.) sagt (de migr. Abrahami 15, p. 449, Mangey) von den ägyptischen Zauberern: ἀπατᾶν δοκοῦντες ἀπατῶνται (sie glaubten zu betrügen und wurden betrogen). Danach schreibt der gern citierende Apostel P a u l u s im 2. Briefe an Timotheus 3, 13 auch von den Magiern Ägyptens: Mit den bösen Menschen aber und verführerischen wird es je länger je ärger, verführen und werden verführt (πλανῶντες καὶ πλανώμενοι). Dann sagt P o r p h y r i u s in seines Lehrers Plotin Leben (16): οἳ— ἐξηπάτων καὶ αὐτοὶ ἠπατημένοι (die betrogen und selbst betrogen waren) und A u g u s t i n u s (Bekenntnisse 7, 2): deceptos illos et deceptores, und G. E. L e s s i n g (Nathan 3, 7) verdeutschte in der Parabel von den drei Ringen das Wort also:
  • 62. Betrogene Betrüger. (vrgl. M a r g a r e t e v o n N a v a r r a in dem 1543 erschienenen Heptameron Novelle 1, 6, 15, 23, 25, 28, 45, 51, 62; C a r d a n u s († 1576) De subtilitate, 1663, III, 551; C e r v a n t e s Don Quijote 2, 33 (1615) u. s. w.; M o s e s M e n d e l s s o h n (Ges. Schr., 1843, III, 115; Brief vom 9. 2. 1770 an Bonnet über eine Sekte): Wollen wir sagen, dass alle ihre Zeugen Betrogene und Betrüger sind? Eine komische Oper von Guilet et Gaveaux (1799) heisst Le trompeur trompé.)— Flavius Josephus (37 n. Chr.—nach 93) sagt in seiner Schrift Gegen Apion (II, 16) von Moses im Gegensatze zu Minos: Ὁ δὲ ἡμετέρος νομοθέτης εἰς μὲν τούτων οὐδοτιοῦν ἀπεῖδεν, ὡς δ' ἄν τις εἴποι βιασάμενος τὸν λόγον, θεοκρατίαν ἀπέδειξε τὸ πολίτευμα, Θεῷ τὴν ἀρχὴν καὶ τὸ κράτος ἀναθείς—Unser Gesetzgeber richtete jedoch auf Alles Dieses gar nicht sein Augenmerk; er machte die Staatsverfassung zu einer Theokratie (Gottesherrschaft), wenn man sich so gewaltsam ausdrücken darf, indem er Gott die obrigkeitliche Macht beilegte.— Einen Spruch des Epiktet (geb. um 50 n. Chr.) teilt A u l u s G e l l i u s 17, 19, 6 in der lateinischen Form mit: Sustine et abstine, ἀνέχου καὶ ἀπέχου, Leide und meide.— Plutarch (geb. um 50 n. Chr., † 120 n. Chr.) erzählt in seiner Biographie des L. A e m i l i u s P a u l l u s (Kap. 5), dass dieser sich aus unbekannten Gründen von seiner Gattin, Papiria, habe scheiden lassen. Plutarch vermutet, dass der Scheidungsgrund ein ähnlicher
  • 63. gewesen sei, wie derjenige eines gewissen Römers. Dieser habe sein Weib fortgeschickt und alsdann auf die Fragen seiner Freunde: Ist sie denn nicht sittsam? Nicht schön von Gestalt? Schenkte sie Dir denn keine Kinder? ihnen seinen Schuh hingestreckt und gefragt: Ist er nicht fein? Ist er nicht neu? Aber Niemand von Euch sieht, an welcher Stelle mein Fuss gedrückt wird, (οὐκ ἂν εἰδείη τις ὑμῶν, καθ' ὅτι θλίβεται μέρος οὑμὸς πούς). Hierauf fusst die Stelle des H i e r o n y m u s (adv. Jovin. 1, 48): Legimus quendam apud Romanos nobilem, cum eum amici arguerent, quare uxorem formosam et castam et divitem repudiasset, protendisse pedem et dixisse eis: Et hic soccus, quem cernitis, videtur vobis novus et elegans, sed nemo scit praeter me, u b i m e p r e m a t. Hier findet sich zuerst das bekannte Bild unseres Sprachschatzes: Nicht wissen und wissen, wo Einen der Schuh drückt.— Durch Lucians (um 160 n. Chr.) Abhandlung wie man Geschichte schreiben müsse wurde die thracische Stadt Abdera für immer als lächerlich gebrandmarkt; und sie wurde als solche in Deutschland berühmt durch W i e l a n d s im teutschen Merkur 1774, 1. und 2. erschienene Geschichte der Abderiten.— Bei Sextus Empiricus (Ende des 2. Jahrh. n. Chr.; Adversus mathematicos, 287; Imm. Bekker, Berl. 1842; S. 665) steht: ὀψὲ θεῶν ἀλέουσι μύλοι, ἀλέουσι δὲ λεπτά. Lange zwar mahlen die Mühlen der Götter, doch mahlen sie Feinmehl. (Ähnlich in Orac. Sibyll. 8, 14. ed. Friedlieb, Lpz. 1852.)
  • 64. In Eiseleins Sprichwörtern wird das Wort ohne jeglichen Beleg auf P l u t a r c h zurückgeführt. S e b a s t i a n F r a n c k (Sprichwörter, 1541, II, 119b ) führt an: Sero molunt deorum molae, Gottes Mühl stehet oft lang still und die Götter mahlen oder scheren einen langsam, aber wohl, ferner einige Zeilen weiter unten Der Götter Mühl machen langsam Mehl, aber wohl, und L o g a u (1654) III, 2, 24 macht daraus: Gottes Mühlen mahlen langsam, mahlen aber trefflich klein. (Ob aus Langmut er sich säumet, bringt mit Schärf er alles ein.) Daraus dürfte die bekannte Redensart: Langsam, aber sicher entstanden sein.— Plotin ( † 270 n. Chr.) bereichert unsere Sprache um zwei geflügelte Worte. Wir lesen bei ihm (Enn. I, 6 p. 57; Ausg. v. Kirchhoff I, S. 12): οὐ γὰρ πώποτε εἶδεν ὀφθαλμὸς ἥλιον, ἡλιοειδὴς μὴ γεγενημένος, οὐδὲ τὸ καλὸν ἂν ἴδοι ψυχὴ μὴ καλὴ γενομένη, Nie hätte das Auge je die Sonne gesehen, wäre es nicht selbst sonnenhafter Natur; und wenn die Seele nicht schön ist, kann sie das Schöne nicht sehen. Hieraus stammt Schöne Seele und der G o e t h esche Vers (1823. Zahme Xenien. Bd. 3): Wär' nicht das Auge sonnenhaft, Die Sonne könnt' es nie erblicken. Mit diesem Gedanken lehnte P l o t i n sich an P l a t o an, der in seinem Staat p. 508 sagt: Das Gesicht ist nicht die Sonne . . . aber das sonnenähnlichste . . . unter allen Werkzeugen der Wahrnehmung, und der ebenda weiter unten Erkenntnis und Wahrheit, wie Licht und Gesicht, für sonnenartig erklärt.—
  • 65. Julianus Apostata (331-363 n. Chr.) meint (oratio VI ed. Ez. Spanhemius, 1696, p. 184), es dürfe nicht Wunder nehmen, dass wir zu der, gleich der Wahrheit, einen und einzigen Philosophie auf den verschiedensten Wegen gelangen. Denn auch wenn Einer nach Athen reisen wolle, so könne er dahin segeln oder gehen und zwar könne er als Wanderer die Heerstrassen benutzen oder die Fusssteige und Richtwege und als Schiffer könne er die Küsten entlang fahren oder wie Nestor das Meer durchschneiden. Damals galt noch Athen als Ziel der Gebildeten, später wurde es Rom. Es führen viele Wege nach Athen liegt im obigen Satz und mochte sich in das uns geläufige Wort verwandeln: Es führen viele Wege nach Rom, wofür jedoch sichere Belege noch zu suchen sind.— Proclus (412-485 n. Chr.) nennt in seinem Commentar zu Platos Timaeus (154c) den οὐρανός (Himmel) die πέμπτη οὐσία Quintessenz (Das fünfte Seiende) und auch in dem Leben des Aristoteles von A m m o n i u s (Westermann, vitarum scriptores Graeci minores, 1845, p. 401) wird die εʹ οὐσία erwähnt. Damit ist nach Aristoteles (De mundo, Kap. 2) der Äther gemeint, der dort ein anderes Element als die vier, ein göttliches, unvergängliches genannt wird. (Aristot. Meteor. 1, 3; de coelo, 1, 3; de gen. an., 2, 3.) Proclus ist die Quelle für das Wort. Viel später jedoch wurde der heut damit verknüpfte Begriff des feinsten Extrakts, der innersten Wesenheit oder des Kerns einer Sache in dies Wort hineingelegt. R a i m u n d u s L u l l u s gab 1541 sein Buch De secretis naturae sive Quinta essentia heraus, in dem er zu Anfang des zweiten Teiles diese Quintessenz als Allheilmittel preist, und 1570 erschien
  • 66. Leonhart T h u r n e y s s e r zum Thurns Quinta essentia, das ist die höchste Subtilitet, Krafft und Wirkung . . . . der Medicina und Alchemia . . . . In der Vorrede stellt er die Quinta Essentz Olea neben den Stein der Weisen, den lapis philosophorum. Im 13. Buch nennt er sich einen Schüler des Theophrastus P a r a c e l s u s, der also der Vater des Schwindels mit der Quintessenz sein wird, wie er so manchen anderen Schwindels Vater gewesen ist.—
  • 67. XI. Geflügelte Worte aus lateinischen Schriftstellern. [63] [63] Aus diesem Kapitel (15. Aufl.) ging A. O t t o's Werk hervor: Die Sprichwörter und sprichwörtlichen Redensarten der Römer (Lpzg., Teubner, 1890), eine vortreffliche Arbeit, der dieses Buch manchen wertvollen Aufschluss verdankte. Jeder ist seines Glückes Schmied ist nach der dem S a l l u s t zugeschriebenen Schrift de republica ordinanda 1, 1, wo es heisst: quod in carminibus Appius ait, fabrum esse suae quemque fortunae, auf A p p i u s Claudius (Consul 307 v. Chr.) zurückzuführen. P l a u t u s (Trin. 2, 2, 84: sapiens ipse fingit fortunam sibi) schreibt diese Fähigkeit nur dem Weisen zu; während ein von Cornelius N e p o s (Atticus 11, 6) mitgeteilter Jambus eines Unbekannten wiederum aussagt: Sui cuique mores fingunt fortunam (hominibus). Jedes Menschen Glück schmiedet ihm sein Charakter.— Als Citatenquelle ist Plautus (um 254-184 v. Chr.) zu erwähnen mit: Nomen atque omen, Name und zugleich Vorbedeutung, aus dem Persa, 4, 4, 74, und mit dem ebenda 4, 7, 19 vorkommenden, von Te r e n z im Phormio 3, 3, 8 angewendeten Sapienti sat (est)!
  • 68. Für den Verständigen genug! (d. h. für ihn bedarf es keiner weiteren Erklärung).— Oleum et operam perdidi Öl und Mühe habe ich verschwendet kommt in des P l a u t u s Poenulus 1, 2, 119 vor und wird dort von einer Dirne gebraucht, die sich vergebens hat putzen und salben lassen. C i c e r o überträgt es auf Gladiatoren (Ad familiares 7, 1); dann wird damit auf das verschwendete Öl der Studierlampe angespielt (Cicero Ad Atticum 13, 38; Iuvenal 7, 99).— Allgemein bekannt ist auch des P l a u t u s Komödientitel Miles gloriosus Der ruhmredige Kriegsmann. Das Original dieses Stückes war von einem uns unbekannten griechischen Dichter und hiess Ἀλαζών (der Marktschreier, Aufschneider, Gloriosus), wie P l a u t u s (2, 1, 8 u. 9) selbst bezeugt.— Summa summarum, Alles in allem, finden wir zuerst bei P l a u t u s (Truculentus 1, 1, 4).— Im Trinummus (5, 2, 30) des P l a u t u s heisst es: Tunica propior pallio. Das Hemd ist mir näher als der Rock.— Bei P l a u t u s (Stichus 5, 4, 52 Casina 2, 3, 32) kommt Ohe iam satis! Oh, schon genug!
  • 69. vor, das sich auch bei Horaz (Sat. 1, 5, 12) und M a r t i a l (4, 91, 6 u. 9) findet.— Ennius (239-169 v. Chr.) wird in C i c e r o s Laelius 17, 64 citiert mit: Amicus certus in re incerta cernitur, Den sicheren Freund erkennt man in unsicherer Sache.— Schon E u r i p i d e s (Hec. 1226) sagt ähnlich: Ἐν τοῖς κακοῖς γὰρ οἱ ἀγαθοὶ σαφέστατοι Φίλοι. Denn in der Not sind gute Freund' am sichersten.— In 1, 1, 99 der Andria des Terenz (185-155 v. Chr.) erzählt Simo, wie er sich erst über des Sohnes Pamphilus Thränen beim Begräbnis einer Nachbarin gefreut, dann aber der Verstorbenen hübsche Schwester unter den Leidtragenden bemerkt habe . . . . Das fiel mir gleich auf. Haha! Das ist's! Hinc illae lacrumae! Daher jene Thränen! Dies Wort wird bereits von C i c e r o (pro Caelio, c. 25) und von H o r a z (Epistel 1, 19, 41) citiert.— Aus 1, 2, 23 der Andria des Te r e n z ist die Antwort des Davus: Davus sum, non Oedipus, Davus bin ich, nicht Ödipus, d. h. ich verstehe dich nicht, denn ich kann nicht so geschickt Rätsel lösen wie Ödipus.— Aus der Andria 1, 3, 13: Inceptio est amentium, haud amantium,
  • 70. Ein Beginnen von Verdrehten ist's, nicht von Verliebten, ist in den Gebrauch übergegangen: Amantes, amentes, Verliebt, verdreht, was wohl zuerst in dem Titel des 1604 in 3. Auflage erschienenen Lustspiels Amantes amentes von G a b r i e l R o l l e n h a g e n vorkommt. Amens amansque (verdreht und verliebt) findet sich übrigens schon bei P l a u t u s Merc. Prolog. 81.— Aus der Andria 2, 1, 10 und 14 ist: Tu si hic sis, aliter sentias, Wärst du an meiner Stelle, du würdest anders denken; Interim fit (eigentlich: fiet) aliquid; Unterdessen wird sich schon irgend etwas ereignen; (in des Plautus Mercator 2, 4, 24 heisst es: aliquid fiet).— Aus 3, 3, 23 sind die Worte: Amantium irae amoris integratio (est) Der Liebenden Streit die Liebe erneut, eine Verschönerung des Menandrischen ὀργὴ φιλούντων μικρὸν ἰσχύει χρόνον, Nicht lange währt der Zorn der Liebenden (s. Stobäus Serm. LXI, p. 386.11); aus 4, 1, 12: proximus sum egomet mihi, Jeder ist sich selbst der Nächste.— Aus dem Eunuch (Prolog 41) des Te r e n z stammt: Nullum est iam dictum, quod non sit dictum prius, Es giebt kein Wort mehr, das nicht schon früher gesagt ist; (s. Goethe: Wer kann was Dummes . . .)—
  • 71. Aus 4, 5, 6 kommt uns das damals schon sprichwörtliche Sine Cerere et Libero friget Venus Ohne Ceres und Bacchus bleibt Venus kalt. Bereits E u r i p i d e s sagte (Bacchae, 773): οἴνου δὲ μηκέτ' ὄντος, οὐκ ἔστιν Κύπρις. Wo's keinen Wein mehr giebt, giebt's keine Liebe.— In des Te r e n z Heautontimorumenos (s. auch unter: Menander) 1, 1, 25 heisst es: Homo sum; humani nihil a me alienum puto, Mensch bin ich; nichts, was menschlich, acht' ich mir als fremd. Es liegt hier wohl zweifellos die Übersetzung eines, schon im Menanderschen Original befindlich gewesenen Wortes vor.— Aus des Te r e n z Adelphi 4, 1, 21 citieren wir den erschreckten Ruf des Syrus, als er Ctesiphos Vater plötzlich erblickt, über den er gerade mit jenem spricht: Lupus in fabula! (C i c e r o ad. Attic. 13, 33 wendet das Wort an, das schon bei P l a u t u s Stich. 4, 1, 71 in der Form ecce tibi lupum in sermone vorkommt.) Zu übersetzen wäre: Wenn man vom Wolf spricht, ist er nicht weit; doch wollen andere Ausleger den Volksglauben der Alten hineinziehen, dass man beim Anblick eines Wolfes verstummen müsse (s. Voss z. Vergils Ecl. 9, 54 u. Meineke zu Theokrits Id. 14, 22), da ja auch die plötzliche Ankunft dessen, von dem wir reden, uns verstummen mache.— Adelphi 4, 7, 21-23 heisst es: Ita vita est hominum, quasi, cum ludas tesseris; Si illud, quod maxume opus est iactu, non cadit,
  • 72. Illud quod cecidit forte, id arte ut corrigas. So gleicht des Menschen Leben einem Würfelspiel: Wenn just der Wurf, den man am meisten braucht, nicht fällt, So korrigiert man, was der Zufall gab, durch Kunst. Aus dieser Stelle stammt corriger la fortune das Glück verbessern, d. h. falsch spielen, was sich in H a m i l t o n s 1713 erschienenen Mém. d. Grammont K. 2, in P r é v o s t s Manon Lescaut (1743) 27, 1 und auch in L e s s i n g s Minna von Barnhelm (1767) 4, 2 findet. M o l i è r e (1663 L'École des Femmes 4, 8) hat corriger le hazard beim Würfelspiel, aber durch bonne conduite. In R e g n a r d s Le Joueur (1696) 1, 10 weiss Toutabas, wenn's sein muss, par un peu d'artifice d'un sort injurieux corriger la malice; und in G. F a r q u h a r s Sir Harry W i l d a i r (1701) Akt 3 z. A. sagt Monsieur Marquis in seinem Kauderwelsch: Fortune give de Anglis Man de Riches, but Nature give de France Man de Politique to correct unequal Distribution.— Duo cum faciunt idem, non est idem, Wenn zwei dasselbe thun, so ist es nicht dasselbe, ist eine Verkürzung der Stelle Adelphi 5, 3, 37: Duo cum idem faciunt, . . ., Hoc licet impune facere huic, illi non licet. Wenn zwei dasselbe thun, . . . so darf der Eine es ungestraft thun, der Andere nicht.— Aus des Te r e n z Phormio 1, 2, 18 stammt: Montes auri pollicens; Berge Goldes (goldene Berge) versprechen(d). Wenn G e o r g E b e r s (Ägypten in Bild und Wort S. 17) den Komödiendichter M e n a n d e r aus Athen an seine Geliebte schreiben lässt: Ich habe von Ptolomäus . . . Briefe . . ., in denen er mir mit königlicher Freigebigkeit g o l d e n e
  • 73. B e r g e verspricht, so ist dies nur eine freie Übersetzung von τῆς γῆς ἀγαθά, die Güter der Erde. In des P l a u t u s Miles gloriosus 4, 2, 73 kommen aber schon argenti montes, Berge von Silber, vor und im Stichus 1, 1, 24-5 heisst es: Neque ille sibi mereat Persarum montes, qui esse aurei perhibentur, Und er möchte sich die Perserberge nicht erwerben, die von Gold sein sollen. Auch V a r r o (bei Nonius p. 379) singt von diesen Perserbergen: Non demunt animis curas ac religiones Persarum montes, non atria divitis Crassi; Weder die Berge der Perser, noch Hallen des prunkenden Crassus Können die Herzen befreien von Angst und von nagenden Skrupeln; während der Perserkönig im A r i s t o p h a n e s (Acharn. 81) nach achtmonatlichem Sitzen auf goldenen Bergen (ἐπὶ χρυσῶν ὀρῶν) eine Befreiung anderer Art fand. Es scheint, als deute unser Gudrunepos (vor 1200) mit seinem (V. 493) und waere ein berc golt, den naeme ich niht dar umbe auf eine gemeinsame indogermanische Quelle.— Aus des Te r e n z Phormio 2, 2, 4 ist: Tute hoc intristi; tibi omne est exedendum, Du hast es eingerührt; Du musst es auch ganz ausessen; aus 2, 4, 14: Quot homines, tot sententiae, So viel Leute, so viel Ansichten, was schon C i c e r o (De fin. 1, 5, 15) anführt, (vrgl. unten: Horaz Sat. 2, 1, 27.)— Oderint, dum metuant, Mögen sie hassen, wenn sie nur fürchten, aus der Tragödie Atreus des Accius (170-104 v. Chr.), citierten bereits C i c e r o (1. Philipp. 14, 34, pr. Sest. 48, de offic. 1, 28) und S e n e c a (Üb. d. Zorn 1, 20, 4; Üb. d. Gnade 1, 12, 4 u. 2, 2, 2). Nach S u e t o n (Calig. 30) war es ein Lieblingswort des Kaisers Caligula.—
  • 74. Bei Lucilius († 103 v. Chr.) steht (ed. Lachmann, Berl. 1877, v. 2, ebenso bei P e r s i u s 1, 2): Quis leget haec? Wer wird das (Zeug) lesen?— Auch stammt nach M a c r o b i u s (Saturnalien, 6, 1, 35) non omnia possumus omnes wir können nicht Alle Alles von L u c i l i u s her und wurde von F u r i u s A n t i a s citiert. V e r g i l verwendete es Ecloge 8, 63. H o m e r mag des Gedankens Vater sein, denn, dass e i n e m Menschen nicht alle Gaben verliehen seien, spricht er öfters aus (s. Iliade 4, 320; 13, 729 u. Odyssee 8, 167).— Varro (116-27 v. Chr.) De lingua latina VII, 32 (n. Otfr. Müllers Ausg.) sagt: Sed canes, quod latratu signum dant, ut signa canunt, canes appellatae. Dies ist spöttisch umgestaltet worden zu: canis a non canendo Hund wird canis genannt, weil er nicht singt (non canit) (s. Quintilians lucus a non lucendo).— Auch citieren wir das von G e l l i u s (1, 22, 4 u. 13, 11, 1) als Titel einer V a r r onischen Schrift angeführte: Nescis, quid vesper serus vehat. Du weisst nicht, was der späte Abend bringt.— Cicero (106-43 v. Chr.) nennt pro Roscio Amerino, 29 die Mordgesellen, die zu Sullas Zeiten Gutsbesitzer ermordeten und dann deren Güter betrügerisch an sich zu bringen und vorteilhaft zu verschachern wussten:
  • 75. sectores collorum et bonorum, Halsabschneider und Güterschlächter.— Im Anfange der 1. Rede in Catilinam finden wir das auch bei Livius 6, 18 und bei Sallust Catilina 20, 9 vorkommende, ungeduldige Quousque tandem . . .? Wie lange noch . . .?— In Ciceros Catilina 1, 1 (vrgl. Martial IX, 71); IV, 25, 56, sowie pro rege Deiotaro 11, 31 und de domo sua 53, 137 steht: O tempora! O mores! O Zeiten! O Sitten! Im Hofmeister (1774) von R. Lenz citiert es (5, 10) der Schulmeister Wenzeslaus, und als Refrain von Geibels Lied vom Krokodil (1840) fand es die weiteste Verbreitung.— In C i c e r o s Catilina 2, 1 findet sich: Abiit, excessit, evasit, erupit. Er ging, er machte sich fort, er entschlüpfte, er entrann.— Videant consules ne quid res publica detrimenti capiat, Die Konsuln mögen dafür sorgen, dass die Republik keinen Schaden leidet bildete, seit man vom 6. Jahrh. an die Diktatur nicht mehr in Rom anwenden wollte, das sogenannte senatus-consultum ultimum, welches die Konsulargewalt zu einer diktatorischen machte (s. C i c e r o pr. Mil. 26, 70, in Catil. I, 2, 4, Phil. 5, 12, 34, Fam. 16, 11, 3; C ä s a r de bell. civ. 1, 5, 3; 1, 7, 4; Liv. 3, 4, S a l l u s t Catil. 29, P l u t a r c h C. Gracch. 14 u. Cic. 15.)— Aus C i c e r o s de fin. 5, 25, 74 stammt: Consuetudo (quasi) altera natura,
  • 76. Die Gewohnheit ist (gleichsam) eine zweite Natur; G a l e n u s (De tuenda valetudine, cap. 1) bietet die heute übliche Form: Consuetudo est altera natura. Schon in des A r i s t o t e l e s Rhetorik, 1370a 6 (Bekker) heisst es: die Gewohnheit ist der Natur gewissermassen ähnlich (τὸ εἰθισμένον ὥσπερ πεφυκὸς ἤδη γίγνεται).— In C i c e r o s Tuscul. 1, 17, 39 heisst es: Errare . . malo cum Platone, . . quam cum istis vera sentire, Lieber will ich mit Plato irren, als mit denen (den Pythagoreern) das Wahre denken.— Di minorum gentium (wörtlich: Götter aus den geringeren Geschlechtern) nennen wir die untergeordnete Schicht einer Klasse Menschen mit Beziehung auf das maiorum gentium di (d. h. die oberen zwölf Götter bei C i c e r o Tusc. 1, 13, 29), Bezeichnungen, die daraus entsprangen, dass Tarquinius ausser den von Romulus berufenen patres maiorum gentium (Senatoren aus den hervorragenden Geschlechtern) auch patres minorum gentium (Senatoren geringerer Herkunft) berief (vrgl. Cicero d. rep. 2, 20; Liv. 1, 35, 6 und dazu das Patrici minorum gentium bei Cic. Fam. 9, 21 und Liv. 1, 47, 7).— Aus C i c e r o s I. Philippica, 5, 11 und zugleich aus De finibus 4, 9, 22, (vrgl. Livius 23, 16 im Anfang, wo es in nicht übertragener Bedeutung steht) stammt die für eine den Staat bedrohende Gefahr gebräuchlich gewordene Wendung: Hannibal ad (nicht: ante) portas. Hannibal (ist) an den Thoren. Diese Redensart, wie die Erinnerung an Catilina und an das aus L i v i u s (XXI, 7: dum ea Romani parant consultantque, iam Saguntum summa vi oppugnabatur) geschöpfte Wort:
  • 77. Dum Roma deliberat, Saguntum perit, Während Rom beratschlagt, geht Sagunt zu Grunde, (auch in der Form: Roma deliberante Saguntum perit citiert) wurden von G o u p i l d e P r é f e l n in einer Sitzung der konstituierenden Versammlung von 1789 zu dem unrichtigen Citate vermischt: Catilina est aux portes, et l'on délibère. Er stichelte damit auf M i r a b e a u, der diesem Worte dadurch erst recht Bahn verschaffte, dass er es in seiner berühmten Rede zur Abwendung des Bankerotts wiederholte und variirte.— In C i c e r o s II. Philippica 14, 35, pro Milone 12, 32 und pro Roscio Amerino 30, 84 und 31, 86 wird das uns geläufige cui bono? (Wozu?) (A quoi bon?) eigentlich: Wem zum Nutzen? ausdrücklich als ein Wort des L. Cassius bezeichnet. Aus der zuletzt angeführten Stelle ersehen wir, dass L. Cassius, ein Mann von äusserster Strenge, bei den Untersuchungen über Mord den Richtern einschärfte, nachzuforschen, cui bono, wem zum Nutzen das Ableben des Ermordeten war.— Cicero spricht in seiner Rede pro Roscio Amer. 16, 47: Homines notos sumere odiosum est, cum et illud incertum sit, velintne hi sese nominari (angesehene Leute nennen, ist eine heikle Sache, da es auch zweifelhaft ist, ob sie selbst genannt werden wollen). Daher sagen wir, wenn es gescheidter ist, keine Namen zu nennen: Nomina sunt odiosa,
  • 78. Namen sind verpönt.— Aus C i c e r o s Rede pro Milone 4, 10 ist bekannt: Silent leges inter arma. Im Waffenlärm schweigen die Gesetze. L u c a n u s ahmt diese Worte (Pharsalia I, 277) also nach: Leges bello siluere coactae.— Die altrömische Formel des Richters, der nicht entscheiden kann, ob Schuld oder Unschuld vorliegt, das Non liquet citieren wir aus Cicero pro Cluentio 28, 76 (vrgl. Gellius 14, 2. g. E. und das liquet bei Cicero Caecin. 10; Quintilian Instit. 3, 6, 12): Deinde homines sapientes, et ex vetere illa disciplina iudiciorum, qui neque absolvere hominem nocentissimum possent, neque eum, de quo esset orta suspicio, pecunia oppugnatum, re illa incognita, primo condemnare vellent, n o n l i q u e r e dixerunt. Darauf gaben einsichtige Männer von der alten Schule der Geschwornengerichte, die weder solchen Verbrecher freisprechen konnten, noch ihn, gegen Den, wie man munkelte, mit Bestechung der Richter vorgegangen war, v o r Untersuchung dieser Sache im ersten Termin verurteilen wollten, folgenden Spruch ab: e s i s t n i c h t a u f g e k l ä r t.— Weil C i c e r o seine Reden gegen Antonius im Vergleich mit den gewaltigen Reden des D e m o s t h e n e s gegen Philipp von Macedonien Philippische nannte, so nennt man noch heute jede Donnerrede eine Philippika.— Der Titel der C i c e r onischen Rede de domo sua ist in der älteren Lesart pro domo für das eigene Haus
  • 79. zum allgemeinen Ausdruck für jede Thätigkeit geworden, die auf Erhaltung der eigenen Habe abzielt, und wir nennen danach eine der Selbstverteidigung oder dem eigenen Vorteil dienende Rede eine oratio pro domo.— Aus C i c e r o s (De harusp. respons. 20, 43) Redewendung: resistentem, longius, quam voluit, popularis aura provexit, Die Volksgunst trieb den Widerstrebenden weiter, als er wollte, stammt das später von Vergil, Horaz, Livius und Quintilian ähnlich angewandte Wort: aura popularis, Hauch der Volksgunst.— Suum cuique (Jedem das Seine) finden wir bei C i c e r o de offic. 1, 5; de natur. deor. 3, 15, 38; de leg. 1, 6, 19; (vrgl. Ta c i t u s: Annalen, 4, 35, P l i n i u s: Natur. hist. 14, 6, 8 und den ähnlichen Gedanken bei T h e o g n i s 332 u. 546). De finibus 5, 23, 67 sagt C i c e r o: Iustitia in suo cuique tribuendo cernitur, Die Gerechtigkeit erkennt man daran, dass sie Jedem das Seine zuerteilt; und suum cuique tribuere ist eine Rechtsregel U l p i a n s (Corp. iur. civ. Digest. I, 1 de iustitia et iure § 10); daher es in S h a k e s p e a r e s Andronicus 1, 2 heisst: Suum cuique spricht des Römers Recht. Friedrich I. von Preussen wählte das Suum cuique zur Inschrift vieler Medaillen und Münzen und zum Motto des am 17. Januar 1701 gestifteten Ordens vom schwarzen Adler, und seitdem blieb es Preussens Wahlspruch.— Das von C i c e r o de offic. 1, 10, 33 als abgedroschenes Sprichwort citierte Summum ius, summa iniuria Das höchste Recht (ist) das höchste Unrecht scheint eine spätere Fassung des Sprichwortes in des Te r e n z Heautontimorumenos 4, 5 zu sein:
  • 80. Dicunt: ius summum saepe summa est malitia. Man pflegt zu sagen: Das höchste Recht ist oft die höchste Bosheit. L u t h e r 21, 254 schreibt: Wie der Heide Terentius sagt: 'Das strengest Recht ist das allergrossest Unrecht'. (23, 295 führt Luther das Wort auf S c i p i o zurück.)— Aus C i c e r o s de offic. 1, 16, 52, wo es sich um allgemeine Gefälligkeiten gegen Jedermann handelt, wie z. B. dass wir es Jedem gestatten müssen, sich an unserem Feuer das seinige anzuzünden, citieren rauchende Gelehrte, um Feuer bittend: Ab igne ignem. Vom Feuer Feuer.— De offic. 1, 22, 77 enthält den von C i c e r o selbst verfertigten Vers: Cedant arma togae, concedat laurea laudi, Es mögen die Waffen der Toga, d. h. dem Friedensgewande, nachstehen, der Lorbeer der löblichen That, worüber er sich in der Rede in Pisonem 29 und 30 eines Weiteren auslässt, während er nur cedant arma togae in der 2. Philippica 8 schreibt.— Aus de offic. 1, 31, 110 kennen wir das schon hier von C i c e r o als Sprichwort citierte, in ad familiares 3, 1 und 12, 25 wieder vorkommende und von H o r a z in der Kunst zu dichten, 385, angewendete Invita Minerva; Wider den Willen der Minerva; aus de offic. 3, 1, 3: ex malis eligere minima; von zwei Übeln das kleinere wählen;
  • 81. minima de malis war nach 3, 29, 105 sprichwörtlich.— Aus C i c e r o s de offic. 3, 33, 117 (sed aqua haeret, ut aiunt) und aus ad Quintum fratrem 2, 8 (in hac causa mihi aqua haeret) stammt: Hic haeret aqua, Hier stockt es.— Aus C i c e r o de legibus 3, 3, 8 citieren viele: (his) salus populi suprema lex (esto), Für diese (nämlich für die Regierenden) sei das Wohl des Volkes das vornehmste Gebot.— In de finibus 2, 32, 105 führt C i c e r o als Sprichwort an: Iucundi acti labores; Angenehm (sind) die gethanen Arbeiten; und er fügt hinzu, auch E u r i p i d e s sage nicht übel: Suavis laborum est praeteritorum memoria, was in dessen Andromeda (nach Stobaeus: Florib. 29, 57) also lautete: Ἀλλ' ἡδύ τοι σωθέντα μεμνῆσθαι πόνων.— Aus C i c e r o s de natur. deor. 3, 40 citieren wir: Pro aris et focis (certamen); (Kampf) um Altar und häuslichen Herd.— In pro Milone 29, 79 sagt C i c e r o: Liberae sunt nostrae cogitationes (Frei sind unsere Gedanken), und L. 18 der Digesten 48, 19 heisst es aus U l p i a n s lib. III ad Edictum: Cogitationis poenam nemo patitur (Für seinen Gedanken wird niemand bestraft). Das ist umgewandelt worden zu dem sprichwörtlichen: Gedanken sind zollfrei,
  • 82. was sich wohl zuerst bei L u t h e r (Von weltlicher Oberkeit, wie man ihr Gehorsam schuldig sei. 1523) findet.— Aus C i c e r o s pro Sestio cap. 45 stammt: Otium cum dignitate, Musse mit Würde, oder, wie dort steht: cum dignitate otium. Der Sinn ist: behagliche Ruhe, verbunden mit einer angesehenen Stellung. Auch im Anfange der Schrift de oratore ist es zu finden und in Ciceros Briefen ad. famil. 1, 9, 21 wird es als ein häufig von ihm angewendetes Wort erwähnt.— In diesen Briefen C i c e r o s ad famil. 5, 12 steht: Epistola non erubescit, Ein Brief errötet nicht, häufig umgestellt in: Literae non erubescunt, auch in: Charta non erubescit.— Imperium et libertas[64] Herrschaft und Freiheit stammt aus C i c e r o s 4. Rede gegen Catilina, IX, 19, wo er dem Senat zuruft: Bedenket, wie in einer Nacht die so mühsam befestigte Herrschaft (quantis laboribus fundatum i m p e r i u m) und die so trefflich begründete Freiheit (quanta virtute stabilitam l i b e r t a t e m) fast zu Grunde ging! Die Rede schliesst mit der Forderung, dass der Senat über die Herrschaft und die Freiheit Italiens (de i m p e r i o, de l i b e r t a t e Italiae) die Entscheidung treffen möge.—
  • 83. [64] L o r d B e a c o n s f i e l d (Disraeli) sagte in einer Rede beim Lord-Mayors- Mahl am 10. Nov. 1879: Einer der grössten Römer wurde nach seiner Politik gefragt. Er antwortete: imperium et libertas. Die Nationalzeitung vom 28. Nov. 1879 (Morgen-Ausg.) teilte mit, dass auf ihre Anfrage bei dem Lord die Antwort erfolgt sei, die Quelle der citierten Worte fände sich im 1. Buche von B a c o n s Advancement of Learning. (Ausg. Spedding, Ellis und Heath, vol. III, p. 303.) Bacon übersetzt daselbst das in des Ta c i t u s Agricola 3 vorkommende principatum ac libertatem, wofür er imperium et libertatem schreibt, mit: government and liberty. Dass ein nach seiner Politik gefragter grosser Römer diese Aussage gethan habe, ist also ein Irrtum. Ut sementem feceris, ita metes Wie du gesäet, so wirst du ernten, dies Wort des M. Pinarius Rufus steht bei C i c e r o de oratore, 2, 65, 261. Ihm mochte des A r i s t o t e l e s Satz (Rhetor. 3, 3) vorschweben: σὺ δὲ ταῦτα αἰσχρῶς μὲν ἔσπειρας, κακῶς δὲ ἐθέρισας, was du hier böse gesäet, das hast du schlimm geerntet. (vrgl. in der Vulgata Hiob 4, 8: et seminant dolores et metunt eos, nach Luther: Die da Mühe pflügten und Unglück säeten, ernteten sie auch ein. Galater 6, 8: Quae enim seminaverit homo, haec et metet, nach Luther Gal. 6, 7: Denn was der Mensch säet, das wird er ernten, dann Sprüche Sal. 22, 8; 2. Cor. 9, 6 und Gefl. Worte a. d. Bibel Hosea 8, 7.)— Aus einigen Hexametern des Julius Cäsar (100-44 v. Chr.) über Terenz, die in dessen Biographie von S u e t o n (p. 294, 35, ed. Roth) enthalten sind, hat man vermittelst eines falsch gesetzten Kommas die Bezeichnung vis comica Kraft der Komik herausgelesen. Die betreffenden Verse heissen: Lenibus atque utinam scriptis adiuncta foret vis, Comica ut aequato virtus polleret honore Cum Graecis;
Would that force were joined to your pleasing writing, so that your comic power might hold the same rank as the Greeks possess! The verses thus speak of a virtus comica, not of a vis comica. (Kleine Schriften in lateinischer und deutscher Sprache by Fr. Aug. Wolf, ed. G. Bernhardy, II, p. 728.)— From Lucretius (98-55 B.C.), On the Nature of Things, comes 1, 102: Tantum religio potuit suadere malorum, To such ruinous deeds could religion persuade.— From 1, 149; 1, 205; 2, 287 is quoted: De nihilo nihil, Nothing comes from nothing, which Persius (Satires 3, 84) repeats. Lucretius had borrowed this view from Epicurus, who (according to Diog. Laërtius 10, n. 24, 38) placed at the head of his physics the principle: οὐδὲν γίνεται ἐκ τοῦ μὴ ὄντος, Nothing comes from the non-existent. Before Epicurus, Melissus had already said that nothing can come from the non-existent (Überweg, Geschichte der Philosophie des Altertums, 1, p. 63), just as Empedocles combats the view that something which previously was not could come into being (ibid. 1, p. 66). Aristotle (Physics 1, 4) says that Anaxagoras held true the common view of the philosophers that nothing arises from the non-existent (οὐ γινομένου οὐδενὸς ἐκ τοῦ μὴ ὄντος). In Marcus Aurelius' (121-180 A.D.) Meditations 4, 4 it is said: for from nothing comes nothing, just as little as anything passes into nothing.— Famous from 2, 1-2 is:
Suave, mari magno, turbantibus aequora ventis,
E terra magnum alterius spectare laborem.
Pleasant it is, when on the great sea the winds churn up the waters, to behold from land another's great toil.— From Sallust's (86-35 B.C.) Jugurtha 10 comes: concordia parvae res crescunt, discordia maximae dilabuntur, Through concord small things grow; through discord the greatest fall to ruin.— From the 187th maxim of Publilius Syrus (fl. around 50 B.C.): Heredis fletus sub persona risus est, The weeping of an heir is laughter behind a mask, or from the so-called Varronian sentences (12): sic flet heres, ut puella nupta viro; utriusque fletus non apparens risus, An heir weeps like a bride; the weeping of both is concealed laughter (cf. also Horace, Sat. 2, 5, 100-104), the phrase "laughing heirs" (lachende Erben) seems to have arisen. As early as 1622 a "Lacherbengeld" (laughing-heir tax) occurs in Baden (cf. Rau, Grundsätze der Finanzwissenschaft, 5th ed. 1864, § 237, p. 371, note a), and Friedrich von Logau writes (Salomons von Golau Deutscher Sinn-Getichte Drey Tausend, Breslau, published by Caspar Klossmann, 1654, though issued without indication of year; "Zweite Zugabe zum 3. Tausend unter wehrendem Druck eingetroffen", nos. 78 and 79):
Laughing Heirs.
When heirs of wealthy people let their eyes grow teary,
such people's tears are only tears that come from laughing.
* * *
The Romans hired women who would weep for money;
is it not with many an heir the very same?
Then it says in Otho's Evangelischer Krankentrost (1664), p. 1034: Rejoice, dear little heart; mourn, little black hat, so it goes with laughing heirs.— The 245th sentence of Publilius Syrus: Inopi beneficium bis dat qui dat celeriter, He gives the needy a double benefit who gives quickly, is shortened to: Bis dat qui cito dat, He gives twice who gives promptly.— Vergil (70-19 B.C.) offers in Eclogues 1, 6 the words, sometimes used as a house inscription, of the comfortably reclining shepherd Tityrus: Deus nobis haec otia fecit, A god has made this leisure for us. Ecl. 2, 1: Formosum pastor Corydon ardebat Alexin, Corydon the shepherd burned for the fair form of Alexis, became known chiefly through the twisted translation:
"Der Pastor Corydon briet einen wunderschönen Hering" (Pastor Corydon fried a beautiful herring), which Christian Weise mentions in his preface, dated 27 September 1692, to Zincgref's Apophthegmata (Frankfurt and Leipzig, 1693). In Ecl. 2, 65 Corydon says of his love: Trahit sua quemque voluptas, Each is drawn on by his own delight. In Ecl. 3, 93 Damoetas warns the boys picking flowers and strawberries: Latet anguis in herba, A snake lurks in the grass (cf. Georgica 4, 457-459).— In Ecl. 3, 104 Damoetas challenges Menalcas to tell him in what region the sky is only three fathoms wide, and, he adds, if you can answer that, eris mihi magnus Apollo, then you shall be to me as great as Apollo. Accordingly, questions whose answer one does not expect are customarily accompanied by this saying.— Ecl. 3, 108 reads: Non nostrum tantas componere lites, It is not our office to settle such quarrels; Ecl. 3, 111: Claudite iam rivos, pueri; sat prata biberunt, Close the sluices now, boys; the meadows have drunk their fill.