Non-parametric Tests for Complete Data
Vilijandas Bagdonavičius
Julius Kruopis
Mikhail S. Nikulin
First published 2011 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as
permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced,
stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers,
or in the case of reprographic reproduction in accordance with the terms and licenses issued by the
CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the
undermentioned address:
ISTE Ltd John Wiley & Sons, Inc.
27-37 St George’s Road 111 River Street
London SW19 4EU Hoboken, NJ 07030
UK USA
www.iste.co.uk www.wiley.com
© ISTE Ltd 2011
The rights of Vilijandas Bagdonavičius, Julius Kruopis and Mikhail S. Nikulin to be identified as the
authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents
Act 1988.
Library of Congress Cataloging-in-Publication Data
Bagdonavicius, V. (Vilijandas)
Nonparametric tests for complete data / Vilijandas Bagdonavicius, Julius Kruopis, Mikhail Nikulin.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-84821-269-5 (hardback)
1. Nonparametric statistics. 2. Statistical hypothesis testing. I. Kruopis, Julius. II. Nikulin, Mikhail
(Mikhail S.) III. Title.
QA278.8.B34 2010
519.5--dc22
2010038271
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-269-5
Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne.
Table of Contents

Preface
Terms and Notation

Chapter 1. Introduction
1.1. Statistical hypotheses
1.2. Examples of hypotheses in non-parametric models
1.2.1. Hypotheses on the probability distribution of data elements
1.2.2. Independence hypotheses
1.2.3. Randomness hypothesis
1.2.4. Homogeneity hypotheses
1.2.5. Median value hypotheses
1.3. Statistical tests
1.4. P-value
1.5. Continuity correction
1.6. Asymptotic relative efficiency

Chapter 2. Chi-squared Tests
2.1. Introduction
2.2. Pearson's goodness-of-fit test: simple hypothesis
2.3. Pearson's goodness-of-fit test: composite hypothesis
2.4. Modified chi-squared test for composite hypotheses
2.4.1. General case
2.4.2. Goodness-of-fit for exponential distributions
2.4.3. Goodness-of-fit for location-scale and shape-scale families
2.5. Chi-squared test for independence
2.6. Chi-squared test for homogeneity
2.7. Bibliographic notes
2.8. Exercises
2.9. Answers

Chapter 3. Goodness-of-fit Tests Based on Empirical Processes
3.1. Test statistics based on the empirical process
3.2. Kolmogorov–Smirnov test
3.3. ω², Cramér–von Mises and Anderson–Darling tests
3.4. Modifications of Kolmogorov–Smirnov, Cramér–von Mises and Anderson–Darling tests: composite hypotheses
3.5. Two-sample tests
3.5.1. Two-sample Kolmogorov–Smirnov tests
3.5.2. Two-sample Cramér–von Mises test
3.6. Bibliographic notes
3.7. Exercises
3.8. Answers

Chapter 4. Rank Tests
4.1. Introduction
4.2. Ranks and their properties
4.3. Rank tests for independence
4.3.1. Spearman's independence test
4.3.2. Kendall's independence test
4.3.3. ARE of Kendall's independence test with respect to Pearson's independence test under normal distribution
4.3.4. Normal scores independence test
4.4. Randomness tests
4.4.1. Kendall's and Spearman's randomness tests
4.4.2. Bartels–von Neumann randomness test
4.5. Rank homogeneity tests for two independent samples
4.5.1. Wilcoxon (Mann–Whitney–Wilcoxon) rank sum test
4.5.2. Power of the Wilcoxon rank sum test against location alternatives
4.5.3. ARE of the Wilcoxon rank sum test with respect to the asymptotic Student's test
4.5.4. Van der Waerden's test
4.5.5. Rank homogeneity tests for two independent samples under a scale alternative
4.6. Hypothesis on median value: the Wilcoxon signed ranks test
4.6.1. Wilcoxon's signed ranks tests
4.6.2. ARE of the Wilcoxon signed ranks test with respect to Student's test
4.7. Wilcoxon's signed ranks test for homogeneity of two related samples
4.8. Test for homogeneity of several independent samples: Kruskal–Wallis test
4.9. Homogeneity hypotheses for k related samples: Friedman test
4.10. Independence test based on Kendall's concordance coefficient
4.11. Bibliographic notes
4.12. Exercises
4.13. Answers

Chapter 5. Other Non-parametric Tests
5.1. Sign test
5.1.1. Introduction: parametric sign test
5.1.2. Hypothesis on the nullity of the medians of the differences of random vector components
5.1.3. Hypothesis on the median value
5.2. Runs test
5.2.1. Runs test for randomness of a sequence of two opposite events
5.2.2. Runs test for randomness of a sample
5.2.3. Wald–Wolfowitz test for homogeneity of two independent samples
5.3. McNemar's test
5.4. Cochran test
5.5. Special goodness-of-fit tests
5.5.1. Normal distribution
5.5.2. Exponential distribution
5.5.3. Weibull distribution
5.5.4. Poisson distribution
5.6. Bibliographic notes
5.7. Exercises
5.8. Answers

Appendices
Appendix A. Parametric Maximum Likelihood Estimators: Complete Samples
Appendix B. Notions from the Theory of Stochastic Processes
B.1. Stochastic process
B.2. Examples of stochastic processes
B.2.1. Empirical process
B.2.2. Gauss process
B.2.3. Wiener process (Brownian motion)
B.2.4. Brownian bridge
B.3. Weak convergence of stochastic processes
B.4. Weak invariance of empirical processes
B.5. Properties of Brownian motion and Brownian bridge

Bibliography
Index
Preface
Testing hypotheses in non-parametric models is discussed
in this book. A statistical model is non-parametric if
it cannot be written in terms of a finite-dimensional
parameter. The main hypotheses tested in such models are
hypotheses on the probability distribution of data elements,
and homogeneity, randomness and independence
hypotheses. Tests for such hypotheses from complete
samples are considered in many books on non-parametric
statistics, including recent monographs by Maritz [MAR 95],
Hollander and Wolfe [HOL 99], Sprent and Smeeton [SPR 01],
Govindarajulu [GOV 07], Gibbons and Chakraborti [GIB 09]
and Corder and Foreman [COR 09].
This book contains tests from complete samples. Tests for
censored samples can be found in our book Tests for Censored
Samples [BAG 1 ].
In Chapter 1, the basic ideas of hypothesis testing and
general hypotheses on non-parametric models are briefly
described.
In the initial phase of the solution of any statistical problem
the analyst must choose a model for data analysis. The
correctness of the data analysis strongly depends on the choice
of an appropriate model. Goodness-of-fit tests are used to
check the adequacy of a model for real data.
Among the most widely applied goodness-of-fit tests are chi-squared
type tests, which use grouped data. In many
books on statistical data analysis, chi-squared tests are
applied incorrectly. Classical chi-squared tests are based on
theoretical results which are obtained assuming that the
ends of grouping intervals do not depend on the sample,
and the parameters are estimated using grouped data. In
real applications, these assumptions are often forgotten.
The modified chi-squared tests considered in Chapter 2 do
not suffer from such drawbacks. They are based on the
assumption that the ends of grouping intervals depend on the
data, and the parameters are estimated using initially non-
grouped data.
Another class of goodness-of-fit tests based on functionals
of the difference of empirical and theoretical cumulative
distribution functions is described in Chapter 3. For composite
hypotheses, the classical test statistics are modified
by replacing unknown parameters by their estimators.
Application of these tests is often incorrect because the critical
values of the classical tests are used even though the composite
hypothesis is tested with the modified statistics.
In section 5.5, special goodness-of-fit tests which are not
from the two above-mentioned classes, and which are specially
designed for specified probability distributions, are given.
Tests for the equality of probability distributions
(homogeneity tests) of two or more independent or dependent
random variables are considered in several chapters. Chi-squared
type tests are given in section 2.6 and tests based
on functionals of the difference of empirical distribution
functions are given in section 3.5. For many alternatives, the
most efficient tests are the rank tests for homogeneity given
in sections 4.5 and 4.7–4.9.
Classical tests for the independence of random variables
are given in sections 2.5 (tests of chi-squared type), and 4.3 and
4.10 (rank tests).
Tests for data randomness are given in sections 4.4 and 5.2.
All tests are described in the following way: 1) a hypothesis
is formulated; 2) the idea of test construction is given; 3) a
statistic on which a test is based is given; 4) a finite sample
and (or) asymptotic distribution of the test statistic is found;
5) a test, and often its modifications (continuity correction,
data with ex aequo (ties), various approximations of the asymptotic law)
are given; 6) practical examples of application of the tests are
given; and 7) at the end of the chapters problems with answers
are given.
Anyone who uses non-parametric methods of mathematical
statistics, or wants to know the ideas behind and
mathematical substantiation of the tests, can use this
book. It can be used as a textbook for a one-semester course
on non-parametric hypotheses testing.
Knowledge of probability and parametric statistics is
needed to follow the mathematical developments. The basic
facts on probability and parametric statistics used in the
book are also given in the appendices.
The book consists of five chapters and appendices. In each
chapter, theorems, formulas and comments are numbered
using the chapter number.
The book was written using lecture notes for graduate
students in Vilnius and Bordeaux universities.
We thank our colleagues and students at Vilnius and
Bordeaux universities for comments on the content of this
book, especially Rūta Levulienė for writing the computer
programs needed for application of the tests and solutions of
all the exercises.
Vilijandas BAGDONAVIČIUS
Julius KRUOPIS
Mikhail NIKULIN
Terms and Notation
||A|| – the norm (Σ_i Σ_j a_{ij}^2)^{1/2} of a matrix A = [a_{ij}];
A > B (A ≥ B) – the matrix A − B is positive (non-
negative) definite;
a ∨ b (a ∧ b) – the maximum (the minimum) of the numbers
a and b;
ARE – the asymptotic relative efficiency;
B(n, p) – binomial distribution with parameters n and p;
B−(n, p) – negative binomial distribution with parameters
n and p;
Be(γ, η) – beta distribution with parameters γ and η;
cdf – the cumulative distribution function;
CLT – the central limit theorem;
Cov(X, Y ) – the covariance of random variables X and Y ;
Cov(X, Y ) – the covariance matrix of random vectors X
and Y ;
EX – the mean of a random variable X;
E(X) – the mean of a random vector X;
Eθ(X), E(X|θ), Varθ(X), Var(X|θ) – the mean or the
variance of a random variable X depending on the parameter
θ;
E(λ) – exponential distribution with parameter λ;
F(m, n) – Fisher distribution with m and n degrees of
freedom;
F(m, n; δ) – non-central Fisher distribution with m and n
degrees of freedom and non-centrality parameter δ;
Fα(m, n) – α critical value of Fisher distribution with m
and n degrees of freedom;
FT (x) (fT (x)) – the cdf (the pdf) of the random variable T;
f(x; θ), f(x|θ) – the pdf depending on a parameter θ;
F(x; θ), F(x|θ) – the cdf depending on a parameter θ;
G(λ, η) – gamma distribution with parameters λ and η;
iid – independent identically distributed;
LN(µ, σ) – lognormal distribution with parameters µ and
σ;
LS – least-squares (method, estimator);
ML – maximum likelihood (function, method, estimator);
N(0, 1) – standard normal distribution;
N(µ, σ2) – normal distribution with parameters µ and σ2;
Nk(µ, Σ) – k-dimensional normal distribution with the
mean vector µ and the covariance matrix Σ;
P(λ) – Poisson distribution with a parameter λ;
pdf – the probability density function;
P{A} – the probability of an event A;
P{A|B} – the conditional probability of event A given B;
Pθ{A}, P{A|θ} – the probability depending on a parameter
θ;
Pk(n, π) – k-dimensional multinomial distribution with
parameters n and π = (π1, ..., πk)T , π1 + ... + πk = 1;
rv – random variable
S(n) – Student’s distribution with n degrees of freedom;
S(n; δ) – non-central Student’s distribution with n degrees
of freedom and non-centrality parameter δ;
tα(n) – α critical value of Student’s distribution with n
degrees of freedom;
U(α, β) – uniform distribution in the interval (α, β);
UMP – uniformly most powerful (test);
UUMP – unbiased uniformly most powerful (test);
VarX – the variance of a random variable X;
Var(X) – the covariance matrix of a random vector X;
W(θ, ν) – Weibull distribution with parameters θ and ν;
X, Y, Z, ... – random variables;
X, Y , Z, ... – random vectors;
X^T – the transposed vector X, i.e. a row vector;
||x|| – the length (x^T x)^{1/2} = (Σ_i x_i^2)^{1/2} of a vector x = (x1, ..., xk)^T;
X ∼ N(µ, σ²) – the random variable X is normally distributed with parameters µ and σ² (analogously in the case of other distributions);
Xn →^P X – convergence in probability (n → ∞);
Xn →^{a.s.} X – almost sure convergence, or convergence with probability 1 (n → ∞);
Xn →^d X, Fn(x) →^d F(x) – weak convergence, or convergence in distribution (n → ∞);
Xn →^d X ∼ N(µ, σ²) – the random variables Xn are asymptotically (n → ∞) normally distributed with parameters µ and σ²;
Xn ∼ Yn – the random variables Xn and Yn are asymptotically (n → ∞) equivalent (Xn − Yn →^P 0);
x(P) – P-th quantile;
xP – P-th critical value;
zα – α critical value of the standard normal distribution;
Σ = [σij]k×k – covariance matrix;
χ2(n) – chi-squared distribution with n degrees of freedom;
χ2(n; δ) – non-central chi-squared distribution with n
degrees of freedom and non-centrality parameter δ;
χ²_α(n) – α critical value of the chi-squared distribution with n degrees of freedom.
Chapter 1
Introduction
1.1. Statistical hypotheses
The simplest model of statistical data is a simple sample,
i.e. a vector X = (X1, ..., Xn)T of n independent identically
distributed random variables. In real experiments the values
xi of the random variables Xi are observed (measured). The
non-random vector x = (x1, ..., xn)T is a realization of the
simple sample X.
In more complicated experiments the elements Xi are
dependent, or not identically distributed, or are themselves
random vectors. The random vector X is then called a sample,
not a simple sample.
Suppose that the cumulative distribution function (cdf) F of
a sample X (or of any element Xi of a simple sample) belongs
to a set F of cumulative distribution functions. For example,
if the sample is simple then F may be the set of absolutely
continuous, discrete, symmetric, normal, Poisson cumulative
distribution functions. The set F defines a statistical model.
Suppose that F0 is a subset of F.
The statistical hypothesis H0 is the following assertion: the
cumulative distribution function F belongs to the set F0. We
write H0 : F ∈ F0.
The hypothesis H1 : F ∈ F1, where F1 = F \ F0 is the
complement of F0 in F, is called the alternative to the hypothesis
H0.
If F = {Fθ, θ ∈ Θ ⊂ Rm} is defined by a finite-dimensional
parameter θ then the model is parametric. In this case the
statistical hypothesis is a statement on the values of the finite-
dimensional parameter θ.
In this book non-parametric models are considered. A
statistical model F is called non-parametric if F is not defined
by a finite-dimensional parameter.
If the set F0 contains only one element of the set F then the
hypothesis is simple, otherwise the hypothesis is composite.
1.2. Examples of hypotheses in non-parametric models
Let us look briefly and informally at examples of the
hypotheses which will be considered in the book. We do not
formulate concrete alternatives, only suppose that models are
non-parametric. Concrete alternatives will be formulated in
the chapters on specified hypotheses.
1.2.1. Hypotheses on the probability distribution of data
elements
The first class of hypotheses considered in this book
consists of hypotheses on the form of the cdf F of the elements
of a sample.
Such hypotheses may be simple or composite.
A simple hypothesis has the form H0 : F = F0; here F0
is a specified cdf. For example, such a hypothesis may mean
that the n numbers generated by a computer are realizations
of random variables having uniform U(0, 1), Poisson P(2),
normal N(0, 1) or other distributions.
A composite hypothesis has the form H0 : F ∈ F0 = {Fθ, θ ∈
Θ}, where Fθ are cdfs of known analytical form depending
on the finite-dimensional parameter θ ∈ Θ. For example, this
may mean that the salaries of the doctors in a city are normally
distributed, or the failure times of TV sets produced by a
factory have the Weibull distribution.
More general composite hypotheses, meaning that the data
verify some parametric or semi-parametric regression model,
may be considered. For example, in investigating the influence
of some factor z on the survival time the following hypothesis
on the cdf Fi of the i-th sample element may be used:
\[ F_i(x) = 1 - \{1 - F_0(x)\}^{\exp\{\beta z_i\}}, \quad i = 1, \dots, n \]
where F0 is an unknown baseline cdf, β is an unknown scalar
parameter and zi is a known value of the factor for the i-th
sample element.
The following tests for simple hypotheses are considered:
chi-squared tests (section 2.2) and tests based on the difference
between the empirical and the hypothetical cumulative
distribution functions (sections 3.2 and 3.3).
The following tests for composite hypotheses are
considered: general tests such as chi-squared tests (sections
2.3 and 2.4), tests based on the difference of non-parametric and
parametric estimators of the cumulative distribution function
(section 3.4), and also special tests for specified families of
probability distributions (section 5.5).
1.2.2. Independence hypotheses
Suppose that (Xi, Yi)^T, i = 1, 2, ..., n, is a simple sample of
the random vector (X, Y)^T with the cdf F = F(x, y) ∈ F; here
F is a non-parametric class of two-dimensional cdfs.
An independence hypothesis means that the components
X and Y are independent. For example, this hypothesis may
mean that the sum of sales of managers X and the number
of complaints from consumers Y are independent random
variables.
The following tests for independence of random variables
are considered: chi-squared independence tests (section 2.5)
and rank tests (sections 4.3 and 4.10).
1.2.3. Randomness hypothesis
A randomness hypothesis means that the observed vector
x = (x1, ..., xn)T is a realization of a simple sample X =
(X1, ..., Xn)T , i.e. of a random vector with independent and
identically distributed (iid) components.
The following tests for randomness hypotheses are
considered: runs tests (section 5.2) and rank tests (section 4.4).
1.2.4. Homogeneity hypotheses
A homogeneity hypothesis of two independent simple
samples X = (X1, ..., Xm)T and Y = (Y1, ..., Yn)T means that
the cdfs F1 and F2 of the random variables Xi and Yj coincide.
The homogeneity hypothesis of k > 2 independent samples is
formulated analogously.
The following tests for homogeneity of independent simple
samples are considered: chi-squared tests (section 2.6), tests
based on the difference of cumulative distribution functions
(section 3.5), rank tests (sections 4.5 and 4.8), and some special
tests (section 5.1).
If n independent random vectors Xi = (Xi1, ..., Xik)T , i =
1, ..., n are observed then the vectors (X1j, ..., Xnj)T composed
of the components are k dependent samples, j = 1, ..., k.
The homogeneity hypotheses of k related samples means the
equality of the cdfs F1, ..., Fk of the components Xi1, ..., Xik.
The following tests for homogeneity of related samples are
considered: rank tests (sections 4.7 and 4.9) and other special
tests (sections 5.1, 5.3 and 5.4).
1.2.5. Median value hypotheses
Suppose that X = (X1, ..., Xn)T is a simple sample of a
continuous random variable X. Denote by M the median of
the random variable X. The median value hypothesis has the
form H : M = M0; here M0 is a specified value of the median.
The following tests for this hypothesis are considered: sign
tests (section 5.1) and rank tests (section 4.6).
1.3. Statistical tests
A statistical test or simply a test is a rule which enables a
decision to be made on whether or not the zero hypothesis H0
should be rejected on the basis of the observed realization of
the sample.
Any test considered in this book is based on the values
of some statistic T = T(X) = T(X1, ..., Xn), called the test
statistic. Usually the statistic T takes different values under
the hypothesis H0 and the alternative H1. If the statistic
T has a tendency to take smaller (greater) values under
the hypothesis H0 than under the alternative H1 then the
hypothesis H0 is rejected in favor of the alternative if T > c
(T < c, respectively), where c is a well-chosen real number.
If the values of the statistic T have a tendency to
concentrate in some interval under the hypothesis and outside
this interval under the alternative then the hypothesis H0 is
rejected in favor of the alternative if T < c1 or T > c2, where c1
and c2 are well-chosen real numbers.
Suppose that the hypothesis H0 is rejected if T > c (the
other two cases are considered similarly).
The probability
β(F) = PF {T > c}
of rejecting the hypothesis H0 when the true cumulative
distribution function is a specified function F ∈ F is called
the power function of the test. When using a test, two types of
error are possible:
1. The hypothesis H0 is rejected when it is true, i.e. when
F ∈ F0. Such an error is called a type I error. The probability
of this error is β(F), F ∈ F0.
2. The hypothesis H0 is not rejected when it is false, i.e.
when F ∈ F1. Such an error is called a type II error. The
probability of this error is 1 − β(F), F ∈ F1.
The number

sup_{F∈F0} β(F)    [1.1]

is called the significance level of the test.
Fix α ∈ (0, 1). If the significance level does not exceed α
then for any F ∈ F0 the type I error does not exceed α.
Usually tests with significance level values not greater than
α = 0.1; 0.05; 0.01 are used.
If the distribution of the statistic T is absolutely continuous
then, usually, for any α ∈ (0, 1) we can find a test based on this
statistic such that the significance level is equal to α.
A test with a significance level not greater than α is called
unbiased if

inf_{F∈F1} β(F) ≥ α    [1.2]
This means that the zero hypothesis is rejected with greater
probability under any specified alternative than under the
zero hypothesis. Let T be a class of test statistics of unbiased
tests with a significance level not greater than α.
The statistic T defines the uniformly most powerful
unbiased test in the class T if βT (F) ≥ βT∗ (F) for all T∗ ∈ T
and for all F ∈ F1.
A test is called consistent if for all F ∈ F1
β(F) → 1, as n → ∞ [1.3]
This means that if n is large then under any specified
alternative the probability of rejecting the zero hypothesis is
near to 1.
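These notions can be checked numerically. The following minimal sketch (an illustration assumed by the editor, not taken from the book; the function name estimated_power and the chosen test are hypothetical) estimates by simulation the power function β(F) of a one-sided sign-type test of the hypothesis that the median is zero, using the rejection rule T ≥ c:

```python
import numpy as np
from scipy import stats

# A minimal illustrative sketch (not from the book): estimating the power function
# beta(F) of a one-sided sign test of H0: M = 0 against the alternative M > 0.
# The statistic T = #{X_i > 0} has the B(n, 1/2) distribution under H0, and H0 is
# rejected if T >= c, where c is the smallest integer with P0{T >= c} <= alpha.

def estimated_power(sample_generator, n, alpha=0.05, n_rep=20000, seed=0):
    rng = np.random.default_rng(seed)
    c = int(stats.binom.ppf(1 - alpha, n, 0.5)) + 1   # smallest c with P0{T >= c} <= alpha
    t = np.array([np.sum(sample_generator(n, rng) > 0) for _ in range(n_rep)])
    return np.mean(t >= c)                            # Monte Carlo estimate of beta(F)

# Under F in F0 (median 0) the estimate stays at or below alpha; under a shifted
# alternative it grows with n, illustrating consistency [1.3].
print(estimated_power(lambda n, rng: rng.normal(0.0, 1.0, n), n=50))
print(estimated_power(lambda n, rng: rng.normal(0.5, 1.0, n), n=50))
```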
1.4. P-value
Suppose that a simple statistical hypothesis H0 is rejected
using tests of one of the following forms:
1) T ≥ c; 2) T ≤ c; or 3) T ≤ c1 or T ≥ c2; here T = T(X) is
a test statistic based on the sample X = (X1, . . . , Xn)T .
We write P0{A} = P{A|H0}.
Fix α ∈ (0, 1). The first (second) test has a significance level
not greater than α, and nearest to α, if the constant c = inf{s :
P0{T ≥ s} ≤ α} (c = sup{s : P0{T ≤ s} ≤ α}).
The third test has a significance level not greater than α if
c1 = sup{s : P0{T ≤ s} ≤ α/2} and c2 = inf{s : P0{T ≥ s} ≤ α/2}.
Denote by t the observed value of the statistic T. In the case
of the tests of the first two forms the P-values are defined as
the probabilities
pv = P0{T ≥ t} and pv = P0{T ≤ t}.
Thus the P-value is the probability that under the zero
hypothesis H0 the statistic T takes a value more distant than t
in the direction of the alternative (in the first case to the right,
in the second case to the left, from t).
In the third case, if P0{T ≤ t} ≤ P0{T ≥ t} then pv/2 = P0{T ≤ t},
and if P0{T ≤ t} ≥ P0{T ≥ t} then pv/2 = P0{T ≥ t}.
So in the third case the P-value is defined as follows:

pv = 2 min{P0{T ≤ t}, P0{T ≥ t}} = 2 min{FT(t), 1 − FT(t−)}

where FT is the cdf of the statistic T under the zero hypothesis
H0. If the distribution of T is absolutely continuous and
symmetric with respect to the origin, the last formula implies

pv = 2 min{FT(t), FT(−t)} = 2FT(−|t|) = 2{1 − FT(|t|)}
If the result observed during the experiment is a rare event
when the zero hypothesis is true then the P-value is small
and the hypothesis should be rejected. This is confirmed by
the following theorem.
Theorem 1.1. Suppose that the test is of any of the three forms
considered above. For the experiment with the value t of the
statistic T the inequality pv ≤ α is equivalent to the rejection of
the zero hypothesis.
Proof. Let us consider an experiment where T = t. If the test
is defined by the inequality T ≥ c (T ≤ c) then c = inf{s :
P0{T ≥ s} ≤ α} (c = sup{s : P0{T ≤ s} ≤ α}) and P0{T ≥ t} =
pv (P0{T ≤ t} = pv). So the inequality pv ≤ α is equivalent
to the inequality t ≥ c (t ≤ c). The last inequalities mean that
the hypothesis is rejected.

If the test is defined by the inequalities T ≤ c1 or T ≥ c2
then c1 = sup{s : P0{T ≤ s} ≤ α/2}, c2 = inf{s : P0{T ≥ s} ≤
α/2} and 2 min{P0{T ≤ t}, P0{T ≥ t}} = pv. So the inequality
pv ≤ α means that 2 min{P0{T ≤ t}, P0{T ≥ t}} ≤ α.

If P0{T ≤ t} ≥ P0{T ≥ t}, then the inequality pv ≤ α means
that P0{T ≥ t} ≤ α/2. This is equivalent to the inequality t ≥
c2, which means that the hypothesis is rejected. Analogously, if
P0{T ≤ t} ≤ P0{T ≥ t} then the inequality pv ≤ α means that
P0{T ≤ t} ≤ α/2. This is equivalent to the inequality t ≤ c1,
which means that the hypothesis is rejected. So in both cases
the inequality pv ≤ α means that the hypothesis is rejected.
△
If the critical region is defined by the asymptotic
distribution of T (usually normal or chi-squared) then the P-
value pva is computed using the asymptotic distribution of T,
and it is called the asymptotic P-value.
Sometimes the P-value pv is interpreted as random because
each value t of T defines a specific value of pv. In the case
of the alternatives considered above the P-values are the
realizations of the following random variables:
1 − FT (T−), FT (T) and 2 min{FT (T), 1 − FT (T−)}
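As a computational companion to these definitions (a sketch assumed by the editor, not from the book; the helper p_values is hypothetical), the three P-values can be computed directly from the null distribution of an integer-valued statistic T:

```python
from scipy import stats

# A minimal sketch (not from the book): P-values for the three forms of critical
# region, for an integer-valued statistic T with known null distribution `dist`
# and observed value t (here B(20, 0.5) and t = 13, as in Example 1.1 below).
def p_values(dist, t):
    pv_right = dist.sf(t - 1)               # P0{T >= t}, rejection region T >= c
    pv_left = dist.cdf(t)                   # P0{T <= t}, rejection region T <= c
    pv_two = 2 * min(pv_left, pv_right)     # pv = 2 min{F_T(t), 1 - F_T(t-)}
    return pv_right, pv_left, min(pv_two, 1.0)

print(p_values(stats.binom(20, 0.5), 13))   # approximately (0.1316, 0.9423, 0.2632)
```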
1.5. Continuity correction
If the distribution of the statistic T is discrete and
the asymptotic distribution of this statistic is absolutely
continuous (usually normal) then for medium-sized samples
the approximation of distribution T can be improved using the
continuity correction [YAT 34].
The idea of a continuity correction is explained by the
following example.
Example 1.1. Let us consider the parametric hypothesis H0 :
p = 0.5 and the alternative H1 : p > 0.5; here p is the Bernoulli
distribution parameter. For example, suppose that during n =
20 Bernoulli trials the number of successes is T = 13. It is
evident that the hypothesis H0 is rejected if the statistic T
takes large values, i.e. if T ≥ c for a given c. Under H0, the
statistic T has the binomial distribution B(20, 0.5). The exact
P-value is

\[ pv = P\{T \ge 13\} = \sum_{i=13}^{20} C_{20}^{i} (1/2)^{20} = I_{1/2}(13, 8) = 0.131588 \]

Using the normal approximation

\[ Z_n = \frac{T - 0.5n}{\sqrt{0.25n}} = \frac{T - 10}{\sqrt{5}} \xrightarrow{d} Z \sim N(0, 1) \]

we obtain the asymptotic P-value

\[ pv_a = P\{T \ge 13\} = P\left\{ \frac{T - 10}{\sqrt{5}} \ge \frac{13 - 10}{\sqrt{5}} \right\} \approx 1 - \Phi\left(\frac{13 - 10}{\sqrt{5}}\right) = 0.089856 \]

This is considerably smaller than the exact P-value.

Note that P{T ≥ 13} = P{T > 12}. If we use the
normal approximation then the same probability may be
approximated by

1 − Φ((13 − 10)/√5) = 0.089856 or by 1 − Φ((12 − 10)/√5) = 0.185547

Both approximations are far from the exact P-value. The
continuity correction is therefore performed using the normal
approximation at the center 12.5 of the interval (12, 13]. So the
asymptotic P-value with a continuity correction is

pvcc = 1 − Φ((13 − 0.5 − 10)/√5) = 0.131776

The obtained value is very similar to the exact P-value.
In the case of the alternative H2 : p < 0.5 the zero
hypothesis is rejected if T ≤ d, and the P-value is

\[ pv = P\{T \le 13\} = \sum_{i=0}^{13} C_{20}^{i} (1/2)^{20} = I_{1/2}(7, 14) = 0.942341 \]

In this case

pva = Φ((13 − 10)/√5) = 0.910144

Note that P{T ≤ 13} = P{T < 14}. If we use the normal
approximation, the same probability is approximated by

Φ((13 − 10)/√5) = 0.910144 or Φ((14 − 10)/√5) = 0.963181

Both are far from the exact P-value. So the continuity
correction is performed using the normal approximation at the
middle 13.5 of the interval (13, 14]. Therefore the asymptotic
P-value with a continuity correction is

pvcc = Φ((13 + 0.5 − 10)/√5) = 0.941238

The obtained value is very similar to the exact P-value.

In the case of the bilateral alternative H3 : p ≠ 0.5 the exact
and asymptotic P-values are

pv = 2 min{FT(13), 1 − FT(13−)} = 2 min(0.942341, 0.131588) = 0.263176

and pva = 2 min(0.910144, 0.089856) = 0.179712, respectively.
The asymptotic P-value with a continuity correction is

pvcc = 2 min(Φ((13 + 0.5 − 10)/√5), 1 − Φ((13 − 0.5 − 10)/√5)) =
2 min(0.941238, 0.131776) = 0.263452
Generalizing, suppose that the test statistic T takes
integer values and under the zero hypothesis the asymptotic
distribution of the statistic

\[ Z = \frac{T - ET}{\sqrt{\mathrm{Var}\,T}} \]

is standard normal. If the critical region is defined by the
inequalities

a) T ≥ c; b) T ≤ c; c) T ≤ c1 or T ≥ c2;

and the observed value of the statistic T is t, then the
asymptotic P-values with a continuity correction are

\[ pv_{cc} = 1 - \Phi\left(\frac{t - 0.5 - ET}{\sqrt{\mathrm{Var}\,T}}\right), \qquad
   pv_{cc} = \Phi\left(\frac{t + 0.5 - ET}{\sqrt{\mathrm{Var}\,T}}\right), \]
\[ pv_{cc} = 2\min\left\{ \Phi\left(\frac{t + 0.5 - ET}{\sqrt{\mathrm{Var}\,T}}\right),\ 1 - \Phi\left(\frac{t - 0.5 - ET}{\sqrt{\mathrm{Var}\,T}}\right) \right\} \qquad [1.4] \]

respectively.
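A short sketch (assumed by the editor, not from the book) implementing formula [1.4] and checking it against Example 1.1:

```python
import numpy as np
from scipy import stats

# A minimal sketch (not from the book) of formula [1.4], checked against Example 1.1:
# asymptotic P-values with a continuity correction for an integer-valued statistic T
# with null mean ET and variance VarT (here T ~ B(20, 0.5), so ET = 10, VarT = 5).
def pv_continuity_corrected(t, ET, VarT, alternative):
    s = np.sqrt(VarT)
    if alternative == "greater":                          # critical region T >= c
        return stats.norm.sf((t - 0.5 - ET) / s)
    if alternative == "less":                             # critical region T <= c
        return stats.norm.cdf((t + 0.5 - ET) / s)
    return 2 * min(stats.norm.cdf((t + 0.5 - ET) / s),    # T <= c1 or T >= c2
                   stats.norm.sf((t - 0.5 - ET) / s))

print(stats.binom.sf(12, 20, 0.5))                        # exact pv = P{T >= 13} = 0.1316
print(pv_continuity_corrected(13, 10, 5, "greater"))      # pvcc = 0.1318
print(pv_continuity_corrected(13, 10, 5, "two-sided"))    # pvcc = 0.2635
```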
1.6. Asymptotic relative efficiency
Suppose that under the zero hypothesis and under the
alternative the distribution of the data belongs to a non-parametric
family depending on a scalar parameter θ and
possibly another parameter ϑ. Let us consider the hypothesis
H0 : θ = θ0 with the one-sided alternatives H1 : θ > θ0 or
H2 : θ < θ0 and the two-sided alternative H3 : θ ≠ θ0.
Example 1.2. Let X = (X1, ..., Xn)^T and Y = (Y1, ..., Ym)^T
be two independent simple samples, Xi ∼ F(x) and Yj ∼
F(x − θ), where F(x) is an unknown absolutely continuous
cdf (the parameter ϑ) and θ is a location parameter. Under
the homogeneity hypothesis θ = 0, under the one-sided
alternatives θ > 0 or θ < 0, and under the two-sided
alternative θ ≠ 0. So the homogeneity hypothesis can be
formulated in terms of the scalar parameter: H0 : θ = 0. The
possible alternatives are H1 : θ > 0, H2 : θ < 0, H3 : θ ≠ 0.
Let us consider the one-sided alternative H1. Fix α ∈ (0, 1).
Suppose that the hypothesis is rejected if

Tn > c_{n,α}

where n is the sample size and Tn is the test statistic. Denote by

βn(θ) = Pθ{Tn > c_{n,α}}

the power function of the test.

Most of the tests are consistent, so the power of such tests
is close to unity if the sample size is large. So the limit of the
power under fixed alternatives is not suitable for comparing
the performance of different tests.

To compare the tests, the behavior of their powers
under a sequence of approaching alternatives

\[ H_n : \theta = \theta_n = \theta_0 + \frac{h}{n^\delta}, \qquad \delta > 0,\ h > 0 \]

may be compared.

Suppose that the same sequence of approaching
alternatives is written in two ways

\[ \theta_m = \theta_0 + \frac{h_1}{n_{1m}^\delta} = \theta_0 + \frac{h_2}{n_{2m}^\delta} \]

where n_{im} → ∞ as m → ∞, and

\[ \lim_{m\to\infty} \beta_{n_{1m}}(\theta_m) = \lim_{m\to\infty} \beta_{n_{2m}}(\theta_m) \]

Then the limit (if it exists and does not depend on the choice
of θ_m)

\[ e(T_{1n}, T_{2n}) = \lim_{m\to\infty} \frac{n_{2m}}{n_{1m}} \]

is called the asymptotic relative efficiency (ARE) [PIT 48] of the
first test with respect to the second test.

The ARE is the inverse ratio of the sample sizes necessary to
obtain the same power for two tests with the same asymptotic
significance level, while simultaneously the sample sizes
approach infinity and the sequence of alternatives approaches
θ0.

Under regularity conditions, the ARE has a simple
expression.
Regularity assumptions:

1) Pθ0{Tin ≥ c_{n,α}} → α.

2) In the neighborhood of θ0 there exist

μin(θ) = EθTin, σin(θ) = VarθTin

and the function μin(θ) is infinitely differentiable at the point
θ0; furthermore, μ̇in(θ0) > 0, and higher order derivatives are
equal to 0; i = 1, 2.

3) There exist

lim_{n→∞} μin(θ) = μi(θ), lim_{n→∞} n^δ σin(θ) = σi(θ), μi(θ0)/σi(θ0) > 0

where δ > 0.

4) For any h ≥ 0

μ̇in(θn) → μ̇i(θ0), σin(θn) → σi(θ0), as n → ∞

5) Test statistics are asymptotically normal:

Pθn{(Tin − μin(θn))/σin(θn) ≤ z} → Φ(z).
Theorem 1.2. If the regularity assumptions are satisfied then
the asymptotic relative efficiency can be written in the form

\[ e(T_{1n}, T_{2n}) = \left( \frac{\dot\mu_1(\theta_0)/\sigma_1(\theta_0)}{\dot\mu_2(\theta_0)/\sigma_2(\theta_0)} \right)^{1/\delta} \qquad [1.5] \]

Proof. First let us consider one statistic and skip the index i.
Let us find lim_{n→∞} βn(θn). By assumption 1

\[ P_{\theta_0}\{T_n > c_{n,\alpha}\} = P_{\theta_0}\left\{ \frac{T_n - \mu_n(\theta_0)}{\sigma_n(\theta_0)} > \frac{c_{n,\alpha} - \mu_n(\theta_0)}{\sigma_n(\theta_0)} \right\} \to \alpha \]

so

\[ z_{n,\alpha} = \frac{c_{n,\alpha} - \mu_n(\theta_0)}{\sigma_n(\theta_0)} \to z_\alpha \]

By assumptions 2–4

\[ \frac{\mu_n(\theta_n) - \mu_n(\theta_0)}{\sigma_n(\theta_0)} = \frac{\dot\mu_n(\theta_0)\, h\, n^{-\delta} + o(1)}{n^{-\delta}\sigma(\theta_0) + o(1)} \to \frac{\dot\mu(\theta_0)}{\sigma(\theta_0)}\, h \]

So using assumption 5 we have

\[ \beta_n(\theta_n) = P_{\theta_n}\{T_n > c_{n,\alpha}\} = P_{\theta_n}\left\{ \frac{T_n - \mu_n(\theta_n)}{\sigma_n(\theta_n)} > \frac{c_{n,\alpha} - \mu_n(\theta_n)}{\sigma_n(\theta_n)} \right\} = \]
\[ P_{\theta_n}\left\{ \frac{T_n - \mu_n(\theta_n)}{\sigma_n(\theta_n)} > z_{n,\alpha}\,\frac{\sigma_n(\theta_0)}{\sigma_n(\theta_n)} - \frac{\mu_n(\theta_n) - \mu_n(\theta_0)}{\sigma_n(\theta_0)}\cdot\frac{\sigma_n(\theta_0)}{\sigma_n(\theta_n)} \right\}
\to 1 - \Phi\left( z_\alpha - h\,\frac{\dot\mu(\theta_0)}{\sigma(\theta_0)} \right) \]

Let T1n and T2n be two test statistics verifying the
assumptions of the theorem and let

\[ \theta_m = \theta_0 + \frac{h_1}{n_{1m}^\delta} = \theta_0 + \frac{h_2}{n_{2m}^\delta} \]

be a sequence of approaching alternatives. The last equalities
imply

\[ \frac{h_2}{h_1} = \left(\frac{n_{2m}}{n_{1m}}\right)^{\delta} \]

We have proved that

\[ \beta_{n_{im}}(\theta_m) \to 1 - \Phi\left( z_\alpha - h_i\,\frac{\dot\mu_i(\theta_0)}{\sigma_i(\theta_0)} \right) \]

Choose n1m and n2m to give the same limit powers. Then

\[ \frac{n_{2m}}{n_{1m}} = \left(\frac{h_2}{h_1}\right)^{1/\delta} = \left( \frac{\dot\mu_1(\theta_0)/\sigma_1(\theta_0)}{\dot\mu_2(\theta_0)/\sigma_2(\theta_0)} \right)^{1/\delta} \]

△
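As a worked illustration of formula [1.5] (added here for clarity, not part of the original text), consider the sign test based on T1n = n^{-1}#{Xi > θ0} and the test based on the sample mean T2n = X̄ for a simple sample from N(θ, σ²) with H0 : θ = θ0; here δ = 1/2:

\[ \mu_{1n}(\theta) = \Phi\Bigl(\frac{\theta - \theta_0}{\sigma}\Bigr), \quad
\sigma_{1n}(\theta_0) = \frac{1}{2\sqrt{n}}, \quad
\frac{\dot\mu_1(\theta_0)}{\sigma_1(\theta_0)} = \frac{\varphi(0)/\sigma}{1/2} = \frac{1}{\sigma}\sqrt{\frac{2}{\pi}};
\qquad
\mu_{2n}(\theta) = \theta, \quad
\sigma_{2n}(\theta_0) = \frac{\sigma}{\sqrt{n}}, \quad
\frac{\dot\mu_2(\theta_0)}{\sigma_2(\theta_0)} = \frac{1}{\sigma} \]
\[ e(T_{1n}, T_{2n}) = \left(\sqrt{2/\pi}\right)^{1/\delta} = \frac{2}{\pi} \approx 0.64 \]

which is the classical ARE of the sign test with respect to Student's test under normality.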
Chapter 2
Chi-squared Tests
2.1. Introduction
Chi-squared tests are used when data are classified into
several groups and only numbers of objects belonging to
concrete groups are used for test construction. The vector of
such random numbers has a multinomial distribution and
depends only on a finite number of parameters. So chi-squared
tests, being based on this vector, are parametric, but
they are also used for testing non-parametric hypotheses, so we
include them in this book.
When the initial data are replaced by grouped data, some
information is lost, so this method is used when more powerful
tests using all the data are not available.
2.2. Pearson’s goodness-of-fit test: simple hypothesis
Suppose that X = (X1, ..., Xn)T is a simple sample of a
random variable X having the cdf F from a non-parametric
class F.
Simple hypothesis

H0 : F(x) = F0(x), ∀x ∈ R    [2.1]

where F0 is a completely specified (known) cdf from the family F.
The hypotheses
H0 : X ∼ U(0, 1), H0 : X ∼ B(1, 0.5), H0 : X ∼ N(0, 1)
are examples of simple non-parametric hypotheses. For
example, such a hypothesis is verified if we want to know
whether realizations generated by a computer are obtained
from the uniform U(0, 1), Poisson P(2), normal N(0, 1) or
other completely specified distribution.
The data are grouped in the following way: the abscissa
axis is divided into a finite number of intervals by the
points −∞ = a0 < a1 < ... < ak = ∞. Denote by Uj the number
of Xi falling into the interval (aj−1, aj]:

\[ U_j = \sum_{i=1}^n \mathbf{1}_{(a_{j-1},\,a_j]}(X_i), \qquad j = 1, 2, \dots, k \]

So, instead of the fully informative data X, we use the grouped
data

U = (U1, ..., Uk)^T

We can also say that the statistic U is obtained using a
special data censoring mechanism, known as the mechanism
of grouping data. The random vector U has the multinomial
distribution Pk(n, π): for 0 ≤ mi ≤ n, Σ_i mi = n,

\[ P\{U_1 = m_1, \dots, U_k = m_k\} = \frac{n!}{m_1! \cdots m_k!}\, \pi_1^{m_1} \cdots \pi_k^{m_k} \qquad [2.2] \]

where πi = P{X ∈ (ai−1, ai]} = F(ai) − F(ai−1) is the
probability that the random variable X takes a value in the
interval (ai−1, ai], π = (π1, ..., πk)^T, π1 + ... + πk = 1.
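A minimal sketch (assumed by the editor, not from the book) of this grouping mechanism:

```python
import numpy as np

# A minimal sketch (not from the book): forming the grouped data U = (U_1, ..., U_k)^T
# from a simple sample X with grouping intervals (a_{j-1}, a_j].
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=80)              # a simple sample of size n = 80
a = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])    # interval ends a_0 < a_1 < ... < a_k
U, _ = np.histogram(x, bins=a)                  # counts per interval (np.histogram uses
print(U, U.sum())                               # left-closed bins; for continuous data
                                                # this coincides with (a_{j-1}, a_j] a.s.)
# The vector U has the multinomial distribution P_k(n, pi), pi_j = F(a_j) - F(a_{j-1}).
```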
Under the hypothesis H0, the following hypothesis also
holds.

Hypothesis on the values of the multinomial distribution
parameters

H′0 : πj = πj0, j = 1, 2, ..., k    [2.3]

where πj0 = F0(aj) − F0(aj−1).

Under the hypothesis H′0

U ∼ Pk(n, π0)

where π0 = (π10, ..., πk0)^T, π10 + ... + πk0 = 1.
If the hypothesis H′0 is rejected then it is natural also to
reject the narrower hypothesis H0.

Pearson's chi-squared test for the hypothesis H′0 is based on
the differences between the maximum likelihood estimators π̂j
of the probabilities πj obtained from the grouped data U and
the hypothetical values πj0 of these probabilities.

The relation Σ_{j=1}^k πj = 1 implies that the multinomial
model Pk(1, π) depends on the (k − 1)-dimensional parameter
(π1, ..., πk−1)^T.

From [2.2] the likelihood function of the random vector U
is

\[ L(\pi_1, \dots, \pi_{k-1}) = \frac{n!}{U_1! \cdots U_k!}\, \pi_1^{U_1} \cdots \pi_k^{U_k} \qquad [2.4] \]

The loglikelihood function is

\[ \ell(\pi_1, \dots, \pi_{k-1}) = \sum_{j=1}^{k-1} U_j \ln \pi_j + U_k \ln\Bigl(1 - \sum_{j=1}^{k-1} \pi_j\Bigr) + C \]
so

\[ \dot\ell_j = \frac{U_j}{\pi_j} - \frac{U_k}{1 - \sum_{l=1}^{k-1}\pi_l} = \frac{U_j}{\pi_j} - \frac{U_k}{\pi_k} \]

which implies that, for all j, l = 1, ..., k,

\[ U_j \pi_l = U_l \pi_j \]

Summing both sides with respect to l and using the relations
Σ_{j=1}^k πj = 1 and Σ_{j=1}^k Uj = n, we obtain Uj = nπj, so the
maximum likelihood estimators of the parameters πi are

\[ \hat\pi = (\hat\pi_1, \dots, \hat\pi_k)^T, \qquad \hat\pi_i = U_i/n, \quad i = 1, \dots, k \]
The famous Pearson's statistic has the form

\[ X_n^2 = \sum_{i=1}^k \frac{\bigl(\sqrt{n}(\hat\pi_i - \pi_{i0})\bigr)^2}{\pi_{i0}} = \sum_{i=1}^k \frac{(U_i - n\pi_{i0})^2}{n\pi_{i0}} = \frac{1}{n}\sum_{i=1}^k \frac{U_i^2}{\pi_{i0}} - n \qquad [2.5] \]
Under the hypothesis H′0, the realizations of the differences
π̂i − πi0 are scattered around zero. If the hypothesis is not true
then there exists at least one i such that the realizations of
the differences π̂i − πi0 are scattered around some positive or
negative value. In such a case the statistic X²_n has a tendency
to take greater values than under the zero hypothesis.

Pearson's test is asymptotic, i.e. it uses the asymptotic
distribution of the test statistic X²_n, which is chi-squared, as
follows from the following theorem.
Theorem 2.1. If 0 < πi0 < 1, π10 + ··· + πk0 = 1, then under the
hypothesis H′0

\[ X_n^2 \xrightarrow{d} \chi^2(k-1) \quad \text{as } n \to \infty \]
Proof. Under the hypothesis H′0, the random vector U =
(U1, ..., Uk)^T is the sum of n iid random vectors (the indicator
vectors of the grouping intervals) having the mean π0 and the
covariance matrix D = [d_{jj′}]_{k×k}, d_{jj} = πj0(1 − πj0),
d_{jj′} = −πj0 πj′0, j ≠ j′.

If 0 < πi0 < 1, π10 + ··· + πk0 = 1, then the central limit
theorem holds:

\[ \sqrt{n}(\hat\pi - \pi_0) = \sqrt{n}(\hat\pi_1 - \pi_{10}, \dots, \hat\pi_k - \pi_{k0})^T \xrightarrow{d} Y \sim N_k(0, D) \qquad [2.6] \]

as n → ∞. The matrix D can be written in the form

\[ D = p_0 - \pi_0 \pi_0^T \]

where p0 is the diagonal matrix with the elements π10, ..., πk0 on
the main diagonal. Set

\[ Z_n = \sqrt{n}\, p_0^{-1/2}(\hat\pi - \pi_0) = \left( \frac{\sqrt{n}(\hat\pi_1 - \pi_{10})}{\sqrt{\pi_{10}}}, \dots, \frac{\sqrt{n}(\hat\pi_k - \pi_{k0})}{\sqrt{\pi_{k0}}} \right)^T \]

The result [2.6] implies that

\[ Z_n \xrightarrow{d} Z \sim N_k(0, \Sigma), \qquad \Sigma = p_0^{-1/2} D\, p_0^{-1/2} = E_k - qq^T \]

where q = (√π10, ..., √πk0)^T, q^T q = 1, and E_k is the k × k unit
matrix.

By the well-known theorem on the distribution of quadratic
forms [RAO 02], the limit distribution of the statistic Z_n^T Σ^− Z_n
is chi-squared with Tr(Σ^−Σ) degrees of freedom, where Σ^− is
the generalized inverse of Σ. Note that

\[ \Sigma^- = E_k + qq^T, \qquad \mathrm{Tr}(\Sigma^-\Sigma) = \mathrm{Tr}(E_k - qq^T) = k - 1 \]

So, using the equality Z_n^T q = 0, we have

\[ X_n^2 = Z_n^T \Sigma^- Z_n = \|Z_n\|^2 \xrightarrow{d} \|Z\|^2 \sim \chi^2(k-1) \qquad [2.7] \]

△
The theorem implies the following.

Pearson's chi-squared test: the hypothesis H′0 is rejected
with an asymptotic significance level α if

\[ X_n^2 > \chi_\alpha^2(k-1) \qquad [2.8] \]

The hypothesis H′0 can also be verified using the equivalent
likelihood ratio test, based on the statistic

\[ \Lambda_n = \frac{\sup_{\pi = \pi_0} L(\pi)}{\sup_{\pi} L(\pi)} = \frac{L(\pi_0)}{L(\hat\pi)} = n^n \prod_{i=1}^k \left( \frac{\pi_{i0}}{U_i} \right)^{U_i} \]

Under the hypothesis H′0 (see Appendix A, comment A3)

\[ R_n = -2\ln\Lambda_n = 2\sum_{i=1}^k U_i \ln\frac{U_i}{n\pi_{i0}} \xrightarrow{d} V \sim \chi^2(k-1), \quad \text{as } n \to \infty \qquad [2.9] \]

So the statistics Rn and X²_n are asymptotically equivalent.

Likelihood ratio test: the hypothesis H′0 is rejected with an
asymptotic significance level α if

\[ R_n > \chi_\alpha^2(k-1) \qquad [2.10] \]
Comment 2.1. Both tests are asymptotic rather than exact; they are
used when the sample size n is large. The accuracy of the
conclusions depends on the accuracy of the approximations
[2.7] and [2.9]. If the number of intervals is too large then the Ui tend
to take the values 0 or 1 for all i and the approximation is poor. So
the number of intervals k must not be too large.

Rule of thumb: choose the grouping intervals so that
nπi0 ≥ 5.
Comment 2.2. If the accuracy of the approximation of the
distribution of the statistic X²_n (or Rn) is suspected to be
insufficient then the P-value can be computed by simulation.
Suppose that the realization of the statistic X²_n is x²_n. N
values of the random vector U ∼ Pk(n, π0) are simulated and
for each value of U the corresponding value of the statistic
X²_n is computed. Suppose that M is the number of
values greater than x²_n. Then the P-value is approximated by
M/N. The hypothesis H′0 is rejected with an approximate
significance level α if M/N < α. The accuracy of this test
depends on the number of simulations N.
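A possible implementation of this simulation (a sketch assumed by the editor, not the book's code):

```python
import numpy as np

# A minimal sketch of Comment 2.2 (assumed implementation, not the book's code):
# Monte Carlo approximation of the P-value of Pearson's statistic X2_n.
def pearson_stat(U, pi0):
    n = U.sum()
    return np.sum((U - n * pi0) ** 2 / (n * pi0))

def mc_p_value(U_obs, pi0, N=10000, seed=0):
    rng = np.random.default_rng(seed)
    x2_obs = pearson_stat(U_obs, pi0)
    n = U_obs.sum()
    U_sim = rng.multinomial(n, pi0, size=N)          # N simulated vectors U ~ P_k(n, pi0)
    x2_sim = np.array([pearson_stat(u, pi0) for u in U_sim])
    M = np.sum(x2_sim > x2_obs)                      # values greater than the observed x2_n
    return M / N                                     # approximate P-value

U = np.array([18, 12, 16, 19, 15])                   # grouped counts of Example 2.1 below
print(mc_p_value(U, np.full(5, 0.2)))                # close to the asymptotic value 0.7587
```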
Comment 2.3. If the hypothesis H′0 is rejected then the
hypothesis H0 is also rejected because it is narrower. If the
data do not contradict the hypothesis H′0 then we have no
reason to reject the hypothesis H0.

Comment 2.4. If the distribution of the random variable X is
discrete and concentrated at the points x1, ..., xk then grouping
is not needed; in this case Ui is the observed number of sample
elements equal to the value xi.
Comment 2.5. As was noted, the hypotheses H0 and H′0
are not equivalent in general. Hypothesis H′0 states that the
increment of the cumulative distribution function in the j-th
interval is πj0, but the behavior of the cumulative distribution
function inside the interval is not specified. If n is large then
the number of grouping intervals can be increased and thus
the hypotheses become closer.

Comment 2.6. If the hypothesis H′0 does not hold and U ∼
Pk(n, π) then the distributions of the statistics Rn and X²_n are
approximately non-central chi-squared with k − 1 degrees of
freedom and the non-centrality parameter

\[ \Delta = 2n\sum_{j=1}^k \pi_j \ln\frac{\pi_j}{\pi_{j0}} \approx \delta = n\sum_{j=1}^k \frac{(\pi_j - \pi_{j0})^2}{\pi_{j0}} \qquad [2.11] \]
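The approximation [2.11] can be used to compute the approximate power of the test against a fixed alternative. The following sketch (assumed by the editor, not from the book) uses the non-central chi-squared distribution from scipy, with a hypothetical alternative for the proportions of Example 2.2 below:

```python
import numpy as np
from scipy import stats

# A minimal sketch (assumed, not from the book): approximate power of Pearson's test
# under a fixed alternative pi, using the non-central chi-squared approximation [2.11].
def approximate_power(pi, pi0, n, alpha=0.05):
    pi, pi0 = np.asarray(pi, float), np.asarray(pi0, float)
    delta = n * np.sum((pi - pi0) ** 2 / pi0)         # non-centrality parameter delta
    k = len(pi0)
    crit = stats.chi2.isf(alpha, k - 1)               # critical value chi^2_alpha(k - 1)
    return stats.ncx2.sf(crit, k - 1, delta)          # P{chi^2_{k-1}(delta) > crit}

# Hypothetical alternative for the proportions of Example 2.2 (pi0 = 0.35, 0.60, 0.05):
print(approximate_power([0.30, 0.60, 0.10], [0.35, 0.60, 0.05], n=300))
```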
Example 2.1. A random number generator generated n = 80
random numbers. The ordered values are given in the following
table:
0.0100 0.0150 0.0155 0.0310 0.0419 0.0456 0.0880 0.1200
0.1229 0.1279 0.1444 0.1456 0.1621 0.1672 0.1809 0.1855
0.1882 0.1917 0.2277 0.2442 0.2456 0.2476 0.2538 0.2552
0.2681 0.3041 0.3128 0.3810 0.3832 0.3969 0.4050 0.4182
0.4259 0.4365 0.4378 0.4434 0.4482 0.4515 0.4628 0.4637
0.4668 0.4773 0.4799 0.5100 0.5309 0.5391 0.6033 0.6283
0.6468 0.6519 0.6686 0.6689 0.6865 0.6961 0.7058 0.7305
0.7337 0.7339 0.7440 0.7485 0.7516 0.7607 0.7679 0.7765
0.7846 0.8153 0.8445 0.8654 0.8700 0.8732 0.8847 0.8935
0.8987 0.9070 0.9284 0.9308 0.9464 0.9658 0.9728 0.9872
Verify the hypothesis H0 that a realization from the
uniform distribution U(0, 1) was observed.

Divide the region of possible values (0, 1) into k = 5
intervals of equal length:

(0; 0.2), [0.2; 0.4), ..., [0.8; 1)

We have Uj = 18, 12, 16, 19, 15. Let us consider the wider
hypothesis H′0 : πi = 0.2, i = 1, ..., 5. We obtain
\[ X_n^2 = \frac{1}{n}\sum_{i=1}^k \frac{U_i^2}{\pi_{i0}} - n = \frac{1}{80}\cdot\frac{18^2 + 12^2 + 16^2 + 19^2 + 15^2}{0.2} - 80 = 1.875 \]
The asymptotic P-value is pva = P{χ²(4) > 1.875} = 0.7587. The
data do not contradict the hypothesis H′0. So we have no basis
for rejecting the hypothesis H0. The likelihood ratio test gives
the same result because the value of the statistic Rn is 1.93
and pva = P{χ²(4) > 1.93} = 0.7486.
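The computations of Example 2.1 can be reproduced with a few lines of code (a sketch assumed by the editor, not the book's programs):

```python
import numpy as np
from scipy import stats

# A minimal sketch (not from the book) reproducing Example 2.1 with scipy:
# Pearson's statistic [2.5], the likelihood ratio statistic [2.9] and their
# asymptotic P-values based on the chi-squared distribution with k - 1 = 4 df.
U = np.array([18, 12, 16, 19, 15])
n, k = U.sum(), len(U)
pi0 = np.full(k, 0.2)
E = n * pi0                                          # expected counts n*pi_i0 = 16

X2 = np.sum((U - E) ** 2 / E)                        # Pearson statistic: 1.875
Rn = 2 * np.sum(U * np.log(U / E))                   # likelihood ratio statistic: 1.93
print(X2, stats.chi2.sf(X2, k - 1))                  # 1.875, 0.7587
print(Rn, stats.chi2.sf(Rn, k - 1))                  # 1.93,  0.7486

# scipy's built-in version gives the same Pearson result:
print(stats.chisquare(U, f_exp=E))                   # statistic = 1.875, pvalue = 0.7587
```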
Comment 2.7. Tests [2.8] and [2.10] are used not only for the
simple hypothesis H0 but also directly for the hypothesis H′0
(see the following example).

Example 2.2. Over a long period it has been established that
the proportions of first and second quality units produced in
the factory are 0.35 and 0.6, respectively, and the proportion of
defective units is 0.05. In a quality inspection, 300 units were
checked and 115, 165 and 20 units of the above-considered
qualities were found. Did the quality of the product remain
the same?

In this example U1 = 115, U2 = 165, U3 = 20, n = 300. The
zero hypothesis is

H′0 : π1 = 0.35, π2 = 0.60, π3 = 0.05

The values of the statistics [2.9] and [2.5] are

Rn = 3.717, X²_n = 3.869

The number of degrees of freedom is k − 1 = 3 − 1 = 2.
The P-values corresponding to the likelihood ratio and
Pearson's chi-squared tests are

pv1 = P{χ²(2) > 3.717} = 0.1559 and pv2 = P{χ²(2) > 3.869} = 0.1445

respectively. The data do not contradict the zero hypothesis.
2.3. Pearson’s goodness-of-fit test: composite hypothesis

Suppose that X = (X1, ..., Xn)^T is a simple sample obtained
by observing a random variable X with a cdf from the family
P = {F : F ∈ F}.

Composite hypothesis

H0 : F(x) ∈ F0 = {F0(x; θ), θ ∈ Θ} ⊂ F    [2.12]

meaning that the cdf F belongs to the class F0 of cdfs of the
form F0(x; θ); here θ = (θ1, ..., θs)^T ∈ Θ ⊂ R^s is an unknown
s-dimensional parameter and F0 is a specified cumulative
distribution function.
For example, the hypothesis may mean that the probability
distribution of X belongs to the family of normal, exponential,
Poisson, binomial or other distributions.
As in the previous section, divide the abscissa axis into k >
s + 1 intervals (ai−1, ai] and denote by Uj the number
of observations belonging to the j-th interval, j = 1, 2, ..., k.
The grouped sample U = (U1, ..., Uk)^T has the k-dimensional
multinomial distribution Pk(n, π); here

\[ \pi = (\pi_1, \dots, \pi_k)^T, \qquad \pi_i = P\{X \in (a_{i-1}, a_i]\} = F(a_i) - F(a_{i-1}), \quad F \in F \]

If the hypothesis H0 is true then the wider hypothesis

H′0 : π = π(θ), θ ∈ Θ

also holds; here

\[ \pi(\theta) = (\pi_1(\theta), \dots, \pi_k(\theta))^T, \qquad \pi_i(\theta) = F_0(a_i; \theta) - F_0(a_{i-1}; \theta) \qquad [2.13] \]

The last hypothesis means that the parameters of the
multinomial random vector U can be expressed as specified
functions [2.13] of the parameters θ1, ..., θs; s + 1 < k.
Pearson’s chi-squared statistic (see [2.5])

\[ X_n^2(\theta) = \sum_{i=1}^k \frac{(U_i - n\pi_i(\theta))^2}{n\pi_i(\theta)} = \frac{1}{n}\sum_{i=1}^k \frac{U_i^2}{\pi_i(\theta)} - n \qquad [2.14] \]

cannot be computed because the parameter θ is unknown. It is
natural to replace the unknown parameters in the expression
[2.14] by their estimators and to investigate the properties of
the obtained statistic. If the maximum likelihood estimator of
the parameter θ obtained from the initial non-grouped data is
used then the limit distribution depends on the distribution
F(x; θ) (so on the parameter θ).

We shall see that if grouped data are used then estimators
of the parameter θ can be found such that the limit
distribution of the obtained statistic is chi-squared with k −
s − 1 degrees of freedom, so it does not depend on θ.

Let us consider several examples of such estimators.
1) Under the hypothesis H0, the likelihood function from
the data U and its logarithm are

\[ \tilde L(\theta) = \frac{n!}{U_1! \cdots U_k!} \prod_{i=1}^k \pi_i^{U_i}(\theta), \qquad \tilde\ell(\theta) = \sum_{i=1}^k U_i \ln\pi_i(\theta) + C \qquad [2.15] \]

so the maximum likelihood estimator θ∗n of the parameter θ
from the grouped data verifies the system of equations

\[ \frac{\partial\tilde\ell(\theta)}{\partial\theta_j} = \sum_{i=1}^k \frac{U_i}{\pi_i(\theta)}\,\frac{\partial\pi_i(\theta)}{\partial\theta_j} = 0, \qquad j = 1, 2, \dots, s \qquad [2.16] \]
Let us define the Pearson statistic obtained from [2.14].
Replacing θ by θ∗n in [2.14] we obtain the following statistic:

\[ X_n^2(\theta_n^*) = \sum_{i=1}^k \frac{(U_i - n\pi_i(\theta_n^*))^2}{n\pi_i(\theta_n^*)} \qquad [2.17] \]

2) Another estimator θ̃n of the parameter θ, called the
minimum chi-squared estimator, is obtained by minimizing
[2.14] with respect to θ:

\[ X_n^2(\tilde\theta_n) = \inf_{\theta\in\Theta} X_n^2(\theta) = \inf_{\theta\in\Theta} \sum_{i=1}^k \frac{(U_i - n\pi_i(\theta))^2}{n\pi_i(\theta)} \qquad [2.18] \]
3) To find the estimator θ̃n, complicated systems of
equations must be solved, so statistic [2.14] is often modified
by replacing the denominator by Ui. This method is called
the modified chi-squared minimum method. The estimator θ̄n
obtained by this method is found from the condition

\[ X_n^2(\bar\theta_n) = \inf_{\theta\in\Theta} \sum_{i=1}^k \frac{(U_i - n\pi_i(\theta))^2}{U_i} \qquad [2.19] \]

Besides these three chi-squared type statistics [2.17]–[2.19]
we may use the following.

4) The likelihood ratio statistic obtained from the grouped
data

\[ R_n = -2\ln\frac{\sup_{\theta\in\Theta}\tilde L(\theta)}{\sup_{\pi} L(\pi)}
     = -2\ln\frac{\sup_{\theta\in\Theta}\prod_{i=1}^k \pi_i^{U_i}(\theta)}{\sup_{\pi}\prod_{i=1}^k \pi_i^{U_i}}
     = 2\sum_{i=1}^k U_i \ln\frac{U_i}{n\pi_i(\theta_n^*)} \]
This statistic can be written in the form

\[ R_n = R_n(\theta_n^*) = \inf_{\theta\in\Theta} R_n(\theta), \qquad R_n(\theta) = 2\sum_{i=1}^k U_i\ln\frac{U_i}{n\pi_i(\theta)} \qquad [2.20] \]

We shall show that the statistics X²(θ̃n), X²(θ̄n), X²(θ∗n)
and Rn(θ∗n) are asymptotically equivalent as n → ∞.

Suppose that {Yn} is any sequence of random variables. We
write Yn = oP(1) if Yn →^P 0, and we write Yn = OP(1) if

\[ \forall\, \varepsilon > 0\ \ \exists\, c > 0:\ \sup_n P\{|Y_n| > c\} < \varepsilon \]

Prokhorov's theorem [VAN 00] implies that if there exists Y such that Yn →^d Y
as n → ∞, then Yn = OP(1).
Conditions A:

1) For all i = 1, ..., k and all θ ∈ Θ

0 < πi(θ) < 1, π1(θ) + ··· + πk(θ) = 1

2) The functions πi(θ) have continuous first- and second-order
partial derivatives on the set Θ.

3) The rank of the matrix

\[ B = \left[ \frac{\partial\pi_i(\theta)}{\partial\theta_j} \right]_{k\times s}, \quad i = 1, \dots, k,\ j = 1, \dots, s \]

is s.

Lemma 2.1. Suppose that the hypothesis H0 holds. Then under
Conditions A the estimators πi(θ̃n), πi(θ̄n) and πi(θ∗n) are √n-consistent,
i.e.

\[ \sqrt{n}(\tilde\pi_{in} - \pi_i) = O_P(1) \]
Proof. For brevity we shall not write the argument of the
functions πi(θ).

Let us consider the estimator π̃in. Since 0 ≤ π̃in ≤ 1, we
have π̃in = OP(1). Since Ui/n →^P πi, the inequalities (we use
the definition of the minimum chi-squared estimator)

\[ \sum_{i=1}^k (U_i/n - \tilde\pi_{in})^2 \le \sum_{i=1}^k \frac{(U_i/n - \tilde\pi_{in})^2}{\tilde\pi_{in}} \le \sum_{i=1}^k \frac{(U_i/n - \pi_i)^2}{\pi_i} = o_P(1) \]

imply that for all i: Ui/n − π̃in = oP(1) and

\[ \tilde\pi_{in} - \pi_i = (\tilde\pi_{in} - U_i/n) + (U_i/n - \pi_i) = o_P(1) \]

Since √n(π̂i − πi) →^d Zi ∼ N(0, πi(1 − πi)), we have (Ui −
nπi)/√n = √n(π̂i − πi) = OP(1). So from the inequality

\[ \sum_{i=1}^k \frac{(U_i - n\tilde\pi_{in})^2}{n} \le \sum_{i=1}^k \frac{(U_i - n\tilde\pi_{in})^2}{n\tilde\pi_{in}} \le \sum_{i=1}^k \frac{(U_i - n\pi_i)^2}{n\pi_i} = O_P(1) \]

we have that for all i: (Ui − nπ̃in)/√n = OP(1), and

\[ \sqrt{n}(\tilde\pi_{in} - \pi_i) = \frac{n\tilde\pi_{in} - n\pi_i}{\sqrt{n}} = \frac{n\tilde\pi_{in} - U_i}{\sqrt{n}} + \frac{U_i - n\pi_i}{\sqrt{n}} = O_P(1) \]

Analogously we obtain

\[ \sum_{i=1}^k \frac{(U_i - n\bar\pi_{in})^2}{U_i} \le \sum_{i=1}^k \frac{(U_i - n\pi_i)^2}{U_i} \le \sum_{i=1}^k \frac{\bigl(\sqrt{n}(U_i/n - \pi_i)\bigr)^2}{U_i/n} = O_P(1) \]

and

\[ \sqrt{n}(\bar\pi_{in} - \pi_i) = \frac{n\bar\pi_{in} - U_i}{\sqrt{n}} + \frac{U_i - n\pi_i}{\sqrt{n}} = O_P(1) \]

Let us consider the estimator π∗in. θ∗n is the ML estimator, so
under the conditions of the theorem the sequence √n(θ∗n − θ)
has a limit normal distribution. By the delta method [VAN 00]

\[ \sqrt{n}(\pi_{in}^* - \pi_i) = \sqrt{n}(\pi_i(\theta_n^*) - \pi_i(\theta)) = \dot\pi_i^T(\theta)\sqrt{n}(\theta_n^* - \theta) + o_P(1) = O_P(1) \]

△
Theorem 2.2. Under conditions A the statistics X²(θ̃_n), X²(θ̄_n), X²(θ*_n) and R_n(θ*_n) are asymptotically equivalent as n → ∞:

X^2(\tilde{\theta}_n) = X^2(\bar{\theta}_n) + o_P(1) = X^2(\theta_n^*) + o_P(1) = R_n(\theta_n^*) + o_P(1)

The distribution of each statistic converges to the chi-squared distribution with k − s − 1 degrees of freedom.
Proof. Suppose that θ̂_n is an estimator of θ such that

\hat{\pi}_n = (\hat{\pi}_{1n}, \ldots, \hat{\pi}_{kn})^{T} = \left(\pi_1(\hat{\theta}_n), \ldots, \pi_k(\hat{\theta}_n)\right)^{T}

is a √n-consistent estimator of the parameter π. From the definition of √n-consistency and the convergence U_i/n →_P π_i we have that for all i

\hat{\pi}_{in} - \frac{U_i}{n} = o_P(1), \qquad \sqrt{n}\left(\hat{\pi}_{in} - \frac{U_i}{n}\right) = O_P(1), \qquad \frac{U_i}{n} = \pi_i + o_P(1)

Using the last relations, the Taylor expansion

\ln(1 + x) = x - \frac{x^2}{2} + o(x^2), \qquad x \to 0
and the equality U_1 + · · · + U_k = n, we obtain

\frac{1}{2} R_n(\hat{\theta}_n) = \sum_{i=1}^{k} U_i \ln \frac{U_i}{n\hat{\pi}_i} = -\sum_{i=1}^{k} U_i \ln\left(1 + \left(\frac{n\hat{\pi}_i}{U_i} - 1\right)\right) = -\sum_{i=1}^{k} U_i \ln\left(1 + \frac{\hat{\pi}_i - U_i/n}{U_i/n}\right)

= -\sum_{i=1}^{k} U_i \frac{\hat{\pi}_i - U_i/n}{U_i/n} + \frac{1}{2}\sum_{i=1}^{k} U_i \left(\frac{\hat{\pi}_i - U_i/n}{U_i/n}\right)^2 + \sum_{i=1}^{k} U_i \, o_P\!\left(\left(\frac{\hat{\pi}_i - U_i/n}{U_i/n}\right)^2\right)

= -n\sum_{i=1}^{k} \hat{\pi}_i + \sum_{i=1}^{k} U_i + \frac{1}{2}\sum_{i=1}^{k} \frac{(U_i - n\hat{\pi}_i)^2}{U_i} + o_P(1) = \frac{1}{2}\sum_{i=1}^{k} \frac{(U_i - n\hat{\pi}_i)^2}{U_i} + o_P(1)

= \frac{1}{2}\sum_{i=1}^{k} \frac{(U_i - n\hat{\pi}_i)^2}{n\hat{\pi}_i} - \frac{1}{2}\sum_{i=1}^{k} \frac{(U_i - n\hat{\pi}_i)^3}{U_i \, n\hat{\pi}_i} + o_P(1) = \frac{1}{2}\sum_{i=1}^{k} \frac{(U_i - n\hat{\pi}_i)^2}{n\hat{\pi}_i} + o_P(1) = \frac{1}{2} X_n^2(\hat{\theta}_n) + o_P(1)
Taking θ̂_n = θ*_n and θ̂_n = θ̃_n, the last equalities imply

X_n^2(\theta_n^*) = R_n(\theta_n^*) + o_P(1), \qquad X_n^2(\tilde{\theta}_n) = R_n(\tilde{\theta}_n) + o_P(1)

From the definition of θ̃_n we obtain X²_n(θ̃_n) ≤ X²_n(θ*_n), and from definition [2.20] of R_n we obtain R_n(θ*_n) ≤ R_n(θ̃_n). So

X_n^2(\tilde{\theta}_n) \le X_n^2(\theta_n^*) = R_n(\theta_n^*) + o_P(1) \le R_n(\tilde{\theta}_n) + o_P(1) = X_n^2(\tilde{\theta}_n) + o_P(1)

These inequalities imply

X_n^2(\tilde{\theta}_n) = R_n(\theta_n^*) + o_P(1)

Analogously

X_n^2(\bar{\theta}_n) = R_n(\theta_n^*) + o_P(1)
The (k − 1)-dimensional vector (π_1, . . . , π_{k−1})^T is a function of the s-dimensional parameter θ, so the limit distribution of the likelihood ratio statistic R_n(θ*_n) is chi-squared with k − s − 1 degrees of freedom (see Appendix A, comment A.4). The other statistics considered have the same limit distribution.

△
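Theorem 2.2 can be illustrated by simulation: for data generated under H0 with a large n, the four statistics take nearly equal values. The following sketch uses a hypothetical exponential model, hypothetical cells and one simulated multinomial sample; none of these choices come from the text.

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
a = np.array([0.0, 100.0, 200.0, 300.0, 400.0, np.inf])     # hypothetical cells
theta_true, n = 250.0, 2000                                  # hypothetical model and sample size

def cell_probs(theta):
    return np.diff(1.0 - np.exp(-a / theta))                 # exponential pi_i(theta)

U = rng.multinomial(n, cell_probs(theta_true))               # grouped data simulated under H0

def pearson(theta):   return np.sum((U - n * cell_probs(theta)) ** 2 / (n * cell_probs(theta)))
def modified(theta):  return np.sum((U - n * cell_probs(theta)) ** 2 / U)
def lik_ratio(theta): return 2.0 * np.sum(U * np.log(U / (n * cell_probs(theta))))
def neg_loglik(theta):return -np.sum(U * np.log(cell_probs(theta)))

def argmin(f):
    return minimize_scalar(f, bounds=(50.0, 5000.0), method="bounded").x

theta_star, theta_tilde, theta_bar = argmin(neg_loglik), argmin(pearson), argmin(modified)
print("X2(theta~) =", round(pearson(theta_tilde), 3))
print("X2(theta-) =", round(modified(theta_bar), 3))
print("X2(theta*) =", round(pearson(theta_star), 3))
print("Rn(theta*) =", round(lik_ratio(theta_star), 3))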
The theorem implies the following asymptotic tests.

Chi-squared test: hypothesis H′_0 is rejected with an asymptotic significance level α if

X^2(\hat{\theta}_n) > \chi^2_{\alpha}(k - 1 - s) \qquad [2.21]

here θ̂_n is any of the estimators θ̃_n, θ*_n, θ̄_n.

Likelihood ratio test: hypothesis H′_0 is rejected with an asymptotic significance level α if

R_n(\theta_n^*) > \chi^2_{\alpha}(k - 1 - s) \qquad [2.22]

If hypothesis H′_0 is rejected then the hypothesis H0 is also rejected.
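In practice the critical value χ²_α(k − 1 − s) and the P-value are taken from the chi-squared distribution, for example as follows; this is a sketch with placeholder values of k, s, α and of the statistic.

from scipy.stats import chi2

k, s, alpha = 11, 2, 0.05                      # placeholder values (cells, parameters, level)
stat = 4.377                                   # placeholder value of X^2_n(theta-hat_n)
df = k - 1 - s
critical = chi2.ppf(1.0 - alpha, df)           # chi^2_alpha(k - 1 - s): upper-alpha quantile
p_value = chi2.sf(stat, df)
print("critical value:", round(critical, 3), "  p-value:", round(p_value, 3))
print("reject H0'" if stat > critical else "no reason to reject H0'")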
Example 2.3. In reliability testing the numbers U_i of units failing in the time intervals (a_{i−1}, a_i], i = 1, . . . , 11, were recorded. The data are given in the following table.

 i   (a_{i−1}, a_i]   U_i        i   (a_{i−1}, a_i]   U_i
 1   (0, 100]          8         7   (600, 700]       25
 2   (100, 200]       12         8   (700, 800]       18
 3   (200, 300]       19         9   (800, 900]       15
 4   (300, 400]       23        10   (900, 1000]      14
 5   (400, 500]       29        11   (1000, ∞)        18
 6   (500, 600]       30
Verify the hypothesis stating that the failure times have the Weibull distribution.

By [2.20], the estimator (θ*_n, ν*_n) minimizes the function

R_n(\theta, \nu) = 2\sum_{i=1}^{k} U_i \ln \frac{U_i}{n\pi_i(\theta, \nu)}, \qquad \pi_i(\theta, \nu) = e^{-(a_{i-1}/\theta)^{\nu}} - e^{-(a_i/\theta)^{\nu}}

By differentiating this function with respect to the parameters and equating the partial derivatives to zero, the following system of equations is obtained for the estimators θ*_n and ν*_n:

\sum_{i=1}^{k} U_i \, \frac{a_{i-1}^{\nu}\, e^{-(a_{i-1}/\theta)^{\nu}} - a_i^{\nu}\, e^{-(a_i/\theta)^{\nu}}}{e^{-(a_{i-1}/\theta)^{\nu}} - e^{-(a_i/\theta)^{\nu}}} = 0

\sum_{i=1}^{k} U_i \, \frac{a_{i-1}^{\nu}\, e^{-(a_{i-1}/\theta)^{\nu}} \ln a_{i-1} - a_i^{\nu}\, e^{-(a_i/\theta)^{\nu}} \ln a_i}{e^{-(a_{i-1}/\theta)^{\nu}} - e^{-(a_i/\theta)^{\nu}}} = 0
By solving this system of equations or by directly minimizing the function R_n(θ, ν), we obtain the estimates θ* = 649.516 and ν* = 2.004 of the parameters θ and ν. Minimizing the right-hand sides of [2.18] and [2.19] we obtain the values of the estimators θ̃ = 647.380, ν̃ = 1.979 and θ̄ = 653.675, ν̄ = 2.052.

Using the obtained estimates, we have the following values of the test statistics:

R_n(\theta^*, \nu^*) = 4.047, \quad X_n^2(\theta^*, \nu^*) = 4.377, \quad X_n^2(\tilde{\theta}, \tilde{\nu}) = 4.324, \quad X_n^2(\bar{\theta}, \bar{\nu}) = 3.479

The number of degrees of freedom is k − s − 1 = 8. Since the P-values – 0.853, 0.822, 0.827, 0.901 – are not small, there is no reason to reject the null hypothesis.
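The computations of Example 2.3 can be reproduced with a general-purpose optimizer. The sketch below uses the data of the example; the starting point and the Nelder-Mead method are assumptions of the sketch, and the printed estimates and statistics should be close to, though not necessarily identical with, the values reported above.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

a = np.array([0., 100., 200., 300., 400., 500., 600., 700., 800., 900., 1000., np.inf])
U = np.array([8, 12, 19, 23, 29, 30, 25, 18, 15, 14, 18])
n = U.sum()                                                 # n = 211 observations in k = 11 cells

def probs(par):
    theta, nu = par
    return np.diff(1.0 - np.exp(-(a / theta) ** nu))        # Weibull cell probabilities pi_i(theta, nu)

def Rn(par):    return 2.0 * np.sum(U * np.log(U / (n * probs(par))))
def X2(par):    return np.sum((U - n * probs(par)) ** 2 / (n * probs(par)))
def X2mod(par): return np.sum((U - n * probs(par)) ** 2 / U)

def fit(stat, start=(600.0, 1.5)):
    # penalize invalid parameter values so the simplex stays in theta > 0, nu > 0
    obj = lambda p: 1e12 if (p[0] <= 0 or p[1] <= 0) else stat(p)
    return minimize(obj, start, method="Nelder-Mead").x

par_star, par_tilde, par_bar = fit(Rn), fit(X2), fit(X2mod)
df = len(U) - 2 - 1                                         # k - s - 1 = 8

for name, stat in [("Rn(theta*, nu*) ", Rn(par_star)),
                   ("X2(theta*, nu*) ", X2(par_star)),
                   ("X2(theta~, nu~) ", X2(par_tilde)),
                   ("X2(theta-, nu-) ", X2mod(par_bar))]:
    print(name, round(stat, 3), "  p-value:", round(chi2.sf(stat, df), 3))
print("(theta*, nu*) ≈", np.round(par_star, 3))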
2.4. Modified chi-squared test for composite
hypotheses
The classical Pearson’s chi-squared test has drawbacks,
especially in the case of continuous distributions.
Nonparametric Tests For Complete Data Vilijandas Bagdonavicius

  • 1. Nonparametric Tests For Complete Data Vilijandas Bagdonavicius download https://ebookbell.com/product/nonparametric-tests-for-complete- data-vilijandas-bagdonavicius-4309000 Explore and download more ebooks at ebookbell.com
  • 2. Here are some recommended products that we believe you will be interested in. You can click the link to download. Nonparametric Tests For Censored Data Vilijandas Bagdonavicius https://ebookbell.com/product/nonparametric-tests-for-censored-data- vilijandas-bagdonavicius-4308998 Nonparametric Tuning Of Pid Controllers A Modified Relayfeedbacktest Approach 1st Edition Igor Boiko Auth https://ebookbell.com/product/nonparametric-tuning-of-pid-controllers- a-modified-relayfeedbacktest-approach-1st-edition-igor-boiko- auth-4324482 Theory Of Nonparametric Tests 1st Ed Thorsten Dickhaus https://ebookbell.com/product/theory-of-nonparametric-tests-1st-ed- thorsten-dickhaus-7149086 Nonparametric Statistical Tests A Computational Approach Markus Neuhauser https://ebookbell.com/product/nonparametric-statistical-tests-a- computational-approach-markus-neuhauser-4393802
  • 3. Nonparametric Monte Carlo Tests And Their Applications 1st Edition Lixing Zhu Auth https://ebookbell.com/product/nonparametric-monte-carlo-tests-and- their-applications-1st-edition-lixing-zhu-auth-1291718 Statistical Tests Of Nonparametric Hypotheses Asymptotic Theory 1st Edition Odile Pons https://ebookbell.com/product/statistical-tests-of-nonparametric- hypotheses-asymptotic-theory-1st-edition-odile-pons-5137748 Introduction To Statistics The Nonparametric Way Springer Texts In Statistics Softcover Reprint Of The Original 1st Ed 1991 Noether https://ebookbell.com/product/introduction-to-statistics-the- nonparametric-way-springer-texts-in-statistics-softcover-reprint-of- the-original-1st-ed-1991-noether-55472522 An Introduction To Nonparametric Statistics Chapman Hallcrc Texts In Statistical Science 1st Edition John E Kolassa https://ebookbell.com/product/an-introduction-to-nonparametric- statistics-chapman-hallcrc-texts-in-statistical-science-1st-edition- john-e-kolassa-51992562 Nonparametric Statistics For Applied Linguistics Research 1st Edition Hassan Soleimani https://ebookbell.com/product/nonparametric-statistics-for-applied- linguistics-research-1st-edition-hassan-soleimani-33392562
  • 5. Non-parametric Tests for Complete Data
  • 6. Non-parametric Tests for Complete Data Vilijandas Bagdonavičius Julius Kruopis Mikhail S. Nikulin
  • 7. First published 2011 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Ltd John Wiley & Sons, Inc. 27-37 St George’s Road 111 River Street London SW19 4EU Hoboken, NJ 07030 UK USA www.iste.co.uk www.wiley.com © ISTE Ltd 2011 The rights of Vilijandas Bagdonaviçius, Julius Kruopis and Mikhail S. Nikulin to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. Library of Congress Cataloging-in-Publication Data Bagdonavicius, V. (Vilijandas) Nonparametric tests for complete data / Vilijandas Bagdonavicius, Julius Kruopis, Mikhail Nikulin. p. cm. Includes bibliographical references and index. ISBN 978-1-84821-269-5 (hardback) 1. Nonparametric statistics. 2. Statistical hypothesis testing. I. Kruopis, Julius. II. Nikulin, Mikhail (Mikhail S.) III. Title. QA278.8.B34 2010 519.5--dc22 2010038271 British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library ISBN 978-1-84821-269-5 Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne.
  • 8. Table of Contents Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi Terms and Notation . . . . . . . . . . . . . . . . . . . . . xv Chapter 1. Introduction . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . . . 1 1.2. Examples of hypotheses in non-parametric models . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2.1. Hypotheses on the probability distribution of data elements . . . . . . . . . . . . . . . . . 2 1.2.2. Independence hypotheses . . . . . . . . . . . 4 1.2.3. Randomness hypothesis . . . . . . . . . . . . 4 1.2.4. Homogeneity hypotheses . . . . . . . . . . . . 4 1.2.5. Median value hypotheses . . . . . . . . . . . 5 1.3. Statistical tests . . . . . . . . . . . . . . . . . . . . 5 1.4. P-value . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.5. Continuity correction . . . . . . . . . . . . . . . . 10 1.6. Asymptotic relative efficiency . . . . . . . . . . . 13 Chapter 2. Chi-squared Tests . . . . . . . . . . . . . . 17 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 17 2.2. Pearson’s goodness-of-fit test: simple hypothesis 17 1.1. Statistical hypotheses .
  • 9. vi Non-parametric Tests for Complete Data 2.3. Pearson’s goodness-of-fit test: composite hypothesis . . . . . . . . . . . . . . . . . . . . . . . 26 2.4. Modified chi-squared test for composite hypotheses . . . . . . . . . . . . . . . . . . . . . . . 34 2.4.1. General case . . . . . . . . . . . . . . . . . . . . 35 2.4.2. Goodness-of-fit for exponential distributions 41 2.4.3. Goodness-of-fit for location-scale and shape-scale families . . . . . . . . . . . . . . . 43 2.5. Chi-squared test for independence . . . . . . . . 52 2.6. Chi-squared test for homogeneity . . . . . . . . . 57 2.7. Bibliographic notes . . . . . . . . . . . . . . . . . . 64 2.8. Exercises . . . . . . . . . . . . . . . . . . . . . . . . 64 2.9. Answers . . . . . . . . . . . . . . . . . . . . . . . . . 72 Chapter 3. Goodness-of-fit Tests Based on Empirical Processes . . . . . . . . . . . . . . . . . . . . . 77 3.1. Test statistics based on the empirical process . 77 3.2. Kolmogorov–Smirnov test . . . . . . . . . . . . . 82 3.3. ω2, Cramér–von-Mises and Andersen–Darling tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 3.4. Modifications of Kolmogorov–Smirnov, Cramér–von-Mises and Andersen–Darling tests: composite hypotheses . . . . . . . . . . . . 91 3.5. Two-sample tests . . . . . . . . . . . . . . . . . . . 98 3.5.1. Two-sample Kolmogorov–Smirnov tests . . . 98 3.5.2. Two-sample Cramér–von-Mises test . . . . . 103 3.6. Bibliographic notes 104 3.7. Exercises . . . . . . . . . . . . . . . . . . . . . . . . 106 3.8. Answers 109 Chapter 4. Rank Tests . . . . . . . . . . . . . . . . . . . 111 4.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 111 4.2. Ranks and their properties 112 4.3. Rank tests for independence . . . . . . . . . . . . 117 4.3.1. Spearman’s independence test . . . . . . . . . 117 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
  • 10. Table of Contents vii 4.3.2. Kendall’s independence test . . . . . . . . . . 124 4.3.3. ARE of Kendall’s independence test with respect to Pearson’s independence test under normal distribution . . . . . . . . . . . 133 4.3.4. Normal scores independence test . . . . . . . 137 4.4. Randomness tests . . . . . . . . . . . . . . . . . . 139 4.4.1. Kendall’s and Spearman’s randomness tests 140 4.4.2. Bartels–Von Neuman randomness test . . . 143 4.5. Rank homogeneity tests for two independent samples . . . . . . . . . . . . . . . . . . . . . . . . 146 4.5.1. Wilcoxon (Mann–Whitney–Wilcoxon) rank sum test. . . . . . . . . . . . . . . . . . . . . . . 146 4.5.2. Power of the Wilcoxon rank sum test against location alternatives . . . . . . . . . . 153 4.5.3. ARE of the Wilcoxon rank sum test with respect to the asymptotic Student’s test . . 155 4.5.4. Van der Warden’s test . . . . . . . . . . . . . . 161 4.5.5. Rank homogeneity tests for two independent samples under a scale alternative . . . . . . . . . . . . . . . . . . . . . 163 4.6. Hypothesis on median value: the Wilcoxon signed ranks test . . . . . . . . . . . . . . . . . . . 168 4.6.1. Wilcoxon’s signed ranks tests . . . . . . . . . 168 4.6.2. ARE of the Wilcoxon signed ranks test with respect to Student’s test . . . . . . . . . . . . . 177 4.7. Wilcoxon’s signed ranks test for homogeneity of two related samples . . . . . . . . . . . . . . . . . 180 4.8. Test for homogeneity of several independent samples: Kruskal–Wallis test . . . . . . . . . . . 181 4.9. Homogeneity hypotheses for k related samples: Friedman test . . . . . . . . . . . . . . . . . . . . . 191 4.10. Independence test based on Kendall’s concordance coefficient . . . . . . . . . . . . . . . 204 4.11. Bibliographic notes . . . . . . . . . . . . . . . . . . 208 4.12. Exercises . . . . . . . . . . . . . . . . . . . . . . . . 209 4.13. Answers . . . . . . . . . . . . . . . . . . . . . . . . . 212
Chapter 5. Other Non-parametric Tests 215
5.1. Sign test 215
5.1.1. Introduction: parametric sign test 215
5.1.2. Hypothesis on the nullity of the medians of the differences of random vector components 218
5.1.3. Hypothesis on the median value 220
5.2. Runs test 221
5.2.1. Runs test for randomness of a sequence of two opposite events 223
5.2.2. Runs test for randomness of a sample 226
5.2.3. Wald–Wolfowitz test for homogeneity of two independent samples 228
5.3. McNemar's test 231
5.4. Cochran test 238
5.5. Special goodness-of-fit tests 245
5.5.1. Normal distribution 245
5.5.2. Exponential distribution 253
5.5.3. Weibull distribution 260
5.5.4. Poisson distribution 262
5.6. Bibliographic notes 268
5.7. Exercises 269
5.8. Answers 271
APPENDICES 275
Appendix A. Parametric Maximum Likelihood Estimators: Complete Samples 277
Appendix B. Notions from the Theory of Stochastic Processes 281
B.1. Stochastic process 281
B.2. Examples of stochastic processes 282
B.2.1. Empirical process 282
B.2.2. Gauss process 283
B.2.3. Wiener process (Brownian motion) 283
B.2.4. Brownian bridge 284
B.3. Weak convergence of stochastic processes 285
B.4. Weak invariance of empirical processes 286
B.5. Properties of Brownian motion and Brownian bridge 287
Bibliography 293
Index 305
Preface

Testing hypotheses in non-parametric models is discussed in this book. A statistical model is non-parametric if it cannot be written in terms of a finite-dimensional parameter. The main hypotheses tested in such models are hypotheses on the probability distribution of data elements, and homogeneity, randomness and independence hypotheses. Tests for such hypotheses from complete samples are considered in many books on non-parametric statistics, including recent monographs by Maritz [MAR 95], Hollander and Wolfe [HOL 99], Sprent and Smeeton [SPR 01], Govindarajulu [GOV 07], Gibbons and Chakraborti [GIB 09] and Corder and Foreman [COR 09]. This book contains tests from complete samples. Tests for censored samples can be found in our book Tests for Censored Samples [BAG 1]. In Chapter 1, the basic ideas of hypothesis testing and general hypotheses on non-parametric models are briefly described. In the initial phase of the solution of any statistical problem the analyst must choose a model for data analysis. The correctness of the data analysis strongly depends on the choice
of an appropriate model. Goodness-of-fit tests are used to check the adequacy of a model for real data. One of the most widely applied classes of goodness-of-fit tests is the class of chi-squared type tests, which use grouped data. In many books on statistical data analysis, chi-squared tests are applied incorrectly. Classical chi-squared tests are based on theoretical results which are obtained assuming that the ends of the grouping intervals do not depend on the sample, and that the parameters are estimated using grouped data. In real applications, these assumptions are often forgotten. The modified chi-squared tests considered in Chapter 2 do not suffer from such drawbacks. They are based on the assumption that the ends of the grouping intervals depend on the data, and the parameters are estimated using the initially non-grouped data. Another class of goodness-of-fit tests, based on functionals of the difference of the empirical and theoretical cumulative distribution functions, is described in Chapter 3. For composite hypotheses the classical test statistics are modified by replacing the unknown parameters by their estimators. Application of these tests is often incorrect because the critical values of the classical tests are used when testing a composite hypothesis with the modified statistics. In section 5.5, special goodness-of-fit tests which are not from the two above-mentioned classes, and which are specially designed for specified probability distributions, are given. Tests for the equality of probability distributions (homogeneity tests) of two or more independent or dependent random variables are considered in several chapters. Chi-squared type tests are given in section 2.6 and tests based on functionals of the difference of empirical distribution functions are given in section 3.5. For many alternatives, the
most efficient tests are the rank tests for homogeneity given in sections 4.5 and 4.7–4.9. Classical tests for the independence of random variables are given in sections 2.5 (tests of chi-squared type) and 4.3 and 4.10 (rank tests). Tests for data randomness are given in sections 4.4 and 5.2. All tests are described in the following way: 1) a hypothesis is formulated; 2) the idea of the test construction is given; 3) the statistic on which the test is based is given; 4) a finite-sample and (or) asymptotic distribution of the test statistic is found; 5) the test, and often its modifications (continuity correction, data with ex aequo, various approximations of the asymptotic law), are given; 6) practical examples of application of the tests are given; and 7) at the end of each chapter exercises with answers are given. Anyone who uses non-parametric methods of mathematical statistics, or wants to know the ideas behind the tests and their mathematical substantiation, can use this book. It can be used as a textbook for a one-semester course on non-parametric hypothesis testing. Knowledge of probability and parametric statistics is needed to follow the mathematical developments. The basic facts on probability and parametric statistics used in the book are also given in the appendices. The book consists of five chapters and appendices. In each chapter, the numbering of theorems, formulas and comments uses the chapter number. The book was written using lecture notes for graduate students at Vilnius and Bordeaux universities.
  • 16. xiv Non-parametric Tests for Complete Data We thank our colleagues and students at Vilnius and Bordeaux universities for comments on the content of this book, especially Rūta Levulienė for writing the computer programs needed for application of the tests and solutions of all the exercises. Vilijandas BAGDONAVIČIUS Julius KRUOPIS Mikhail NIKULIN
Terms and Notation

||A|| – the norm (Σi Σj a²ij)^(1/2) of a matrix A = [aij];
A > B (A ≥ B) – the matrix A − B is positive (non-negative) definite;
a ∨ b (a ∧ b) – the maximum (the minimum) of the numbers a and b;
ARE – the asymptotic relative efficiency;
B(n, p) – binomial distribution with parameters n and p;
B−(n, p) – negative binomial distribution with parameters n and p;
Be(γ, η) – beta distribution with parameters γ and η;
cdf – the cumulative distribution function;
CLT – the central limit theorem;
Cov(X, Y) – the covariance of random variables X and Y;
Cov(X, Y) – the covariance matrix of random vectors X and Y;
EX – the mean of a random variable X;
E(X) – the mean of a random vector X;
Eθ(X), E(X|θ), Varθ(X), Var(X|θ) – the mean or the variance of a random variable X depending on the parameter θ;
E(λ) – exponential distribution with parameter λ;
F(m, n) – Fisher distribution with m and n degrees of freedom;
F(m, n; δ) – non-central Fisher distribution with m and n degrees of freedom and non-centrality parameter δ;
Fα(m, n) – α critical value of the Fisher distribution with m and n degrees of freedom;
FT(x) (fT(x)) – the cdf (the pdf) of the random variable T;
f(x; θ), f(x|θ) – the pdf depending on a parameter θ;
F(x; θ), F(x|θ) – the cdf depending on a parameter θ;
G(λ, η) – gamma distribution with parameters λ and η;
iid – independent identically distributed;
LN(µ, σ) – lognormal distribution with parameters µ and σ;
LS – least-squares (method, estimator);
ML – maximum likelihood (function, method, estimator);
N(0, 1) – standard normal distribution;
N(µ, σ²) – normal distribution with parameters µ and σ²;
Nk(µ, Σ) – k-dimensional normal distribution with mean vector µ and covariance matrix Σ;
P(λ) – Poisson distribution with parameter λ;
pdf – the probability density function;
P{A} – the probability of an event A;
P{A|B} – the conditional probability of the event A given the event B;
Pθ{A}, P{A|θ} – the probability depending on a parameter θ;
Pk(n, π) – k-dimensional multinomial distribution with parameters n and π = (π1, ..., πk)T, π1 + ... + πk = 1;
rv – random variable;
S(n) – Student's distribution with n degrees of freedom;
S(n; δ) – non-central Student's distribution with n degrees of freedom and non-centrality parameter δ;
tα(n) – α critical value of Student's distribution with n degrees of freedom;
U(α, β) – uniform distribution in the interval (α, β);
UMP – uniformly most powerful (test);
UUMP – unbiased uniformly most powerful (test);
VarX – the variance of a random variable X;
Var(X) – the covariance matrix of a random vector X;
W(θ, ν) – Weibull distribution with parameters θ and ν;
X, Y, Z, ... – random variables;
X, Y, Z, ... – random vectors;
XT – the transposed vector X, i.e. a row vector;
||x|| – the length (xTx)^(1/2) = (Σi x²i)^(1/2) of a vector x = (x1, ..., xk)T;
X ∼ N(µ, σ²) – the random variable X is normally distributed with parameters µ and σ² (analogously in the case of other distributions);
Xn →P X – convergence in probability (n → ∞);
Xn →a.s. X – almost sure convergence or convergence with probability 1 (n → ∞);
Xn →d X, Fn(x) →d F(x) – weak convergence or convergence in distribution (n → ∞);
Xn →d X ∼ N(µ, σ²) – the random variables Xn are asymptotically (n → ∞) normally distributed with parameters µ and σ²;
Xn ∼ Yn – the random variables Xn and Yn are asymptotically (n → ∞) equivalent (Xn − Yn →P 0);
x(P) – P-th quantile;
xP – P-th critical value;
zα – α critical value of the standard normal distribution;
Σ = [σij]k×k – covariance matrix;
χ²(n) – chi-squared distribution with n degrees of freedom;
χ²(n; δ) – non-central chi-squared distribution with n degrees of freedom and non-centrality parameter δ;
χ²α(n) – α critical value of the chi-squared distribution with n degrees of freedom.
  • 21. Chapter 1 Introduction 1.1. Statistical hypotheses The simplest model of statistical data is a simple sample, i.e. a vector X = (X1, ..., Xn)T of n independent identically distributed random variables. In real experiments the values xi of the random variables Xi are observed (measured). The non-random vector x = (x1, ..., xn)T is a realization of the simple sample X. In more complicated experiments the elements Xi are dependent, or not identically distributed, or are themselves random vectors. The random vector X is then called a sample, not a simple sample. Suppose that the cumulative distribution function (cdf) F of a sample X (or of any element Xi of a simple sample) belongs to a set F of cumulative distribution functions. For example, if the sample is simple then F may be the set of absolutely continuous, discrete, symmetric, normal, Poisson cumulative distribution functions. The set F defines a statistical model. Suppose that F0 is a subset of F.
The statistical hypothesis H0 is the following assertion: the cumulative distribution function F belongs to the set F0. We write H0 : F ∈ F0. The hypothesis H1 : F ∈ F1, where F1 = F \ F0 is the complement of F0 in F, is called the alternative to the hypothesis H0. If F = {Fθ, θ ∈ Θ ⊂ R^m} is defined by a finite-dimensional parameter θ then the model is parametric. In this case the statistical hypothesis is a statement on the values of the finite-dimensional parameter θ. In this book non-parametric models are considered. A statistical model F is called non-parametric if F is not defined by a finite-dimensional parameter. If the set F0 contains only one element of the set F then the hypothesis is simple, otherwise the hypothesis is composite.

1.2. Examples of hypotheses in non-parametric models

Let us look briefly and informally at examples of the hypotheses which will be considered in the book. We do not formulate concrete alternatives, only suppose that the models are non-parametric. Concrete alternatives will be formulated in the chapters on the specific hypotheses.

1.2.1. Hypotheses on the probability distribution of data elements

The first class of hypotheses considered in this book consists of hypotheses on the form of the cdf F of the elements of a sample. Such hypotheses may be simple or composite.
A simple hypothesis has the form H0 : F = F0; here F0 is a specified cdf. For example, such a hypothesis may mean that the n numbers generated by a computer are realizations of random variables having the uniform U(0, 1), Poisson P(2), normal N(0, 1) or another completely specified distribution. A composite hypothesis has the form H0 : F ∈ F0 = {Fθ, θ ∈ Θ}, where the Fθ are cdfs of known analytical form depending on the finite-dimensional parameter θ ∈ Θ. For example, this may mean that the salaries of the doctors in a city are normally distributed, or that the failure times of TV sets produced by a factory have the Weibull distribution. More general composite hypotheses, meaning that the data verify some parametric or semi-parametric regression model, may be considered. For example, in investigating the influence of some factor z on the survival time the following hypothesis on the cdf Fi of the i-th sample element may be used:

Fi(x) = 1 − {1 − F0(x)}^{exp(βzi)}, i = 1, . . . , n

where F0 is an unknown baseline cdf, β is an unknown scalar parameter and zi is a known value of the factor for the i-th sample element. The following tests for simple hypotheses are considered: chi-squared tests (section 2.2) and tests based on the difference of the empirical and theoretical cumulative distribution functions (sections 3.2 and 3.3). The following tests for composite hypotheses are considered: general tests such as chi-squared tests (sections 2.3 and 2.4), tests based on the difference of non-parametric and parametric estimators of the cumulative distribution function (section 3.4), and also special tests for specified families of probability distributions (section 5.5).
  • 24. 4 Non-parametric Tests for Complete Data 1.2.2. Independence hypotheses Suppose that (Xi, Yi)T , i = 1, 2...., n is a simple sample of the random vector (X, Y )T with the cdf F = F(x, y) ∈ F; here F is a non-parametric class two-dimensional cdf. An independence hypothesis means that the components X and Y are independent. For example, this hypothesis may mean that the sum of sales of managers X and the number of complaints from consumers Y are independent random variables. The following tests for independence of random variables are considered: chi-squared independence tests (section 2.5) and rank tests (sections 4.3 and 4.10). 1.2.3. Randomness hypothesis A randomness hypothesis means that the observed vector x = (x1, ..., xn)T is a realization of a simple sample X = (X1, ..., Xn)T , i.e. of a random vector with independent and identically distributed (iid) components. The following tests for randomness hypotheses are considered: runs tests (section 5.2) and rank tests (section 4.4). 1.2.4. Homogeneity hypotheses A homogeneity hypothesis of two independent simple samples X = (X1, ..., Xm)T and Y = (Y1, ..., Yn)T means that the cdfs F1 and F2 of the random variables Xi and Yj coincide. The homogeneity hypothesis of k > 2 independent samples is formulated analogously. The following tests for homogeneity of independent simple samples are considered: chi-squared tests (section 2.6), tests
  • 25. Introduction 5 based on the difference of cumulative distribution functions (section 3.5), rank tests (sections 4.5 and 4.8), and some special tests (section 5.1). If n independent random vectors Xi = (Xi1, ..., Xik)T , i = 1, ..., n are observed then the vectors (X1j, ..., Xnj)T composed of the components are k dependent samples, j = 1, ..., k. The homogeneity hypotheses of k related samples means the equality of the cdfs F1, ..., Fk of the components Xi1, ..., Xik. The following tests for homogeneity of related samples are considered: rank tests (sections 4.7 and 4.9) and other special tests (sections 5.1, 5.3 and 5.4). 1.2.5. Median value hypotheses Suppose that X = (X1, ..., Xn)T is a simple sample of a continuous random variable X. Denote by M the median of the random variable X. The median value hypothesis has the form H : M = M0; here M0 is a specified value of the median. The following tests for this hypothesis are considered: sign tests (section 5.1) and rank tests (section 4.6). 1.3. Statistical tests A statistical test or simply a test is a rule which enables a decision to be made on whether or not the zero hypothesis H0 should be rejected on the basis of the observed realization of the sample. Any test considered in this book is based on the values of some statistic T = T(X) = T(X1, ..., Xn), called the test statistic. Usually the statistic T takes different values under the hypothesis H0 and the alternative H1. If the statistic T has a tendency to take smaller (greater) values under
  • 26. 6 Non-parametric Tests for Complete Data the hypothesis H0 than under the alternative H1 then the hypothesis H0 is rejected in favor of the alternative if T > c (T < c, respectively), where c is a well-chosen real number. If the values of the statistic T have a tendency to concentrate in some interval under the hypothesis and outside this interval under the alternative then the hypothesis H0 is rejected in favor of the alternative if T < c1 or T > c2, where c1 and c2 are well-chosen real numbers. Suppose that the hypothesis H0 is rejected if T > c (the other two cases are considered similarly). The probability β(F) = PF {T > c} of rejecting the hypothesis H0 when the true cumulative distribution function is a specified function F ∈ F is called the power function of the test. When using a test, two types of error are possible: 1. The hypothesis H0 is rejected when it is true, i.e. when F ∈ F0. Such an error is called a type I error. The probability of this error is β(F), F ∈ F0. 2. The hypothesis H0 is not rejected when it is false, i.e. when F ∈ F1. Such an error is called a type II error. The probability of this error is 1 − β(F), F ∈ F1. The number supF∈F0 β(F) [1.1] is called the significance level of the test . Fix α ∈ (0, 1). If the significance level does not exceed α then for any F ∈ F0 the type I error does not exceed α.
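As an illustration of these notions (ours, not an example from the book), the power function β(F) of a simple one-sided test can be estimated by simulation. In the following sketch the statistic T is the number of sample elements exceeding zero, the critical value c is chosen so that the significance level does not exceed α = 0.05, and the alternatives F are taken to be normal with a positive shift; these choices, and the use of Python with NumPy and SciPy, are assumptions made only for the illustration.

# Illustrative sketch: Monte Carlo estimate of the power function
# beta(F) = P_F{T > c} for a one-sided test based on the statistic
# T = number of sample elements exceeding 0 (a sign-type statistic).
import numpy as np
from scipy.stats import binom

alpha, n = 0.05, 30
# Under H0 (continuous F with median 0) T ~ B(n, 1/2); choose the smallest c
# with P0{T > c} <= alpha, so the significance level does not exceed alpha.
c = int(binom.ppf(1 - alpha, n, 0.5))
while binom.sf(c, n, 0.5) > alpha:
    c += 1
print("critical value c =", c, " attained level =", binom.sf(c, n, 0.5))

def beta(shift, n_rep=20000, rng=np.random.default_rng(0)):
    """Monte Carlo estimate of the power when F is N(shift, 1)."""
    x = rng.normal(shift, 1.0, size=(n_rep, n))
    t = (x > 0).sum(axis=1)
    return (t > c).mean()

for shift in [0.0, 0.25, 0.5, 1.0]:   # shift = 0 corresponds to F in F0
    print(f"shift={shift:4.2f}  estimated beta(F)={beta(shift):.3f}")

For shift = 0 the estimated β(F) stays below α (the type I error), while for positive shifts it gives the type II error through 1 − β(F).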
  • 27. Introduction 7 Usually tests with significance level values not greater than α = 0.1; 0.05; 0.01 are used. If the distribution of the statistic T is absolutely continuous then, usually, for any α ∈ (0, 1) we can find a test based on this statistic such that the significance level is equal to α. A test with a significance level not greater than α is called unbiased, if inf F∈F1 β(F) ≥ α [1.2] This means that the zero hypothesis is rejected with greater probability under any specified alternative than under the zero hypothesis. Let T be a class of test statistics of unbiased tests with a significance level not greater than α. The statistic T defines the uniformly most powerful unbiased test in the class T if βT (F) ≥ βT∗ (F) for all T∗ ∈ T and for all F ∈ F1. A test is called consistent if for all F ∈ F1 β(F) → 1, as n → ∞ [1.3] This means that if n is large then under any specified alternative the probability of rejecting the zero hypothesis is near to 1. 1.4. P-value Suppose that a simple statistical hypothesis H0 is rejected using tests of one of the following forms: 1) T ≥ c; 2) T ≤ c; or 3) T ≤ c1 or T ≥ c2; here T = T(X) is a test statistic based on the sample X = (X1, . . . , Xn)T . We write P0{A} = P{A|H0}.
  • 28. 8 Non-parametric Tests for Complete Data Fix α ∈ (0, 1). The first (second) test has a significance level not greater than α, and nearest to α if the constant c = inf{s : P0{T ≥ s} ≤ α} (c = sup{s : P0{T ≤ s} ≤ α}). The third test has a significance level not greater than α if c1 = sup{s : P{T ≤ s} ≤ α/2} and c2 = inf{s : P{T ≥ s} ≤ α/2}. Denote by t the observed value of the statistic T. In the case of the tests of the first two forms the P-values are defined as the probabilities pv = P0{T ≥ t} and pv = P0{T ≤ t}. Thus the P-value is the probability that under the zero hypothesis H0 the statistic T takes a value more distant than t in the direction of the alternative (in the first case to the right, in the second case to the left, from t). In the third case, if P0{T ≤ t} ≤ P0{T ≥ t} then pv/2 = P0{T ≤ t} and if P0{T ≤ t} ≥ P0{T ≥ t} then pv/2 = P0{T ≥ t} So in the third case the P-value is defined as follows pv = 2 min{P0{T ≤ t}, P0{T ≥ t}} = 2 min{FT (t), 1 − FT (t−)} where FT is the cdf of the statistic T under the zero hypothesis H0. If the distribution of T is absolutely continuous and symmetric with respect to the origin, the last formula implies pv = 2 min{FT (t), FT (−t)} = 2FT (− | t |) = 2{1 − FT (| t |)}
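A small sketch (ours, not the authors') of these P-value definitions for an integer-valued statistic may help; the binomial null distribution B(20, 0.5) used here anticipates Example 1.1 below, and SciPy is assumed to be available.

# Sketch of the P-value definitions of section 1.4 for a discrete statistic T
# with null distribution B(n, 1/2): one-sided pv = P0{T >= t} or P0{T <= t},
# and two-sided pv = 2 min{F_T(t), 1 - F_T(t-)}.
from scipy.stats import binom

n, p0 = 20, 0.5
t = 13                                  # observed value of T

pv_right = binom.sf(t - 1, n, p0)       # P0{T >= t}
pv_left  = binom.cdf(t, n, p0)          # P0{T <= t}
# F_T(t-) = P0{T < t} = P0{T <= t-1} for an integer-valued statistic
pv_two   = 2 * min(binom.cdf(t, n, p0), 1 - binom.cdf(t - 1, n, p0))

print(pv_right, pv_left, pv_two)        # approximately 0.1316, 0.9423, 0.2632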
  • 29. Introduction 9 If the result observed during the experiment is a rare event when the zero hypothesis is true then the P-value is small and the hypothesis should be rejected. This is confirmed by the following theorem. Theorem 1.1. Suppose that the test is of any of the three forms considered above. For the experiment with the value t of the statistic T the inequality pv ≤ α is equivalent to the rejection of the zero hypothesis. Proof. Let us consider an experiment where T = t. If the test is defined by the inequality T ≥ c (T ≤ c) then c = inf{s : P0{T ≥ s} ≤ α} (c = sup{s : P0{T ≤ s} ≤ α}) and P0{T ≤ t} = pv (P0{T ≥ t} = pv). So the inequality pv ≤ α is equivalent to the inequality t ≥ c (t ≤ c). The last inequalities mean that the hypothesis is rejected. If the test is defined by the inequalities T ≤ c1 or T ≥ c2 then c1 = sup{s : P0{T ≤ s} ≤ α/2}, c2 = inf{s : P0{T ≥ s} ≤ α/2} and 2 min{P0{T ≤ t}, P0{T ≥ t}} = pv. So the inequality pv ≤ α means that 2 min{P0{T ≤ t, P0{T ≥ t}} ≤ α. If P0{T ≤ t} ≥ P0{T ≥ t}, then the inequality pv ≤ α means that P0{T ≥ t} ≤ α/2. This is equivalent to the inequality t ≥ c2, which means that the hypothesis is rejected. Analogously, if P0{T ≤ t} ≥ P0{T ≥ t} then the inequality pv ≤ α means that P0{T ≤ t} ≤ α/2. This is equivalent to the inequality t ≤ c1, which means that the hypothesis is rejected. So in both cases the inequality pv ≤ α means that the hypothesis is rejected. △ If the critical region is defined by the asymptotic distribution of T (usually normal or chi-squared) then the P- value pva is computed using the asymptotic distribution of T, and it is called the asymptotic P-value.
  • 30. 10 Non-parametric Tests for Complete Data Sometimes the P-value pv is interpreted as random because each value t of T defines a specific value of pv. In the case of the alternatives considered above the P-values are the realizations of the following random variables: 1 − FT (T−), FT (T) and 2 min{FT (T), 1 − FT (T−)} 1.5. Continuity correction If the distribution of the statistic T is discrete and the asymptotic distribution of this statistic is absolutely continuous (usually normal) then for medium-sized samples the approximation of distribution T can be improved using the continuity correction [YAT 34]. The idea of a continuity correction is explained by the following example. Example 1.1. Let us consider the parametric hypothesis: H : p = 0.5 and the alternative H1 : p > 0.5; here p is the Bernoulli distribution parameter. For example, suppose that during n = 20 Bernoulli trials the number of successes is T = 13. It is evident that the hypothesis H0 is rejected if the statistic T takes large values, i.e. if T ≥ c for a given c. Under H0, the statistic T has the binomial distribution B(20; 0.5). The exact P-value is pv = P{T ≥ 13} = 20 X i=13 Ci 20(1/2)20 = I1/2(13, 8) = 0.131588 Using the normal approximation Zn = (T − 0.5n)/ √ 0.25n = (T − 10)/ √ 5 d → Z ∼ N(0, 1) we obtain the asymptotic P-value pva = P{T ≥ 13} = P{ T − 10 √ 5 ≥ 13 − 10 √ 5 } ≈
  • 31. Introduction 11 1 − Φ( 13 − 10 √ 5 ) = 0.089856 This is considerably smaller than the exact P-value. Note that P{T ≥ 13} = P{T > 12}. If we use the normal approximation then the same probability may be approximated by: 1 − Φ((13 − 10)/ √ 5) = 0.089856 or by 1 − Φ((12 − 10)/ √ 5) = 0.185547 Both approximations are far from the exact P-value. The continuity correction is therefore performed using the normal approximation in the center 12.5 of the interval (12, 13]. So the asymptotic P-value with a continuity correction is pvcc = 1 − Φ((13 − 0.5 − 10)/ √ 5) = 0.131776 The obtained value is very similar to the exact P-value. In the case of the alternative H2 : p < 0.5 the zero hypothesis is rejected if T ≤ d, and the P-value is pv = P{T ≤ 13} = 13 X i=0 Ci 20(1/2)2 0 = I1/2(7, 14) = 0.942341 In this case pva = Φ((13 − 10)/ √ 5) = 0.910144 Note that P{T ≤ 13} = P{T < 14}. If we use the normal approximation, the same probability is approximated by Φ((13 − 10)/ √ 5) = 0.910144 or Φ((14 − 10)/ √ 5) = 0.963181 Both are far from the exact P-value. So the continuity correction is performed using the normal approximation in the
middle 13.5 of the interval (13, 14]. Therefore the asymptotic P-value with a continuity correction is

pvcc = Φ((13 + 0.5 − 10)/√5) = 0.941238

The obtained value is very similar to the exact P-value. In the case of the bilateral alternative H3 : p ≠ 0.5 the exact and asymptotic P-values are

pv = 2 min{FT(13), 1 − FT(13−)} = 2 min(0.942341; 0.131588) = 0.263176

and pva = 2 min(0.910144; 0.089856) = 0.179712, respectively. The asymptotic P-value with a continuity correction is

pvcc = 2 min(Φ((13 + 0.5 − 10)/√5), 1 − Φ((13 − 0.5 − 10)/√5)) = 2 min(0.941238, 0.131776) = 0.263452

Generalizing, suppose that the test statistic T takes integer values and under the zero hypothesis the asymptotic distribution of the statistic Z = (T − ET)/√VarT is standard normal. If the critical region is defined by the inequalities a) T ≥ c; b) T ≤ c; c) T ≤ c1 or T ≥ c2; and the observed value of the statistic T is t, then the asymptotic P-values with a continuity correction are

pvcc = 1 − Φ((t − 0.5 − ET)/√VarT)
pvcc = Φ((t + 0.5 − ET)/√VarT)
pvcc = 2 min{Φ((t + 0.5 − ET)/√VarT), 1 − Φ((t − 0.5 − ET)/√VarT)}   [1.4]

respectively.
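The numbers of Example 1.1 and the formulas [1.4] can be reproduced, for instance, as follows (a sketch of ours using SciPy's binomial and normal distributions; the variable names are not the book's).

# Sketch reproducing the numbers of Example 1.1 and the formulas [1.4]
# (T ~ B(20, 0.5), observed t = 13; ET = 10, VarT = 5 under H0).
from math import sqrt
from scipy.stats import binom, norm

n, p0, t = 20, 0.5, 13
ET, VarT = n * p0, n * p0 * (1 - p0)

# one-sided alternative p > 0.5 (critical region T >= c)
pv   = binom.sf(t - 1, n, p0)                            # exact: 0.131588
pva  = norm.sf((t - ET) / sqrt(VarT))                    # 0.089856
pvcc = norm.sf((t - 0.5 - ET) / sqrt(VarT))              # 0.131776

# one-sided alternative p < 0.5 (critical region T <= c)
pv_l   = binom.cdf(t, n, p0)                             # 0.942341
pva_l  = norm.cdf((t - ET) / sqrt(VarT))                 # 0.910144
pvcc_l = norm.cdf((t + 0.5 - ET) / sqrt(VarT))           # 0.941238

# two-sided alternative p != 0.5
pv_2   = 2 * min(pv_l, pv)                               # 0.263176
pva_2  = 2 * min(pva_l, pva)                             # 0.179712
pvcc_2 = 2 * min(pvcc_l, pvcc)                           # 0.263452

print(pv, pva, pvcc)
print(pv_l, pva_l, pvcc_l)
print(pv_2, pva_2, pvcc_2)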
1.6. Asymptotic relative efficiency

Suppose that under the zero hypothesis and under the alternative the distribution of the data belongs to a non-parametric family depending on a scalar parameter θ and possibly another parameter ϑ. Let us consider the hypothesis H0 : θ = θ0 with the one-sided alternatives H1 : θ > θ0 or H2 : θ < θ0 and the two-sided alternative H3 : θ ≠ θ0.

Example 1.2. Let X = (X1, ..., Xn)T and Y = (Y1, ..., Ym)T be two independent simple samples, Xi ∼ F(x) and Yj ∼ F(x − θ), where F(x) is an unknown absolutely continuous cdf (the parameter ϑ) and θ is a location parameter. Under the homogeneity hypothesis θ = 0, under the one-sided alternatives θ > 0 or θ < 0, and under the two-sided alternative θ ≠ 0. So the homogeneity hypothesis can be formulated in terms of the scalar parameter: H0 : θ = 0. The possible alternatives are H1 : θ > 0, H2 : θ < 0, H3 : θ ≠ 0.

Let us consider the one-sided alternative H1. Fix α ∈ (0, 1). Suppose that the hypothesis is rejected if Tn > cn,α, where n is the sample size and Tn is the test statistic. Denote by βn(θ) = Pθ{Tn > cn,α} the power function of the test.

Most of the tests are consistent, so the power of such tests is close to unity if the sample size is large. So the limit of the power under fixed alternatives is not suitable for comparing the performance of different tests.
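This point can be seen numerically. The following sketch (our illustration; the two-sample Student's test and the Wilcoxon rank sum test, treated later in the book, are used only as convenient examples, and all numeric settings are assumptions) estimates the powers of two consistent tests under a fixed location alternative: both powers approach 1 as n grows, so a fixed alternative does not separate the tests.

# Illustration: under a fixed location alternative the powers of two different
# consistent tests both approach 1 as the sample sizes grow.
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(1)
alpha, shift, n_rep = 0.05, 0.5, 1000

def powers(n):
    rej_t = rej_w = 0
    for _ in range(n_rep):
        x = rng.normal(0.0, 1.0, n)
        y = rng.normal(shift, 1.0, n)          # fixed alternative theta = 0.5
        rej_t += ttest_ind(x, y).pvalue < alpha
        rej_w += mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha
    return rej_t / n_rep, rej_w / n_rep

for n in (20, 50, 100, 200):
    pt, pw = powers(n)
    print(f"n={n:4d}  power(Student)={pt:.2f}  power(Wilcoxon)={pw:.2f}")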
To compare the tests, the behavior of the powers of these tests under a sequence of approaching alternatives

Hn : θ = θn = θ0 + h/n^δ, δ > 0, h > 0

may be compared. Suppose that the same sequence of approaching alternatives is written in two ways

θm = θ0 + h1/(n1m)^δ = θ0 + h2/(n2m)^δ

where nim → ∞ as m → ∞, and

lim(m→∞) βn1m(θm) = lim(m→∞) βn2m(θm)

Then the limit (if it exists and does not depend on the choice of θm)

e(T1n, T2n) = lim(m→∞) n2m/n1m

is called the asymptotic relative efficiency (ARE) [PIT 48] of the first test with respect to the second test. The ARE is the inverse ratio of the sample sizes necessary to obtain the same power for two tests with the same asymptotic significance level, while simultaneously the sample sizes approach infinity and the sequence of alternatives approaches θ0. Under regularity conditions, the ARE has a simple expression.

Regularity assumptions:

1) Pθ0{Tin ≥ cn,α} → α.

2) In the neighborhood of θ0 there exist µin(θ) = EθTin and σin(θ) = VarθTin, and the function µin(θ) is infinitely differentiable at the point θ0; furthermore, µ̇in(θ0) > 0, and the higher order derivatives are equal to 0; i = 1, 2.

3) There exist

lim(n→∞) µin(θ) = µi(θ), lim(n→∞) n^δ σin(θ) = σi(θ), µi(θ0)/σi(θ0) > 0

where δ > 0.

4) For any h ≥ 0

µ̇in(θn) → µ̇i(θ0), σin(θn) → σi(θ0), as n → ∞

5) The test statistics are asymptotically normal:

Pθn{(Tin − µin(θn))/σin(θn) ≤ z} → Φ(z)

Theorem 1.2. If the regularity assumptions are satisfied then the asymptotic relative efficiency can be written in the form

e(T1n, T2n) = [(µ̇1(θ0)/σ1(θ0)) / (µ̇2(θ0)/σ2(θ0))]^(1/δ)   [1.5]

Proof. First let us consider one statistic and skip the index i. Let us find lim(n→∞) βn(θn). By assumption 1

Pθ0{Tn > cn,α} = Pθ0{(Tn − µn(θ0))/σn(θ0) > (cn,α − µn(θ0))/σn(θ0)} → α

so

zn,α = (cn,α − µn(θ0))/σn(θ0) → zα
  • 36. 16 Non-parametric Tests for Complete Data By assumptions 2–4 µn(θn) − µn(θ0) σn(θ0) = µ̇n(θ0)hn−δ + o(1) n−δσ(θ0) + o(1) → µ̇(θ0) σ(θ0) h So using assumption 5 we have βn(θn) = Pθn {Tn cn,α} = Pθn { Tn − µn(θn) σn(θn) cn,α − µn(θn) σn(θn) } = Pθn { Tn − µn(θn) σn(θn) zn,α σn(θ0) σn(θn) − µn(θn) − µn(θ0) σn(θ0) σn(θ0) σn(θn) } → 1 − Φ zα − h µ̇(θ0) σ(θ0) Let T1n and T2n be two test statistics verifying the assumptions of the theorem and let θm = θ0 + h1 nδ 1m = θ0 + h2 nδ 2m be a sequence of approaching alternatives. The last equalities imply h2 h1 = ( n2m n1m )δ We have proved that βnim (θm) → 1 − Φ zα − hi µ̇i(θ0) σi(θ0) Choose n1m and n2m to give the same limit powers. Then n2m n1m = ( h2 h1 )1/δ = µ̇1(θ0)/σ1(θ0) µ̇2(θ0)/σ2(θ0) 1/δ △
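The role of the approaching alternatives can also be illustrated by simulation (our sketch, not the authors'): with θn = h/√n, i.e. δ = 1/2, the estimated powers stabilize strictly between α and 1, so the two tests can be compared. The normal data, the value h = 3 and the choice of the two-sample Student's and Wilcoxon tests are assumptions made only for this illustration.

# Illustration: under approaching alternatives theta_n = h / sqrt(n) the
# estimated powers of two tests stabilize away from alpha and from 1,
# which is the setting in which the ARE of Theorem 1.2 is defined.
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(2)
alpha, h, n_rep = 0.05, 3.0, 1000

def local_powers(n):
    theta_n = h / np.sqrt(n)
    rej_t = rej_w = 0
    for _ in range(n_rep):
        x = rng.normal(0.0, 1.0, n)
        y = rng.normal(theta_n, 1.0, n)
        rej_t += ttest_ind(x, y).pvalue < alpha
        rej_w += mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha
    return rej_t / n_rep, rej_w / n_rep

for n in (25, 100, 400):
    pt, pw = local_powers(n)
    print(f"n={n:4d}  power(Student)={pt:.2f}  power(Wilcoxon)={pw:.2f}")

For normal data the Wilcoxon rank sum test comes out slightly less powerful than Student's test, in line with the classical ARE value 3/π ≈ 0.955 for this pair of tests; the precise treatment of that ARE is the subject of section 4.5.3.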
  • 37. Chapter 2 Chi-squared Tests 2.1. Introduction Chi-squared tests are used when data are classified into several groups and only numbers of objects belonging to concrete groups are used for test construction. The vector of such random numbers has a multinomial distribution and depends only on the finite number of parameters. So chi- squared tests, being based on this vector, are parametric but are also used for non-parametric hypotheses testing, so we include them in this book. When the initial data are replaced by grouped data, some information is lost, so this method is used when more powerful tests using all the data are not available. 2.2. Pearson’s goodness-of-fit test: simple hypothesis Suppose that X = (X1, ..., Xn)T is a simple sample of a random variable X having the c.d.f. F from a non-parametric class F.
  • 38. 18 Non-parametric Tests for Complete Data Simple hypothesis H0 : F(x) = F0(x), ∀x ∈ R [2.1] where F0 is completely specified (known) cdf from the family F. The hypotheses H0 : X ∼ U(0, 1), H0 : X ∼ B(1, 0.5), H0 : X ∼ N(0, 1) are examples of simple non-parametric hypotheses. For example, such a hypothesis is verified if we want to know whether realizations generated by a computer are obtained from the uniform U(0, 1), Poisson P(2), normal N(0, 1) or other completely specified distribution. The data are grouped in the following way: the abscissas axis is divided into a finite number of intervals using the points −∞ = a0 a1 ... ak = ∞. Denote by Uj the number of Xi falling in the interval (aj−1, aj] Uj = n X i=1 1(aj−1,aj](Xi), j = 1, 2...., k So, instead of the fully informative data X, we use the grouped data U = (U1, . . . , Uk)T We can also say that the statistic U is obtained using a special data censoring mechanism, known as the mechanism of grouping data. The random vector U has the multinomial distribution Pk(n, π): for 0 ≤ mi ≤ n, P i mi = n P{U1 = m1, ..., Uk = mk} = n! m1!...mk! πm1 1 ...πmk k [2.2] where πi = P{X ∈ (ai−1, ai]} = F(ai) − F(ai−1) is the probability that the random variable X takes a value in the interval (ai−1, ai], π = (π1, . . . , πk)T , π1 + ... + πk = 1.
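The grouping mechanism itself is a one-line computation; the following sketch (ours) shows one way to obtain the grouped data U from a simple sample with NumPy. Note that np.histogram uses half-open intervals, which is immaterial for continuous data.

# Sketch of the grouping mechanism of section 2.2: the grouped data
# U = (U_1, ..., U_k) are the counts of sample elements in the intervals
# determined by the grouping points a_0 < a_1 < ... < a_k.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, size=80)            # a simple sample X_1, ..., X_n
a = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])  # grouping points

U, _ = np.histogram(x, bins=a)                # U_j = number of X_i in the j-th interval
pi0 = np.diff(a)                              # hypothetical F_0(a_j) - F_0(a_{j-1})
print(U, U.sum(), pi0)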
  • 39. Chi-squared Tests 19 Under the hypothesis H0, the following hypothesis also holds. Hypothesis on the values of multinomial distribution parameters H′ 0 : πj = πj0, j = 1, 2...., k [2.3] where πj0 = F0(aj) − F0(aj−1). Under the hypothesis H′ 0 U ∼ Pk(n, π0) where π0 = (π10, . . . , πk0)T , π10 + ... + πk0 = 1. If the hypothesis H′ 0 is rejected then it is natural also to reject the narrower hypothesis H0. Pearson’s chi-squared test for the hypothesis H′ 0 is based on the differences between the maximum likelihood estimators π̂j of the probabilities πj obtained from the grouped data U and the hypothetical values πj0 of these probabilities. The relation Pk j=1 πj = 1 implies that the multinomial model Pk(1, π) depends on (k − 1)-dimensional parameters (π1, ..., πk−1)T . From [2.2] the likelihood function of the random vector U is L(π1, ..., πk−1) = n! U1!...Uk! πU1 1 ...πUk k [2.4] The loglikelihood function is ℓ(π1, ..., πk−1) = k−1 X j=1 Uj ln πj + Uk ln(1 − k−1 X j=1 πj + C)
  • 40. 20 Non-parametric Tests for Complete Data so ℓ̇j = Uj πj − Uk 1 − Pk−1 j=1 πj = Uj πj − Uk πk which implies that for all j, l = 1, . . . k Ujπl = Ulπj Summing both sides with respect to l and using the relations Pk j=1 πj = 1 and Pk j=1 Uj = n, we obtain Uj = nπj, so the maximum likelihood estimators of the parameters πi are π̂ = (π̂1, ..., π̂k)T , π̂i = Ui/n, i = 1, ..., k The famous Pearson’s statistic has the form X2 n = k X i=1 ( √ n(π̂i − πi0))2 πi0 = k X i=1 (Ui − nπi0)2 nπi0 = 1 n k X i=1 U2 i πi0 − n. [2.5] Under the hypothesis H′ 0, the realizations of the differences π̂i − πi0 are scattered around zero. If the hypothesis is not true then at least one number i exists such that the realizations of the differences π̂i − πi0 are scattered around some positive or negative value. In such a case the statistic X2 n has a tendency to take greater values than under the zero hypothesis. Pearson’s test is asymptotic, i.e. it uses an asymptotic distribution of the test statistic X2 n, which is chi square, which follows from the following theorem. Theorem 2.1. If 0 πi0 1, π10 + · · · + πk0 = 1 then under the hypothesis H′ 0 X2 n d → χ2 k−1 as n → ∞ Proof. Under the hypothesis H′ 0, the random vector U = (U1, ..., Uk)T is the sum of iid random vectors Xi having the mean π0 and the covariance matrix D = [djj′ ]k×k; djj = πj0(1 − πj0), djj′ = −πj0πj′0, j 6= j′.
  • 41. Chi-squared Tests 21 If 0 πi0 1, π10 + · · · + πk0 = 1 then the central limit theorem holds √ n(π̂ − π0) = √ n(π̂1 − π10, . . . , π̂k − πk0)T d → Y ∼ Nk(0, D) [2.6] as n → ∞. The matrix D can be written in the form D = p0 − p0pT 0 where p0 is the diagonal matrix with elements π10, ..., πk0 on the main diagonal. Set Zn = √ np −1/2 0 (π̂ − π0) = √ n(π̂1 − π10) √ π10 , ..., √ n(π̂k − πk0) √ πk0 T The result [2.6] implies that Zn d → Z ∼ Nk(0, Σ) where Σ = p −1/2 0 Dp −1/2 0 = Ek − qqT where q = ( √ π10, ..., √ πk0)T , qT q = 1, and Ek is a k × k unit matrix. By the well known theorem on the distribution of quadratic forms [RAO 02] the limit distribution of the statistic ZT n Σ−Zn is chi-squared with Tr(Σ−Σ) degrees of freedom, where Σ− is the generalized inverse of Σ. Note that Σ− = Ek + qqT , Tr(Σ− Σ) = Tr(Ek − qqT ) = k − 1. So, using the equality ZT n q = 0, we have X2 n = ZT n Σ− Zn = ||Zn||2 d → ||Z||2 ∼ χ2 (k − 1) [2.7] △
The theorem implies the following.

Pearson's chi-squared test: the hypothesis H′0 is rejected with an asymptotic significance level α if

X²n > χ²α(k − 1)   [2.8]

The hypothesis H′0 can also be verified using the equivalent likelihood ratio test, based on the statistic

Λn = [sup over π = π0 of L(π)] / [sup over π of L(π)] = L(π0)/L(π̂) = n^n Πi (πi0/Ui)^Ui

Under the hypothesis H′0 (see Appendix A, comment A3)

Rn = −2 ln Λn = 2 Σi Ui ln(Ui/(nπi0)) →d V ∼ χ²(k − 1), as n → ∞   [2.9]

So the statistics Rn and X²n are asymptotically equivalent.

Likelihood ratio test: the hypothesis H′0 is rejected with an asymptotic significance level α if

Rn > χ²α(k − 1)   [2.10]

Comment 2.1. Both of the given tests are not exact; they are used when the sample size n is large. The accuracy of the conclusions depends on the accuracy of the approximations [2.7, 2.9]. If the number of intervals is too large then the Ui tend to take the values 0 or 1 for all i and the approximation is poor. So the number of intervals k must not be too large. Rule of thumb: choose the grouping intervals to obtain nπi0 ≥ 5.
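Both statistics are straightforward to compute from the grouped data. The sketch below (ours, with SciPy assumed available) evaluates X²n of [2.5] and Rn of [2.9] together with their asymptotic P-values, and is checked against the data of Example 2.2 given later in this section.

# Sketch: Pearson's X_n^2 [2.5] and the likelihood ratio statistic R_n [2.9]
# computed from grouped counts U and hypothetical probabilities pi0, with
# asymptotic P-values from the chi-squared distribution with k - 1 degrees
# of freedom.
import numpy as np
from scipy.stats import chi2

def pearson_lr(U, pi0):
    U, pi0 = np.asarray(U, float), np.asarray(pi0, float)
    n, k = U.sum(), len(U)
    X2 = ((U - n * pi0) ** 2 / (n * pi0)).sum()
    nz = U > 0                       # terms with U_i = 0 contribute 0 to R_n
    Rn = 2 * (U[nz] * np.log(U[nz] / (n * pi0[nz]))).sum()
    df = k - 1
    return X2, Rn, chi2.sf(X2, df), chi2.sf(Rn, df)

# data of Example 2.2 below: U = (115, 165, 20), pi0 = (0.35, 0.60, 0.05)
print(pearson_lr([115, 165, 20], [0.35, 0.60, 0.05]))
# expected approximately (3.869, 3.717, 0.1445, 0.1559)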
  • 43. Chi-squared Tests 23 Comment 2.2. If the accuracy of the approximation of the distribution of the statistic X2 n (or Rn) is suspected not to be sufficient then the P-value can be computed by simulation. Suppose that the realization of the statistic X2 n is x2 n. N values of the random vector U ∼ P(n, π0) are simulated and for each value of U the corresponding value of the statistics X2 n can be computed. Suppose that M is the number of values greater then x2 n. Then the P-value is approximated by M/N. The hypothesis H′ 0 is rejected with an approximated significance level α if M/N α. The accuracy of this test depends on the number of simulations N. Comment 2.3. If the hypothesis H′ 0 is rejected then the hypothesis H0 is also rejected because it is narrower. If the data do not contradict the hypothesis H′ 0 then we have no reason to reject the hypothesis H0. Comment 2.4. If the distribution of the random variable X is discreet and concentrated at the points x1, ..., xk then grouping is not needed in such a case. Ui is the observed number of the value xi. Comment 2.5. As was noted, the hypotheses H0 and H′ 0 are not equivalent in general. Hypothesis H′ 0 states that the increment of the cumulative distribution function in the j-th interval is πj0 but the behavior of the cumulative distribution function inside the interval is not specified. If n is large then the number of grouping intervals can be increased and thus the hypotheses become closer. Comment 2.6. If the hypothesis H′ 0 does not hold and U ∼ Pk(n, π) then the distributions of the statistics Rn and X2 n are approximately non-central chi-squared with k − 1 degrees of
freedom and the parameter of non-centrality

∆ = 2n Σj πj ln(πj/πj0) ≈ δ = n Σj (πj − πj0)²/πj0   [2.11]

Example 2.1. A random number generator generated n = 80 random numbers. The ordered values are given in the following table:

0.0100 0.0150 0.0155 0.0310 0.0419 0.0456 0.0880 0.1200 0.1229 0.1279
0.1444 0.1456 0.1621 0.1672 0.1809 0.1855 0.1882 0.1917 0.2277 0.2442
0.2456 0.2476 0.2538 0.2552 0.2681 0.3041 0.3128 0.3810 0.3832 0.3969
0.4050 0.4182 0.4259 0.4365 0.4378 0.4434 0.4482 0.4515 0.4628 0.4637
0.4668 0.4773 0.4799 0.5100 0.5309 0.5391 0.6033 0.6283 0.6468 0.6519
0.6686 0.6689 0.6865 0.6961 0.7058 0.7305 0.7337 0.7339 0.7440 0.7485
0.7516 0.7607 0.7679 0.7765 0.7846 0.8153 0.8445 0.8654 0.8700 0.8732
0.8847 0.8935 0.8987 0.9070 0.9284 0.9308 0.9464 0.9658 0.9728 0.9872

Verify the hypothesis H0 that a realization from the uniform distribution U(0, 1) was observed. Divide the region of possible values (0, 1) into k = 5 intervals of equal length: (0; 0.2), [0.2; 0.4), ..., [0.8; 1). We have: Uj = 18, 12, 16, 19, 15. Let us consider the wider hypothesis H′0 : πi = 0.2, i = 1, ..., 5. We obtain

X²n = (1/n) Σi U²i/πi0 − n = (1/80)·(18² + 12² + 16² + 19² + 15²)/0.2 − 80 = 1.875

The asymptotic P-value is pva = P{χ²(4) > 1.875} = 0.7587. The data do not contradict the hypothesis H′0. So we have no basis for rejecting the hypothesis H0. The likelihood ratio test gives the same result because the value of the statistic Rn is 1.93 and pva = P{χ²(4) > 1.93} = 0.7486.

Comment 2.7. Tests [2.8] and [2.10] are used not only for the simple hypothesis H0 but also directly for the hypothesis H′0 (see the following example).

Example 2.2. Over a long period it has been established that the proportions of the first and second quality units produced in a factory are 0.35 and 0.6, respectively, and the proportion of defective units is 0.05. In a quality inspection, 300 units were checked and 115, 165 and 20 units of the above-considered qualities were found. Did the quality of the product remain the same?

In this example U1 = 115, U2 = 165, U3 = 20, n = 300. The zero hypothesis is

H′0 : π1 = 0.35, π2 = 0.60, π3 = 0.05

The values of the statistics [2.9] and [2.5] are Rn = 3.717 and X²n = 3.869. The number of degrees of freedom is k − 1 = 3 − 1 = 2. The P-values corresponding to the likelihood ratio and Pearson's chi-squared tests are

pv1 = P{χ²(2) > 3.717} = 0.1559 and pv2 = P{χ²(2) > 3.869} = 0.1445

respectively. The data do not contradict the zero hypothesis.
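The simulation approach of Comment 2.2 can be tried on the data of Example 2.2 (our sketch; the number of replications N = 100 000 is an arbitrary choice): the simulated P-value should be close to the asymptotic value 0.1445.

# Sketch of the simulated P-value of Comment 2.2 for the data of Example 2.2:
# N multinomial samples are drawn under H'_0 and the fraction of simulated
# X_n^2 values not smaller than the observed one is reported.
import numpy as np

rng = np.random.default_rng(4)
U_obs = np.array([115, 165, 20])
pi0 = np.array([0.35, 0.60, 0.05])
n, N = U_obs.sum(), 100_000

def x2(U):
    return ((U - n * pi0) ** 2 / (n * pi0)).sum(axis=-1)

x2_obs = x2(U_obs)                       # 3.869
sims = rng.multinomial(n, pi0, size=N)   # U ~ P_3(n, pi0) under H'_0
pv_sim = (x2(sims) >= x2_obs).mean()
print(x2_obs, pv_sim)                    # pv_sim close to the asymptotic 0.1445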
  • 46. 26 Non-parametric Tests for Complete Data 2.3. Pearson’s goodness-of-fit test: composite hypothesis Suppose that X = (X1, ..., Xn)T is a simple sample obtained by observing a random variable X with a cdf from the family P = {F : F ∈ F}. Composite hypothesis H0 : F(x) ∈ F0 = {F0(x; θ), θ ∈ Θ} ⊂ F [2.12] meaning that the cdf F belongs to the cdf class F0 of form F0(x; θ); here θ = (θ1, ..., θs)T ∈ Θ ⊂ Rs is an unknown s-dimensional parameter and F0 is a specified cumulative distribution function. For example, the hypothesis may mean that the probability distribution of X belongs to the family of normal, exponential, Poisson, binomial or other distributions. As in the previous section, divide the abscissas axis into k s + 1 smaller intervals (ai−1, ai] and denote by Uj the number of observations belonging to the j-th interval, j = 1, 2...., k. The grouped sample U = (U1, ..., Uk)T has a k-dimensional multinomial distribution Pk(n, π); here π = (π1, ..., πk)T πi = P{X ∈ (ai − 1, ai]} = F(ai) − F(ai−1), F ∈ F If the hypothesis H0 is true then the wider hypothesis H′ 0 : π = π(θ), θ ∈ Θ also holds; here π(θ) = (π1(θ), ..., πk(θ))T , πi(θ) = F0(ai; θ) − F0(ai−1; θ) [2.13]
  • 47. Chi-squared Tests 27 The last hypothesis means that the parameters of the multinomial random vector U can be expressed as specified functions [2.13] of the parameters θ1, ..., θs; s + 1 k. The Pearson’s chi-squared statistic (see [2.5]) X2 n(θ) = k X i=1 (Ui − nπi(θ))2 nπi(θ) = 1 n k X i=1 U2 i πi(θ) − n [2.14] cannot be computed because the parameter θ is unknown. It is natural to replace the unknown parameters in the expression [2.14] by their estimators and to investigate the properties of the obtained statistic. If the maximum likelihood estimator of the parameter θ obtained from the initial non-grouped data is used then the limit distribution depends on the distribution F(x; θ) (so on the parameter θ). We shall see that if grouped data are used then estimators of the parameter θ can be found such that the limit distribution of the obtained statistic is chi-squared with k − s − 1 degrees of freedom, so does not depend on θ. Let us consider several examples of such estimators. 1) Under the hypothesis H0, the likelihood function from the data U and its logarithm are L̃(θ) = n! U1! . . . Uk! k Y i=1 πUi i (θ), ℓ̃(θ) = k X i=1 Ui ln πi(θ) + C [2.15] so the maximum likelihood estimator θ∗ n of the parameter θ from the grouped data verifies the system of equations ∂ℓ̃(θ) ∂θj = k X i=1 Ui πi(θ) ∂πi(θ) ∂θj = 0, j = 1, 2...., s [2.16]
  • 48. 28 Non-parametric Tests for Complete Data Let us define the Pearson statistic obtained from [2.14]. Replacing θ by θ∗ n in [2.14] we obtain the following statistic: X2 n(θ∗ n) = k X i=1 (Ui − nπi(θ∗ n))2 nπi(θ∗ n) [2.17] 2) Another estimator θ̃n of the parameter θ, called the minimum chi-squared estimator, is obtained by minimizing [2.14] with respect to θ X2 n(θ̃n) = inf θ∈Θ X2 n(θ) = inf θ∈Θ k X i=1 (Ui − nπi(θ))2 nπi(θ) [2.18] 3) To find the estimator θ̃n, complicated systems of equations must be solved, so statistic [2.14] is often modified by replacing the denominator by Ui. This method is called the modified chi-squared minimum method. The estimator θ̄n obtained by this method is found from the condition X2 n(θ̄n) = inf θ∈Θ k X i=1 (Ui − nπi(θ))2 Ui [2.19] Besides these three chi-squared type statistics [2.17–2.19] we may use the following. 4) The likelihood ratio statistic obtained from the grouped data Rn = −2 ln supθ∈Θ L̃(θ) supπ L(π) = −2 ln supθ∈Θ Qk i=1 πUi i (θ) supπ Qk i=1 πUi i = 2 k X i=1 Ui ln Ui nπi(θ∗ n)
  • 49. Chi-squared Tests 29 This statistic can be written in the form Rn = Rn(θ∗ n) = inf θ∈Θ Rn(θ), Rn(θ) = 2 k X i=1 Ui ln Ui nπi(θ) [2.20] We shall show that the statistics X2(θ̃n), X2(θ̄n), X2(θ∗ n) and Rn(θ∗ n) are asymptotically equivalent as n → ∞. Suppose that {Yn} is any sequence of random variables. We write Yn = oP (1), if Yn P → 0, and we write Yn = OP (1), if ∀ ε 0 ∃ c 0 : sup n P{|Yn| c} ε Prokhorov’s theorem [VAN 00] implies that if ∃Y : Yn d → Y , n → ∞, then Yn = OP (1). Conditions A 1) For all i = 1, . . . , k and all θ ∈ Θ 0 πi(θ) 1, π1(θ) + · · · + πk(θ) = 1 2) The functions πi(θ) have continuous first- and second- order partial derivatives on the set Θ. 3) The rank of the matrix B = ∂πi(θ) ∂θj k×s , i = 1, ..., k, j = 1, ..., s, is s. Lemma 2.1. Suppose that the hypothesis H0 holds. Then under conditions A the estimators πi(θ̃n), πi(θ̄n) and πi(θ∗ n) are √ n- consistent, i.e. √ n(π̃in − πi) = OP (1).
  • 50. 30 Non-parametric Tests for Complete Data Proof. For brevity we shall not write the argument of the functions πi(θ). Let us consider the estimator π̃in. Since 0 ≤ π̃in ≤ 1, we have π̃in = OP (1). Since Ui/n P → πi,, the inequalities (we use the definition of the chi-squared minimum estimator) k X i=1 (Ui/n − π̃in)2 ≤ k X i=1 (Ui/n − π̃in)2 π̃in ≤ k X i=1 (Ui/n − πi)2 πi = oP (1) imply that for all i: Ui/n − π̃in = oP (1) and π̃in − πi = (π̃in − Ui/n) + (Ui/n − πi) = oP (1) Since √ n(π̂i − πi) d → Zi ∼ N(0, πi(1 − πi)), we have (Ui − nπi)/ √ n = √ n(π̂i − πi) = OP (1). So from the inequality k X i=1 (Ui − nπ̃in)2 n ≤ k X i=1 (Ui − nπ̃in)2 nπ̃in ≤ k X i=1 (Ui − nπi)2 nπi = OP (1) we have that for all i: (Ui − nπ̃in)/ √ n = OP (1), and √ n(π̃in − πi) = nπ̃in − nπi √ n = nπ̃in − Ui √ n + Ui − nπi √ n = OP (1) Analogously we obtain k X i=1 (Ui − nπ̄in)2 Ui ≤ k X i=1 (Ui − nπi)2 Ui ≤ k X i=1 ( √ n(Ui/n − πi))2 Ui/n = OP (1) and √ n(π̄in − πi) = nπ̄in − Ui √ n + Ui − nπi √ n = OP (1)
  • 51. Chi-squared Tests 31 Let us consider the estimator π∗ in. θ∗ n is the ML estimator so under the conditions of the theorem the sequence √ n(θ∗ n − θ) has the limit normal distribution. By the delta method [VAN 00] √ n(π∗ in − πi) = √ n(πi(θ∗ n) − πi(θ)) = π̇T i (θ) √ n(θ∗ n − θ) + oP (1) = OP (1) △ Theorem 2.2. Under conditions A the statistics X2(θ̃n), X2(θ̄n), X2(θ∗ n) and Rn(θ∗ n) are asymptotically equivalent as n → ∞: X2 (θ̃n) = X2 (θ̄n) + oP (1) = X2 (θ∗ n) + oP (1) = Rn(θ∗ n) + oP (1) The distribution of each statistic converges to the chi-squared distribution with k − s − 1 degrees of freedom. Proof. Suppose that θ̂ is an estimator of θ such that π̂n = (π̂1n, . . . , π̂kn)T = (π1(θ̂), . . . , πk(θ̂))T is the √ n-consistent estimator of the parameter π. From the definition of √ n-consistency and the convergence Ui/n P → πi we have that for all i π̂in − Ui n = oP (1), √ n π̂in − Ui n = OP (1), Ui n = πi + oP (1) Using the last inequalities, the Taylor expansion ln(1 + x) = x − x2 /2 + o(x2 ), x → 0 and the equality U1 + ... + Uk = n, we obtain 1 2 Rn(θ̂n) = k X i=1 Ui ln Ui nπ̂i = − k X i=1 Ui ln 1 + nπ̂i Ui − 1 =
  • 52. 32 Non-parametric Tests for Complete Data − k X i=1 Ui ln 1 + π̂i − Ui/n Ui/n = − k X i=1 Ui π̂i − Ui/n Ui/n + 1 2 k X i=1 Ui π̂i − Ui/n Ui/n 2 + k X i=1 UioP π̂i − Ui/n Ui/n 2 ! = −n k X i=1 π̂i + k X i=1 Ui + 1 2 k X i=1 (Ui − nπ̂i)2 Ui + oP (1) = 1 2 k X i=1 (Ui − nπ̂i)2 Ui + oP (1) = 1 2 k X i=1 (Ui − nπ̂i)2 nπ̂i − 1 2 k X i=1 (Ui − nπ̂i)3 Uinπ̂i + oP (1) = 1 2 k X i=1 (Ui − nπ̂i)2 nπ̂i + oP (1) = 1 2 X2 n(θ̂n) + oP (1) Taking θ̂n = θ∗ n and θ̂n = θ̃n, the last equalities imply X2 n(θ∗ n) = Rn(θ∗ n) + oP (1), X2 n(θ̃n) = Rn(θ̃n) + oP (1) From the definition of θ̃n we obtain X2 n(θ̃n) ≤ X2 n(θ∗ n), and from definition [2.20] of Rn we obtain Rn(θ∗ n) ≤ Rn(θ̃n). So X2 n(θ̃n) ≤ X2 n(θ∗ n) = Rn(θ∗ n) + oP (1) ≤ Rn(θ̃n) + oP (1) = X2 n(θ̃n) + oP (1) These inequalities imply X2 n(θ̃n) = Rn(θ∗ n) + oP (1) Analogously X2 n(θ̄n) = Rn(θ∗ n) + oP (1) The (k − 1)-dimensional vector (π1, . . . , πk−1)T is a function of the s-dimensional parameter θ, so the limit distribution of
the likelihood ratio statistic Rn(θ∗n) is chi-squared with k − s − 1 degrees of freedom (see Appendix A, comment A.4). The other statistics considered have the same limit distribution. △

The theorem implies the following asymptotic tests.

Chi-squared test: the hypothesis H′0 is rejected with an asymptotic significance level α if

X²(θ̂n) > χ²α(k − 1 − s)   [2.21]

here θ̂n is any of the estimators θ̃n, θ∗n, θ̄n.

Likelihood ratio test: the hypothesis H′0 is rejected with an asymptotic significance level α if

Rn(θ∗n) > χ²α(k − 1 − s)   [2.22]

If the hypothesis H′0 is rejected then the hypothesis H0 is also rejected.

Example 2.3. In reliability testing the numbers of failed units Ui in the time intervals (ai−1, ai], i = 1, ..., 11, were recorded. The data are given in the following table.

i   (ai−1, ai]     Ui
1   (0, 100]        8
2   (100, 200]     12
3   (200, 300]     19
4   (300, 400]     23
5   (400, 500]     29
6   (500, 600]     30
7   (600, 700]     25
8   (700, 800]     18
9   (800, 900]     15
10  (900, 1000]    14
11  (1000, ∞)      18
  • 54. 34 Non-parametric Tests for Complete Data Verify the hypothesis stating that the failure times have the Weibull distribution. By [2.20], the estimator (θ∗ n, ν∗ n) minimizes the function Rn(θ, ν) = 2 k X i=1 Ui ln Ui nπi(θ, ν) , πi(θ, ν) = e−(ai−1/θ)ν − e−(ai/θ)ν By differentiating this function with respect to the parameters and equating the partial derivatives to zero, the following system of equations is obtained for the estimators θ∗ n and ν∗ n k X i=1 Ui aν i−1e−(ai−1/θ)ν − aν i e−(ai/θ)ν e−(ai−1/θ)ν − e−(ai/θ)ν = 0 k X i=1 Ui aν i−1 e−(ai−1/θ)ν ln ai−1 − aν i e−(ai/θ)ν ln ai e−(ai−1/θ)ν − e−(ai/θ)ν = 0 By solving this system of equations or directly minimizing the function Rn(θ, ν), we obtain the estimators θ∗ = 649.516 and ν∗ = 2.004 of the parameters θ and ν. Minimizing the right- hand sides of [2.18] and [2.19] we obtain the values of the estimators θ̃ = 647.380, ν̃ = 1.979 and θ̄ = 653.675, ν̄ = 2.052. Using the obtained estimators, we have the following values of the test statistics Rn(θ∗ , ν∗ ) = 4.047, X2 n(θ∗ , ν∗ ) = 4.377, X2 n(θ̃, ν̃) = 4.324 and X2 n(θ̄, ν̄) = 3.479 The number of degrees of freedom is k − s − 1 = 8. Since the P-values – 0.853; 0.822; 0.827; 0.901 – are not small, there is no reason to reject the zero hypothesis. 2.4. Modified chi-squared test for composite hypotheses The classical Pearson’s chi-squared test has drawbacks, especially in the case of continuous distributions.
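Before turning to the modified tests, the grouped-data estimation of Example 2.3 can be sketched numerically (our illustration, not the authors' code): the statistic Rn(θ, ν) of [2.20] is minimized with a general-purpose optimizer, which should approximately reproduce the estimates θ∗ ≈ 649.5, ν∗ ≈ 2.004 and the value Rn ≈ 4.05 reported above.

# Sketch of the grouped-data estimation in Example 2.3: R_n(theta, nu) is
# minimized numerically for the Weibull model, with
# pi_i(theta, nu) = exp(-(a_{i-1}/theta)^nu) - exp(-(a_i/theta)^nu).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

a = np.array([0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, np.inf])
U = np.array([8, 12, 19, 23, 29, 30, 25, 18, 15, 14, 18])
n, k, s = U.sum(), len(U), 2

def probs(theta, nu):
    surv = np.exp(-(a / theta) ** nu)   # Weibull survival function at the cut points
    surv[-1] = 0.0                      # a_k is infinite
    return surv[:-1] - surv[1:]

def Rn(par):
    pi = probs(*par)
    return 2 * np.sum(U * np.log(U / (n * pi)))

res = minimize(Rn, x0=[600.0, 2.0], method="Nelder-Mead")
theta_star, nu_star = res.x
print(theta_star, nu_star, res.fun)             # roughly 649.5, 2.004, 4.05
print("P-value:", chi2.sf(res.fun, k - s - 1))  # df = 11 - 2 - 1 = 8, about 0.85

Minimizing X²n of [2.14] or the modified statistic [2.19] instead of Rn gives, in the same way, the chi-squared minimum and modified chi-squared minimum estimators mentioned in section 2.3.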
  • 55. Another Random Document on Scribd Without Any Related Topics
  • 56. lies is merely selling, thus not requiring proof of guilty knowledge. It has been contended that the requirement of guilty knowledge in 8 Geo. II. c. 13, should be read into 17 Geo. III. c. 57, and the action of damages provided by the latter statute applied to guilty selling only. This contention has been rejected as erroneous.[848] Limitation of Action.—Actions for penalties under the Acts must be brought within three months of the discovery of the offence sued on[849] and within six months after the committal of such offence. [850] There is no express limitation in the Acts in respect of actions for damages under 17 Geo. III. c. 57, and therefore such action will not be barred for six years.[851] Costs.—The litigant if successful in an action for infringement is to recover full costs.[852] This proviso, however, has been construed to mean nothing more than ordinary costs taxed as between party and party.[853] Probably, however, they may be claimed as of right and are not in the discretion of the Court under Rules of the Supreme Court, o. 65, r. 1.[854] Copying for Private Use will probably be actionable under 17 Geo. III. c. 57;[855] but no penalties could be recovered under 8 Geo. II. c. 13, as under that Act the making must be a making for sale. What is a Piratical Copy.—The right under the Acts is the sole right and liberty of printing and reprinting the same,[856] and the prohibition is against engraving, etching, or working in mezzotinto or chiaro oscuro or otherwise, or in any manner copying, in the whole or in part, by varying, adding to or diminishing from, the main design.[857] [157]
  • 57. The taking of a material part is a piracy;[858] the copy which contains a material part of a copyright engraving is a piratical copy, and it is an offence to import or sell it.[859] The copyright in an engraving may be infringed otherwise than by another engraving. Thus a photograph of an engraving is an infringement of the copyright in it.[860] It is doubtful how far the Engraving Acts protect the design in an engraving. It is clear that when an engraving is taken from a work of art previously existing, such as a pen and ink drawing or a painting, the engraving is only copyright so far as the work of the engraver[861] is concerned; that is to say, apart from the copyright in the drawing or painting, which may or may not be his, the engraver acquires no monopoly[862] of the right to engrave the picture; the fact of his being the first engraver does not prevent others from doing the same, they can only be prevented from copying from his engraving the peculiar execution of the design. In Dicks v. Brooks[863] a printed pattern for Berlin wool work was taken from an engraving of the well-known picture The Huguenot, by Millais. The owner of the copyright in the engraving sued for infringement. It was held that the printed pattern constituted no infringement of his engraving; it contained no reproduction of that which was the engraver's meritorious work in the print. But if the whole invention and design of the engraving is the engraver's own do the Engraving Acts protect the engraver in such design and invention? There is no authority where the point has been expressly considered and decided. It is suggested that the Engraving Acts protect that part of an engraving only which is the result of the engraver's peculiar art; for the rest, for the design, for the invention, for the grouping of the figures, protection can only be obtained under the Act protecting drawings, or (in the case of maps) under the Literary Copyright Act, or at common law. In Roworth v. Wilkes[864] Lord Ellenborough considered a copying of the design was an infringement of copyright [158]
  • 58. under the Engraving Acts. The action was in respect of an alleged infringement of certain plates in a treatise on fencing. These plates had been copied in so far as the position of the figures went, but they were represented as differently dressed. His Lordship, in directing the jury, said: As to the prints, the question will be whether the defendant has copied the main design ... it is still to be considered whether there be such a similitude and conformity between the prints that the person who executed the one set must have used the others as a model. In that case he is a copyist of the main design. But if the similitude can be supposed to have arisen from accident, or necessarily from the nature of the subject, or from the artist having sketched designs merely from reading the letterpress of the plaintiffs work, the defendant is not answerable. It is remarkable, however, that he has given no evidence to explain the similitude or to repel the presumption which that necessarily causes. In Martin v. Wright[865] it was held that when an artist had from sketches of his own produced an engraving, and the defendant had it copied on canvas in colours on a very large scale, with dioramic effect, and publicly exhibited it, such a copying and exhibiting was no infringement of the engraving. The ground of this decision seems to have been partly that the merit of the new work had absorbed the merit of the old. Thus Shadwell, V. C., prefaces his judgment with the remark that any person may copy and publish the whole of a literary composition provided he writes notes upon it, so as to present it to the public connected with matter of his own.[866] Another ground of the decision seems to have been that the diorama was produced for purposes of exhibition and not of sale. The real point, whether the Acts protected more than that which was peculiar to the engraver's art, does not appear to have been considered either in the argument or judgment. In Dicks v. Brooks[867] James, L. J., appears to have been of opinion that 8 Geo. II. c. 3, in [159]
These words were intended to give protection for the genius exhibited in the invention of the design, and the protection was commensurate with the invention and design.[868] Bramwell, L. J., however, seems inclined towards the opposite view. He says: "I do not say that if this were an ordinary engraving with no picture, a lithograph taken from it would not be a copy. I think that a photograph taken from it would be a copy. I do not say that if this were an original engraving with no picture, and a copy were made of it and afterwards coloured, there might not be some ground for saying that there was a piracy of the art and skill of the engraver. I should have very great misgiving about it, because I doubt whether the statutes were not intended to protect the artist's skill as an engraver only, and not as a draftsman."[869]

It is no defence to an action for infringement that the work has been extensively added to or improved.[870] Striking prints from the proprietor's own plate has been held not to be an infringement, although it was clearly an unauthorised act and a breach of contract.[871] Thus a printer who had plates in his possession would not infringe the copyright and be liable to penalties by striking copies for his own use, but he would be liable in damages for breach of contract.

Licence a Defence.—A licence, in order to be a defence, must be in writing signed by the proprietor in the presence of two or more credible witnesses,[872] but a licensee who is also a purchaser of any plates for printing may presumably, without any document in writing, print from the said plates without incurring penalties[873] under 8 Geo. II. c. 13 or 7 Geo. III. c. 38, but quære whether such purchaser would not technically be liable to damages under 17 Geo. III. c. 57.
A bare licensee, although a purchaser of plates, could not authorise third persons to print from the plates except as his agent and on his behalf.[874]
CHAPTER VII

COPYRIGHT IN SCULPTURE

Section I.—What Works are Protected.

The following works are protected under the Sculptures Act:

1. Every original sculpture:[875]
2. First published within the British dominions:[876]
3. [The author of which is a British subject or resident within the British dominions]:[877]
4. Which bears the proprietor's name and the date [of first publication] thereon:[878]
5. And is innocent.[879]

Protection endures for fourteen years from publication, and another term of fourteen years if the author is then alive and retains the copyright.[880] Protection is probably limited by implication to the United Kingdom.[881]

What is an Original Sculpture.—The work protected is any new and original sculpture, or model, or copy, or cast of the human figure or human figures, or of any bust or busts or of any part or parts of the human figure clothed in drapery or otherwise, or of any subject being matter of invention in sculpture, or of any alto or basso-relievo representing any of the matters or things hereinbefore mentioned, or any cast from nature of the human figure or of any part or parts of the human figure, or of any cast from nature of any animal or of any part or parts of any animal, or of any such subject containing any of the matters or things hereinbefore mentioned, whether separate or combined.[882]
In one case it was contended that the Act only applied to representations of human figures and animals. North, J., however, held that "any new and original sculpture" applied to any subject being matter of invention in sculpture, and that casts of fruit and leaves used for instruction in drawing were protected.[883] Carefully modelled toy soldiers have been protected as works of sculpture.[884]

The Sculpture must be First Published within the British Dominions.—The Act provides that protection shall run from the first publication of the work.[885] Before 1886 it is possible that first publication within the United Kingdom was required; now first publication anywhere within the British dominions will vest the copyright;[886] first publication outside the British dominions will destroy it.[887]

Publication.—A work of sculpture is published when the eye of the public[888] is allowed to rest upon it, that is to say, when the sculpture itself, and not merely a photographic copy or sketch, is so exhibited that the general public have an opportunity of viewing it.[889] Exhibition in any public gallery such as the Royal Academy would be publication; but a private view in the artist's studio would not be publication.

Author's Nationality.—It is extremely doubtful whether the author must not at the time of first publication bear some allegiance to the crown by virtue of nationality or residence.
If this is so in the case of books,[890] there seems to be no good ground for saying that the statute as to sculpture[891] was intended to be more generous to the foreigner than that as to books.[892]

Proprietor's Name and Date.—The protection given by the Sculpture Act is conditional on the proprietor or proprietors having caused his, her, or their name or names, with the date, to be put on every sculpture before the same shall be put forth or published.[893]

Proprietor's Name.[894]—As to what will probably be a sufficient statement of the proprietor's name, see the cases on engravings,[895] on which also the proprietor's name is required. As to this provision the two statutes seem to be in pari materia and the cases equally applicable to both.

Date.—It is not stated what date; but there can be no reasonable doubt that the date of first publication is intended. The older statute governing sculptures[896] (now repealed) required the proprietor's name and date of publication. The International Act, 7 & 8 Vict. c. 12, in reciting the provisions as to sculptures, runs "and by the said Acts[897] it is provided that the name of the proprietor, with the date of first publication thereof, is to be put on all such sculptures." It should be noticed, however, that both statutes were then in operation and 38 Geo. III. c. 71 had not yet been repealed, so that the recitation in 7 & 8 Vict. c. 12 may apply only to the provision in 38 Geo. III. c. 71, and is not necessarily explanatory of 54 Geo. III. c. 56. There can be no doubt, however, that the omission in 54 Geo. III. c. 56 to state what date was required was an oversight, and everything points to its being the date of first publication that is meant. The statutory protection begins then, and from then the duration of the copyright is measured, so that there is strong reason for the public being apprised of the date of first publication, while the date of making, which is the only other conceivable date, is of no importance.
When the date affixed was a date a few days before publication, Wright, J., held it was immaterial, as it would only shorten the term of the copyright.[898]

Immoral Works.—Profane, libellous, or indecent works will not be protected. There are no direct authorities in respect of unlawful works of sculpture, but, as in books,[899] paintings,[900] and engravings,[901] the general policy of the law not to take an account between wrong-doers will apply.

Duration of Protection.—Statutory protection commences on publication.[902] Before publication the unpublished work will be protected at common law from any use which may be made of it without the permission of the owner. After publication the statutory protection alone exists and subsists for fourteen years,[903] with a further term of fourteen years if at the expiration of the first term the person who originally made or caused the sculpture to be made is alive and has not parted with the copyright.[904]

Section II.—The Owner of the Copyright.

The Artist.—If a work of sculpture is made by an artist on his own behalf, he becomes on publication the proprietor of the copyright if before publication he has not assigned his interest in the work.

The Employer.—If one procures an artist to make a work of sculpture for him, the employer will be ab initio the owner of the copyright without any necessity for assignment from the artist. In order so to vest the work, the employer, it would seem, requires to take no part in the invention or design of the work. If he causes the work to be done, he comes within the Act. No valuable consideration need be shown.
The Assignee.—Assignment must be under seal, i. e. by a deed in writing signed by the proprietor in the presence of and attested by two or more credible witnesses.[905]

Section III.—Infringement of the Copyright.

Prohibited Acts and Remedies.—The Act (54 Geo. III. c. 56) gives to the proprietor the sole right and property of works in sculpture. The prohibited acts are[906]—

1. Making a pirated copy.
2. Importing a pirated copy.
3. Exposing for sale or otherwise disposing of a pirated copy.
4. Causing any of these acts to be done.

The remedy is an action at the suit of the proprietor for[907]—

i. Damages.
ii. Injunction.
iii. Costs—a full and reasonable indemnity.[908]

Guilty Knowledge.—Ignorance is no defence to an action in respect of any of the prohibited acts, even that of selling.

Limitation of Action.—All actions under the Act must be commenced within six months of the discovery of the offence sued on.

Copying for Private Use.—Either making or importing a single copy for private use would technically be an infringement. The prohibition is not limited to making or importing for sale, hire, exhibition, or distribution, as in the case of paintings, &c., under 25 & 26 Vict. c. 68, sec. 6.
What is a Piratical Copy.—A pirated copy may be produced by "moulding or copying from or imitating in any way" any of the matters or things put forth or published under the protection of the Act ... "to the detriment, damage, or loss of the proprietor."[909] The prohibition is against "imitating in any way." This prohibition does not seem so wide as that in 25 & 26 Vict. c. 68, which prohibits the multiplication of a painting or drawing or the design thereof. It is more similar to the prohibition in the Engraving Act, 8 Geo. II. c. 13, viz., against engraving, &c., or in any manner copying a copyright print. It seems therefore to be open to question, as with engravings, whether a piece of sculpture can be infringed except by some work of art which reproduces the peculiar art of the sculptor. Would a piece of sculpture be infringed by a picture, sketch, or engraving copying the design of the work?

Licence would be a defence, and it probably does not require to be in writing. There is nothing in the Act from which the necessity for a licence to be in writing could be implied.
CHAPTER VIII

COPYRIGHT IN PAINTINGS, DRAWINGS, AND PHOTOGRAPHS

Section I.—What Works are Protected.

The following works are protected under the Fine Arts Copyright Act, 1862:

1. Every original painting, drawing, and photograph:[910]
2. Not first published outside the British Dominions:[911]
3. The author of which is a British subject, or is resident within the dominions of the crown [when the work is made]:[912]
4. Which has been registered before infringement:[913]
5. And is innocent.[914]

Protection vests at the date of making, and endures for the author's life and seven years.[915] Protection is limited to the United Kingdom.[916]

Every Original Painting, Drawing, and Photograph.—There is no attempt to define what is a painting, drawing, or photograph within the meaning of the Act.[917]
The substances used in the making are no doubt immaterial, so long as the result is ejusdem generis with what is ordinarily meant by a picture, drawing, or photograph. A painting on the wall of a house would doubtless be protected, but not a design created by grouping figures in a tableau vivant.[918]

Originality as an essential of protection means that there must be something either in the design or execution of the work which is not merely copied from some other artistic work. The whole work need not be original. Thus the execution may be original but not the design, as in the case of a photograph of an old picture;[919] or part only of the design may be original, as in the case of the design of an old drawing added to or altered. In so far as the work is new there will be protection, but in so far as it is old there will be no protection.[920]

Artistic Merit.—The Court will not inquire as to whether a painting, drawing, or photograph is good, bad, or indifferent. If it consists in the representation of some object by means of light and shade or colour, it will suffice, and even the coarsest or the most commonplace, or the most mechanical representation of the commonest object would be protected, so that an exact reproduction of it, such as photography, for instance, would produce, would be an infringement of copyright.[921] Probably there must be a representation of some concrete object, real or imaginary. Protection, for instance, was refused to a label for Eau de Cologne,[922] which merely bore the legend "Johanna Maria Farina gegenüber dem Julichs Platz," written in copperplate with sundry dots and flourishes. It was held that any one who had a right to sell Farina's Eau de Cologne might manufacture and use the label, since, although the label was a trade mark, there was no copyright in it. A label with anything in the nature of a picture on it would undoubtedly be copyright, as the use to which a work of art is put is immaterial, but it is doubtful whether a label containing merely geometrical figures and fancy dots and lines would be protected under the Act of 1862. Probably it would not.
Publication outside the British Dominions.—Copyright in works of art under the Act of 1862 begins on the making thereof, and is not dependent on publication. It is immaterial where the work is made, whether in the British dominions or elsewhere, and it would be as immaterial where it was first published, or whether it was published or not, but for the provision of the International Copyright Act, 1844. Section 19 of this Act provides that the maker of a work of art which shall be first published out of the British dominions shall not have copyright therein otherwise than such as he may become entitled to under the International Acts; which means that where there is no treaty a work first published abroad is not protected at all. The result of this section was evidently not contemplated when the Fine Arts Act, 1862, was framed. There seems to be no doubt that the work, wherever made, will acquire copyright immediately on the making, but that that copyright may be lost if the work is published abroad before it is published in the British dominions.

Published.—A painting, drawing, or photograph is probably published when it is so exhibited as to give the public an opportunity of viewing it. The leading case on publication of works of art is Turner v. Robinson[923] in the Court of Chancery in Ireland. This case was decided before 1862, and therefore before there was any statutory copyright in paintings. The subject-matter was a painting from which certain stereoscopic views had been taken without the proprietor's consent. The painting had been previously, with the consent of the proprietor, published in the form of an engraving in a magazine, and exhibited at the Royal Academy in London and in Manchester. It was then exhibited with the proprietor's consent in Dublin for the purpose of obtaining contributors to a proposed engraving, and while it was so exhibited the defendant, without consent, copied it and produced his stereoscopic photographs. The Master of the Rolls[924] thought that the picture had never been published, because the exhibitions to the public in the Academies and in Dublin were on the condition that no copies should be taken, and the engraving in the magazine was not a publication of the picture, but only of a rough representation of it.
He therefore held that the common law right in the picture had not been lost by publication, and that the proprietor could recover against the taker of the stereoscopic views as against an infringer of common law rights. The Court of Appeal in Chancery upheld the judgment of the Master of the Rolls, but on different grounds. They said it was unnecessary to decide whether there had been publication in London and Manchester since, in their opinion, the act of the defendant in taking stereoscopic views from the painting was a breach of faith. He was admitted to the view in Dublin for one purpose only, i. e. to become if he wished a subscriber to an engraving; but he abused his privilege by taking a copy of the painting which might well compete with the plaintiff's proposed engraving. The defendant was, therefore, restrained on the ground of breach of faith or implied contract. In his judgment the Lord Chancellor disapproved of the view of the Master of the Rolls that there had been no publication in London or Manchester. He thought exhibition in the Academy, even although to a certain extent conditional, would be sufficient publication to vest the copyright, e. g. in a work of sculpture under the statutes applicable to such works. Exhibition in a public gallery, therefore, would be publication, but not a private view in the artist's studio to which only a small and selected portion of the public are invited. Whether the publication of a print would be publication of the picture from which it was taken, quære; the Master of the Rolls thought not, and on this point the Court of Appeal neither approved nor disapproved.

Nationality or Residence of Artist.—The protection of the Act is expressly limited to the works of British subjects and of such foreigners as are resident within the dominions of the Crown.[925] There is no direction in the statute as to the time when the author must possess the requisite nationality or residence. Must it be at the time of making or at the time of publishing, or both? It is submitted that it must be at the time of making, since copyright in the work vests at that time, and there may never be publication at all.
There seems to be no reason for suggesting that the date to be looked at is the date of publication, except that the next words in the section provide that the work may be made anywhere, and the proviso as to the residence of the author, if applied at the date of making, means that—

1. A work by a British subject may be made anywhere; but,
2. A work by an alien must be made within the dominions of the Crown.

There does not seem to be anything absurdly contradictory in this, and there is, on the other hand, a patent absurdity in not being able to determine whether the author is an author within the Act until long after the right has begun to run.

Registration.—A condition precedent to protection is registration in the book kept at the Hall of the Stationers' Company.

The Requisite Entry.—There must be registered:

1. Name and place of abode of the author.
2. Name and place of abode of the proprietor.
3. Short description of the nature and subject of the work.

And, if desired,

4. A sketch or outline or photograph of the work.

The wording of section 4 of the Act of 1862 providing for compulsory registration is very confused, the requirements on first registration being unaccountably mixed up with the requirements on subsequent assignment.
On first registration, whenever it takes place, it is submitted that the particulars entered should be as above.[926] The author and proprietor may very likely be the same individual, in which case the one name will be entered twice, once under each description. It would probably not be sufficient merely to enter the author's name once as author and leave it to be implied that he is the owner. Even if the author and proprietor are different persons, either because the author has been employed for valuable consideration or because he has granted an assignment, the particulars to be entered on first registration are the same, no entry of the terms of employment or assignment being necessary.[927] The real proprietor must be on the register, and if the wrong person is registered as proprietor it will not give a cause of action to join such person as co-plaintiff with the real proprietor who is not on the register.[928]

As in the Literary Copyright Act, copyright in the work exists before registration, but no action is maintainable without registration, and under this Act even after registration there is no remedy in respect of infringement committed before registration.[929] It need hardly be said that the necessity of registration only applies to an action on copyright proper; an action will lie without registration on breach of contract, express or implied,[930] and probably on the common law right of an author and his assigns in unpublished work.[931] If an unauthorised copy is made before the proprietor is registered but sold afterwards, an action for damages will lie for the offence of selling such copies, but no action for penalties.[932] No action at all will lie for making.[933]

If an action is brought by an assignee, such assignee must be on the register as proprietor,[934] and it will not avail to join as co-plaintiff an unregistered assignee with the assignor who, although registered, has parted with the copyright.[935]
An assignee taking from a registered assignor probably cannot sue in respect of acts of infringement committed before the registration of the assignment.[936] It is not necessary that the original proprietor, whether author or employer, should have been registered,[937] but once registration has been effected it would seem that all future assignments must be entered on the register.[938]

The registration by an assignee under an assignment, subsequent to first registration, must contain the following particulars:[939]

1. Date of assignment.
2. Names of parties to the assignment.
3. Name and place of abode of the assignee.
4. Name and place of abode of the author.
5. Short description of nature and subject of the work.

And, if desired,

6. A sketch or outline or photograph of the work.

The enactments of 5 & 6 Vict. c. 45 (the Literary Copyright Act) as to—

1. Keeping the Register Book;
2. Searches and certified copies therefrom;
3. False entries;
4. Application to expunge,

apply mutatis mutandis to the registration of paintings, drawings, and photographs. The charge for making an entry is one shilling.
Name.—The trading style of a firm is a sufficient registration of the name of a proprietor.

Place of Abode.—The place where a man can readily be found on inquiry is sufficient. A business address is a place of abode within the statute.

Short Description of the Nature and Subject of the Work.—The title of the work will sometimes be a sufficient description. The following were held sufficient descriptions of Sir John Millais' well-known pictures, viz.: Painting in oil, 'Ordered on Foreign Service'; Painting in oil, 'My First Sermon'; Photograph, 'My Second Sermon.'[940] Blackburn, J., said: "It is the object of the legislature that enough be stated to identify the production, and that the registration must be bonâ fide, that a man shall not first claim one thing and then sue for another. The description must be such as shall earmark the subject.... The picture 'Ordered on Foreign Service' represents an officer who is ordered abroad taking leave of a lady, and no one can doubt that is the picture intended.... There may be a few instances in which the mere registration of the name of the picture is not sufficient: for instance, Sir Edwin Landseer's picture of a Newfoundland dog might possibly be insufficiently registered under the description of 'A Distinguished Member of the Humane Society.' So also of a bullfinch and a couple of squirrels described as 'Piper and a Pair of Nut-crackers.' ... It would be advisable for a person proposing to register to add a sketch or outline of the work."[941] In the learned judge's opinion a deficient description, although it would not be sufficient in itself, may be made sufficient by the addition of a photograph, sketch, or outline. It would seem, however, that there must be a description of some kind, and that a photograph or sketch would not by itself be sufficient.
Immoral Works.—There will be no copyright in profane, libellous, or indecent[942] works of art.

Duration of Protection.—The copyright under the Fine Arts Act endures for the term of the natural life of the author and seven years after his death.[943] Copyright will cease if and when any painting or drawing, or the negative of any photograph, is sold by the first owner thereof without either the express reservation in writing of such copyright to the vendor signed by the vendee or his agent, or the express assignment in writing of such copyright to the vendee signed by the vendor or his agent.[944] The copyright will also cease (probably) if the work is published out of the British dominions before publication within the dominions.[945]

Section II.—The Owner of the Copyright.

The Author.—The copyright is given to the author and his assigns, except when the work is executed for or on behalf of any other person for a good or valuable consideration.[946] The author is the actual artist whose mind has created the work.[947] The giving of ideas and suggestions to another is not sufficient to constitute an author,[948] but, on the other hand, there might be an author who had done little or nothing of the manual work required in the execution. In Nottage v. Jackson the question of authorship in works of art was fully discussed. Brett, M. R., said: "The author of a painting is the man who paints it, the author of a drawing is the man who draws it,... of a photograph the author is the person who effectively is as near as he can be the cause of the picture which is produced, that is, the person who has superintended the arrangement, who has actually formed the picture by putting the people into position and arranging the place in which the people are to be—the man who is the effective cause of that.
Although he may only have done it by standing in the room and giving orders about it, still it is his mind and act, as far as anybody's mind and act are concerned, which is the effective cause of the picture such as it is when it is produced." Cotton, L. J., in the same case, said: "In my opinion 'author' involves originating, making, producing, as the inventive or master mind, the thing which is to be protected, whether it be a drawing or a painting or a photograph.... It is not the person who suggests the idea but the person who makes the painting or drawing who is the author."

The Employer.—When an artistic work, protected by 25 & 26 Vict. c. 68, is executed by the author for or on behalf of any other person for a good or valuable consideration, the copyright vests in the employer and his assigns, unless it be expressly reserved to the author by agreement in writing signed by the employer.[949] This provision applies to the everyday case of a person employing and paying a painter or photographer to take his portrait. The copyright vests in the customer.[950]

The case, however, is not always so simple. Difficult questions arise where the artist, usually a photographer, requests the sitter, probably an actress or athlete, to allow his portrait to be taken on the understanding that the artist may publish and sell copies.[951] The sitter probably receives free copies or copies at a reduced price. The difficulties to be solved are purely questions of fact in each case, viz.:
1. Was the portrait taken for or on behalf of some person other than the artist?
2. Did the artist receive good and valuable consideration?

As a rule, where a photographer invites celebrities to sit for him, the understanding will be that the portrait is taken on the photographer's behalf;[952] but at the same interview some plates might be taken on behalf of the photographer and some on behalf of the sitter.[953] The valuable consideration received by the photographer need not be a money payment, but may consist merely in the right given to him to publish and sell copies.[954] When a managing director of a company employed A to make drawings for a trade catalogue, the letterpress of which he wrote himself, it was held that he was acting merely as agent for the company, and that, as the drawings were made not on his behalf but on behalf of the company, he was not the proprietor.[955]

The Assignee.—Assignment is required to be by some note or memorandum in writing signed by the proprietor of the copyright or his agent appointed for that purpose in writing.[956] Registration is not necessary to effect assignment,[957] although the assignee must be registered before he can sue.[958] No particular words are required in an assignment,[959] but there must be a present grant and not only an executory contract.[960]

Partial Assignment.—It is doubtful whether a copyright can be partially assigned, either limited as to a copying of a particular kind or limited as to place or time.[961] What is called by the parties an assignment may only amount to a licence. In Lucas v. Cooke[962] the proprietor of the copyright in a picture granted the following document to an engraver: "I assign to you for the purposes of an engraving of one size the copyright of the picture painted by Mr. E. V. Eddie, entitled Going to Work, and being a portrait of my daughter."
Fry, J., said: "The result of this instrument in my view was that after the preparation of the engraving and the registration, Mr. Lucas (the engraver) became the owner of the copyright of the print or engraving, and Mr. Halford remained the owner of the copyright of the painting." It was held that the engraver, in order to succeed against a copyist, would have to show that the alleged infringement was a copy of his engraving; another copy of the picture itself was no infringement of his rights. The transaction was a licence, and probably a licensee can never sue in his own name. In one case,[963] however, Mathew, J., held that a sole licensee for a limited time could sue, and did not require to be registered. The plaintiff had acquired from the proprietor of the copyright in a picture the sole right to reproduce it in chromo for two years. The defendants also produced a chromo of the picture, taken directly from the picture and not from the plaintiff's chromo. Mathew, J., held that the plaintiff, as sole licensee, was entitled to prevent any one infringing his right, and that, being a licensee and not an assignee, his name was not required to be on the register. This is a very doubtful decision.

Section III.—Infringement.

Prohibited Acts and Remedies.—The right given is the sole and exclusive right of copying, engraving, reproducing, and multiplying a painting or drawing and the design thereof, or a photograph and the negative thereof, by any means and of any size.[964] It is an offence for the author having parted with the copyright, or for any other person not being the proprietor[965]—
1. To repeat, copy, colourably imitate or otherwise multiply for sale, hire, exhibition, or distribution.
2. Knowingly to import into the United Kingdom, or sell, publish, let to hire, exhibit, or distribute, or offer for sale, hire, exhibition, or distribution any copy unlawfully made.

And for any of the above offences an action lies at the instance of the proprietor for[966]—

i. Sum not exceeding £10 on each copy made or dealt with.[967]
ii. Forfeiture of copies to the proprietor.[968]
iii. Inspection and account.[969]
iv. Damages.[970]
v. Injunction.[971]

Penalties and forfeiture of copies may also be obtained by summary proceedings before any two justices having jurisdiction where the party offending resides.[972]

It is further an offence—

3. Innocently to import or sell, publish, let to hire, exhibit, or distribute, or offer for sale, hire, exhibition, or distribution any copy made without the owner's consent.

For any of which an action lies at the instance of the proprietor of the copyright for[973]—