
MIMO-OFDM COMMUNICATION SYSTEMS:
CHANNEL ESTIMATION AND WIRELESS
LOCATION

A Dissertation

Submitted to the Graduate Faculty of the
Louisiana State University and
Agricultural and Mechanical College
in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy

in

The Department of Electrical and Computer Engineering

by
Zhongshan Wu
B.S., Northeastern University, China, 1996
M.S., Louisiana State University, US, 2001
May 2006

Acknowledgments

Throughout my six years at LSU, I have many people to thank for helping to

make my experience here both enriching and rewarding.

First and foremost, I wish to thank my advisor and committee chair, Dr. Guoxiang Gu. I am grateful to Dr. Gu for offering me such an invaluable chance to study here, for being a constant source of research ideas, insightful discussions and inspiring words in times of need, and for his rigorous attitude toward academic research, which will shape my career forever.

My heartfelt appreciation also goes to Dr. Kemin Zhou, whose breadth of knowledge and perspective have instilled in me a great interest in bridging theoretical research and practical implementation. I would like to thank Dr. Shuangqing Wei for the fresh talks in his seminar and for generously sharing research resources with us.

I am deeply indebted to Dr. John M. Tyler for taking the time to serve as a member of my graduate committee and for his sincere encouragement. For providing me with the mathematical knowledge and skills imperative to the work in this dissertation, and for his precious time, I would like to thank my minor professor, Dr. Peter Wolenski.

To all my EE friends, Jianqiang He, Bin Fu, Nike Liu, Xiaobo Li, Rachinayani Kumar Phalguna and Shuguang Hao: I cherish all the wonderful time we have had together.

Through it all, I owe the greatest debt to my parents and my sisters, and especially to my father, who will live in my memory forever.

Zhongshan Wu

October, 2005


Contents

Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

List of Figures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

Notation and Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

List of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 OFDM System Model . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Dissertation Contributions . . . . . . . . . . . . . . . . . . . . . . . . 24
1.3 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . 27

2 MIMO-OFDM Channel Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2 System Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.1 Signal Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2.2 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Channel Estimation and Pilot-tone Design . . . . . . . . . . . . . . . 46
2.3.1 LS Channel Estimation . . . . . . . . . . . . . . . . . . . . . . 46
2.3.2 Pilot-tone Design . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . 53
2.4 An Illustrative Example and Concluding
Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4.1 Comparison With Known Result . . . . . . . . . . . . . . . . 54
2.4.2 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . 59


3 Wireless Location for OFDM-based Systems . . . . . . . . . . . . . . . . . . . . . . 62
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.1.1 Overview of WiMax . . . . . . . . . . . . . . . . . . . . . . . 62
3.1.2 Overview to Wireless Location System . . . . . . . . . . . . . 65
3.1.3 Review of Data Fusion Methods . . . . . . . . . . . . . . . . . 70
3.2 Least-square Location based on TDOA/AOA Estimates . . . . . . . . 78
3.2.1 Mathematical Preparations . . . . . . . . . . . . . . . . . . . 78
3.2.2 Location based on TDOA . . . . . . . . . . . . . . . . . . . . 83
3.2.3 Location based on AOA . . . . . . . . . . . . . . . . . . . . . 94
3.2.4 Location based on both TDOA and AOA . . . . . . . . . . . . 100
3.3 Constrained Least-square Optimization . . . . . . . . . . . . . . . . . 105
3.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127


List of Figures

1.1 Comparison between conventional FDM and OFDM . . . . . . . . . . 7

1.2 Graphical interpretation of OFDM concept . . . . . . . . . . . . . . . 9

1.3 Spectra of (a) an OFDM subchannel (b) an OFDM symbol . . . . . . 10

1.4 Preliminary concept of DFT . . . . . . . . . . . . . . . . . . . . . . . 11

1.5 Block diagram of a baseband OFDM transceiver . . . . . . . . . . . . 13

1.6 (a) Concept of CP; (b) OFDM symbol with cyclic extension . . . . . 16

2.1 Nt × Nr MIMO-OFDM System model . . . . . . . . . . . . . . . . . 34

2.2 The concept of pilot-based channel estimation . . . . . . . . . . . . . 43

2.3 Pilot placement with Nt = Nr = 2 . . . . . . . . . . . . . . . . . . . . 52

2.4 Symbol error rate versus SNR with Doppler shift=5 Hz . . . . . . . . 56

2.5 Symbol error rate versus SNR with Doppler shift=40 Hz . . . . . . . 57

2.6 Symbol error rate versus SNR with Doppler shift=200 Hz . . . . . . . 57

2.7 Normalized MSE of channel estimation based on optimal pilot-tone

design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

2.8 Normalized MSE of channel estimation based on preamble design . . 58

3.1 Network-based wireless location technology (outdoor environments) . 67


3.2 TOA/TDOA data fusion using three BSs . . . . . . . . . . . . . . . . 70

3.3 AOA data fusion with two BSs . . . . . . . . . . . . . . . . . . . . . 74

3.4 Magnitude-based data fusion in WLAN networks . . . . . . . . . . . 77

3.5 Base stations and mobile user locations . . . . . . . . . . . . . . . . . 110

3.6 Location estimation with TDOA-only and AOA+TDOA data . . . . 112

3.7 Location estimation performance . . . . . . . . . . . . . . . . . . . . 113

3.8 Effect of SNR on estimation accuracy . . . . . . . . . . . . . . . . . . 113

3.9 Outage curve for location accuracy . . . . . . . . . . . . . . . . . . . 114

Notation and Symbols

$A_{M \times N}$: M-row N-column matrix
$A^{-1}$: Inverse of A
$\mathrm{Tr}(A)$: Trace of A, $\mathrm{Tr}(A) = \sum_i A_{ii}$
$A^T$: Transpose of A
$A^*$: Complex conjugate transpose of A
$I_N$: Identity matrix of size $N \times N$


List of Acronyms

MIMO multiple input and multiple output
OFDM orthogonal frequency division multiplexing
LS least square
MS mobile station
TDOA time difference of arrival
AOA angle of arrival
WiMax worldwide interoperability for microwave access
ML maximum-likelihood
AWGN additive white Gaussian noise
WMAN wireless metropolitan area network
ICI inter-carrier interference
ISI inter-symbol interference
FFT fast Fourier transform
WLAN wireless local area network
CP cyclic prefix
BER bit error rate
MMSE minimum mean squared error
GPS global positioning system
WiFi wireless fidelity


Abstract

In this new information age, high data rates and strong reliability characterize our wireless communication systems and are becoming the dominant factors for a successful deployment of commercial networks. MIMO-OFDM (multiple input multiple output-orthogonal frequency division multiplexing), a new wireless broadband technology, has gained great popularity for its capability of high rate transmission and its robustness against multi-path fading and other channel impairments.

A major challenge for MIMO-OFDM systems is how to obtain the channel state information accurately and promptly for coherent detection of information symbols and channel synchronization. In the first part, this dissertation formulates the channel estimation problem for MIMO-OFDM systems and proposes a pilot-tone based estimation algorithm. A complex equivalent baseband MIMO-OFDM signal model is presented in matrix form. By choosing L equally-spaced and equally-powered pilot tones from N sub-carriers in one OFDM symbol, a down-sampled version of the original signal model is obtained. Furthermore, this signal model is transformed into a linear form solvable by the LS (least-square) estimation algorithm. Based on the resultant model, a simple pilot-tone design is proposed in the form of a unitary matrix, whose rows stand for different pilot-tone sets in the frequency domain and whose columns represent distinct transmit antennas in the spatial domain. From the analysis and synthesis of the pilot-tone design in this dissertation, our estimation algorithm reduces the computational complexity inherent in MIMO systems, because the pilot-tone matrix is essentially a unitary matrix, and it is proven to be an optimal channel estimator in the sense of achieving the minimum MSE (mean squared error) of channel estimation for a fixed power of pilot tones.

In the second part, this dissertation addresses the wireless location problem in WiMax (worldwide interoperability for microwave access) networks, which are mainly based on the MIMO-OFDM technology. From the measurement data of TDOA (time difference of arrival), AOA (angle of arrival) or a combination of the two, a quasi-linear form is formulated for an LS-type solution. It is assumed that the observation data is corrupted by zero-mean AWGN (additive white Gaussian noise) with a very small variance. Under this assumption, the noise term in the quasi-linear form is proved to be approximately normally distributed. Hence the ML (maximum-likelihood) estimation and the LS-type solution are equivalent. But the ML estimation technique is not feasible here due to its computational complexity and the possible nonexistence of the optimal solution. Our proposed method is capable of estimating the MS (mobile station) location very accurately with far less computation. A final result of the MS location estimation, however, cannot be obtained directly from the LS-type solution without bringing in another independent constraint. To solve this problem, the Lagrange multiplier method is explored to find the optimal solution to the constrained LS-type optimization problem.


Chapter 1

Introduction

Wireless technologies have evolved remarkably since Guglielmo Marconi first demonstrated radio’s ability to provide continuous contact with ships sailing in the English Channel in 1897. New theories and applications of wireless technologies have been developed by hundreds and thousands of scientists and engineers throughout the world ever since. Wireless communications can be regarded as the most important development, with an extremely wide range of applications from TV remote controls and cordless phones to cellular phones and satellite-based TV systems. It has changed people’s lifestyle in every aspect. Especially during the last decade, the mobile radio communications industry has grown at an exponentially increasing rate, fueled by advances in digital and RF (radio frequency) circuit design, fabrication and integration techniques and by more computing power in chips. This trend will continue at an even greater pace in the near future.

The advances and developments in the technical field have partially helped to realize our dreams of fast and reliable communication “any time, anywhere”. But we expect more from this wireless world, such as wireless Internet surfing, interactive multimedia messaging and so on. One natural question is: how can we put high-rate data streams over radio links to satisfy our needs? New wireless broadband access techniques are anticipated to answer this question. For example, the coming 3G (third generation) cellular technology can provide us with up to 2 Mbps (megabits per second) data service. But that still does not meet the data rate required by multimedia communications like HDTV (high-definition television) and video conferencing. Recently MIMO-OFDM systems have gained considerable attention from leading industry companies and the active academic community [28, 30, 42, 50]. A collection of problems including channel measurements and modeling, channel estimation, synchronization, IQ (in phase-quadrature) imbalance and PAPR (peak-to-average power ratio) have been widely studied by researchers [48, 11, 14, 15, 13]. Clearly all the performance improvement and capacity increase are based on accurate channel state information, so channel estimation plays a significant role in MIMO-OFDM systems. For this reason, the first part of my dissertation addresses channel estimation for MIMO-OFDM systems.

The maturing of MIMO-OFDM technology will lead it to a much wider variety of applications. WMAN (wireless metropolitan area network) has adopted this technology. Similar to current network-based wireless location techniques [53], we consider the wireless location problem on the WiMax network, which is based on MIMO-OFDM technology. The work in this area constitutes the second part of my dissertation.


1.1 Overview

OFDM [5] is becoming a very popular multi-carrier modulation technique for transmission of signals over wireless channels. It converts a frequency-selective fading channel into a collection of parallel flat fading subchannels, which greatly simplifies the structure of the receiver. The time domain waveforms of the subcarriers are orthogonal (subchannel and subcarrier will be used interchangeably hereinafter), yet the signal spectra corresponding to different subcarriers overlap in the frequency domain. Hence, the available bandwidth is utilized very efficiently in OFDM systems without causing ICI (inter-carrier interference). By combining multiple low-data-rate subcarriers, OFDM systems can provide a composite high data rate with a long symbol duration. That helps to eliminate the ISI (inter-symbol interference), which often occurs with signals of a short symbol duration in a multipath channel. Its pros and cons can be listed as follows [31].

Advantages of OFDM systems are:

• High spectral efficiency;

• Simple implementation by FFT (fast Fourier transform);

• Low receiver complexity;

• Robustness for high-data-rate transmission over multipath fading channels;

• High flexibility in terms of link adaptation;

• Low complexity multiple access schemes such as orthogonal frequency division multiple access.

Disadvantages of OFDM systems are:

• Sensitive to frequency offsets, timing errors and phase noise;

• Relatively higher peak-to-average power ratio compared to single carrier system,

which tends to reduce the power efficiency of the RF amplifier.

1.1.1 OFDM System Model

The OFDM technology is widely used in two types of working environments, i.e., a wired environment and a wireless environment. When used to transmit signals through wires like twisted wire pairs and coaxial cables, it is usually called DMT (discrete multi-tone). For instance, DMT is the core technology for all the xDSL (digital subscriber line) systems, which provide high-speed data service via existing telephone networks. In a wireless environment such as radio broadcasting systems and WLANs (wireless local area networks), however, it is referred to as OFDM. Since we aim at performance enhancement for wireless communication systems, we use the term OFDM throughout this thesis. Furthermore, we use the term MIMO-OFDM only when explicitly addressing OFDM systems combined with multiple antennas at both ends of a wireless link.

The history of OFDM dates all the way back to the mid 1960s, when Chang [2] published a paper on the synthesis of bandlimited orthogonal signals for multichannel data transmission. He presented a new principle of transmitting signals simultaneously over a bandlimited channel without ICI and ISI. Shortly after Chang’s paper, Saltzberg [3] analyzed the performance of efficient parallel data transmission systems in 1967, where he concluded that “the strategy of designing an efficient parallel system should concentrate more on reducing crosstalk between adjacent channels than on perfecting the individual channels themselves”. His conclusion has proven far-sighted in today’s digital baseband signal processing, which still has to battle the ICI.

Over the development of OFDM technology, two remarkable contributions transformed the original “analog” multicarrier system into today’s digitally implemented OFDM. The use of the DFT (discrete Fourier transform) to perform baseband modulation and demodulation was the first milestone, introduced by Weinstein and Ebert [4] in 1971. Their method eliminated the banks of subcarrier oscillators and coherent demodulators required by frequency-division multiplexing and hence reduced the cost of OFDM systems. Moreover, DFT-based frequency-division multiplexing can be completely implemented in digital baseband, not by bandpass filtering, for highly efficient processing. The FFT, a fast algorithm for computing the DFT, further reduces the number of arithmetic operations from the order of $N^2$ to the order of $N \log N$ ($N$ is the FFT size). Recent advances in VLSI (very large scale integration) technology have made high-speed, large-size FFT chips commercially available. In their paper [4], Weinstein and Ebert used a guard interval between consecutive symbols and raised-cosine windowing in the time domain to combat the ISI and the ICI. But their system could not keep perfect orthogonality between subcarriers over a time dispersive channel. This problem was first tackled by Peled and Ruiz [6] in 1980 with the introduction of the CP (cyclic prefix), or cyclic extension. They filled the empty guard interval with a cyclic extension of the OFDM symbol. If the CP is longer than the impulse response of the channel, the ISI can be eliminated completely. Furthermore, the CP makes the channel effectively perform a cyclic convolution, which implies orthogonality between subcarriers over a time dispersive channel. Though this introduces an energy loss proportional to the length of the CP when the CP part of the received signal is removed, the zero ICI generally justifies the loss. This is the second major contribution to OFDM systems.
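The cyclic-prefix argument above can be checked numerically. The following is a minimal sketch, not the dissertation's own code: it assumes a single isolated, noise-free OFDM symbol, an illustrative size N = 8 with a 3-sample CP, and a hypothetical 3-tap channel. It passes the CP-extended symbol through a linear convolution, discards the CP, and compares the DFT of the result with the per-subcarrier product of the channel frequency response and the transmitted symbols.

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]

def idft(X):
    """Naive inverse DFT with the 1/N scaling."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts for n in range(n_pts)]

def linear_convolve(h, u):
    """Plain linear convolution, as produced by an FIR channel."""
    out = [0j] * (len(h) + len(u) - 1)
    for i, h_i in enumerate(h):
        for j, u_j in enumerate(u):
            out[i + j] += h_i * u_j
    return out

def cp_demo():
    """Return the largest deviation between Y[k] and H[k]*S[k]."""
    n_fft, n_cp = 8, 3                      # illustrative sizes (assumed)
    S = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]  # QPSK points
    h = [0.8, 0.5j, -0.3]                   # hypothetical taps, memory <= CP
    s = idft(S)                             # time-domain OFDM symbol
    u = s[-n_cp:] + s                       # prepend the cyclic prefix
    r = linear_convolve(h, u)               # channel output, no noise
    y = r[n_cp:n_cp + n_fft]                # drop the CP, keep N samples
    Y = dft(y)
    H = dft(h + [0] * (n_fft - len(h)))     # channel frequency response
    return max(abs(Y[k] - H[k] * S[k]) for k in range(n_fft))

if __name__ == "__main__":
    print(cp_demo())  # rounding-level error: each subcarrier sees a flat gain
```

Because the retained samples coincide with a circular convolution of the symbol with the channel, each subcarrier is simply scaled by one complex gain, which is the zero-ICI property described in the text.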

As OFDM systems find more applications, the performance requirements are becoming more demanding. Hence more research effort is being poured into the investigation of OFDM systems. Pulse shaping [7, 8] is beneficial from an interference point of view, since the spectrum of an OFDM signal can be shaped to be better localized in frequency. Synchronization [9, 10, 11] in the time domain and in the frequency domain renders OFDM systems robust against timing errors, phase noise, sampling frequency errors and carrier frequency offsets. For coherent detection, channel estimation [46, 49, 48] provides accurate channel state information to enhance the performance of OFDM systems. Various techniques, such as clipping and peak windowing, are used to reduce the relatively high PAPR [12, 13].


The principle of OFDM is to divide a single high-data-rate stream into a number of lower rate streams that are transmitted simultaneously over some narrower subchannels. Hence it is not only a modulation (frequency modulation) technique, but also a multiplexing (frequency-division multiplexing) technique. Before we mathematically describe the transmitter-channel-receiver structure of OFDM systems, a couple of graphical intuitions will make it much easier to understand how OFDM works. OFDM starts with the “O”, i.e., orthogonal. That orthogonality distinguishes OFDM from conventional FDM (frequency-division multiplexing) and is the source of all the advantages of OFDM. The difference between OFDM and conventional FDM is illustrated in Figure 1.1.

Figure 1.1: Comparison between conventional FDM and OFDM. Both panels plot power against frequency for channels Ch1 to Ch5: (a) conventional FDM; (b) OFDM, with the saving of bandwidth marked.

As can be seen from Figure 1.1, to implement conventional parallel data transmission by FDM, a guard band must be introduced between the different carriers to eliminate the interchannel interference. This leads to an inefficient use of the scarce and expensive spectrum resource, which stimulated the search for an FDM scheme with overlapping multicarrier modulation in the mid-1960s. To realize the overlapping multicarrier technique, however, we need to get rid of the ICI, which means that we need perfect orthogonality between the different modulated carriers. The word “orthogonality” implies that there is a precise mathematical relationship between the frequencies of the individual subcarriers in the system. In OFDM systems, if the OFDM symbol period is $T_{sym}$, then the minimum subcarrier spacing is $1/T_{sym}$. Under this strict mathematical constraint, integrating the product of the received signal and any one subcarrier $f_{sub}$ over one symbol period $T_{sym}$ extracts that subcarrier only, because the integral of the product of $f_{sub}$ and any other subcarrier over $T_{sym}$ is zero. That indicates no ICI in the OFDM system while achieving almost 50% bandwidth savings. In the sense of multiplexing, we refer to Figure 1.2 to illustrate the concept of OFDM. Every $T_{sym}$ seconds, a total of $N$ complex-valued numbers $S_k$, taken from QAM/PSK (quadrature amplitude modulation/phase shift keying) constellation points, are used to modulate $N$ different complex carriers centered at frequencies $f_k$, $1 \le k \le N$. The composite signal is obtained by summing up all the $N$ modulated carriers.
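The orthogonality condition described above can be illustrated with a short numerical sketch. It assumes complex exponential subcarriers at frequencies k/Tsym and approximates the integral over one symbol period by a uniform sum; the function name and the sample count are illustrative choices, not taken from the dissertation.

```python
import cmath

def subcarrier_inner_product(k1, k2, n_samples=64):
    """Approximate (1/Tsym) * integral over one symbol period of
    exp(j*2*pi*f1*t) * conj(exp(j*2*pi*f2*t)), with f1 = k1/Tsym and
    f2 = k2/Tsym, using n_samples uniformly spaced points."""
    total = 0j
    for n in range(n_samples):
        t = n / n_samples                   # time normalized by Tsym
        total += (cmath.exp(2j * cmath.pi * k1 * t)
                  * cmath.exp(-2j * cmath.pi * k2 * t))
    return total / n_samples

if __name__ == "__main__":
    # Same subcarrier: the correlation has unit magnitude.
    print(abs(subcarrier_inner_product(3, 3)))
    # Spacing is a multiple of 1/Tsym: the correlation vanishes (no ICI).
    print(abs(subcarrier_inner_product(3, 5)))
    # Spacing of half of 1/Tsym: orthogonality is lost.
    print(abs(subcarrier_inner_product(3, 3.5)))
```

The third call shows why the spacing constraint is strict: a carrier offset by a non-integer multiple of 1/Tsym leaks into its neighbor.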

Figure 1.2: Graphical interpretation of OFDM concept. Each symbol $S_k$ multiplies its carrier $e^{j2\pi f_k t}$ to give $s_k(t)$, $k = 1, \ldots, N$, and the branches are summed to form the OFDM symbol.

It is worth noting that OFDM achieves frequency-division multiplexing by baseband processing rather than by bandpass filtering. Indeed, as shown in Figure 1.3, the individual spectra have a sinc shape. Even though they are not bandlimited, each subcarrier can still be separated from the others, since orthogonality guarantees that the interfering sincs have nulls at the frequency where the sinc of interest has a peak.

Figure 1.3: Spectra of (a) an OFDM subchannel (b) an OFDM symbol

The use of the IDFT (inverse discrete Fourier transform), instead of local oscillators, was an important breakthrough in the history of OFDM. It is an essential part of OFDM systems today. It transforms the data from the frequency domain to the time domain. Figure 1.4 shows the preliminary concept of the DFT used in an OFDM system. When the DFT of a time domain signal is computed, the frequency domain results are a function of the sampling period $T$ and the number of sample points $N$. The fundamental frequency of the DFT is equal to $\frac{1}{NT}$ (1/total sample time). Each frequency represented in the DFT is an integer multiple of the fundamental frequency. The maximum frequency that can be represented by a time domain signal sampled at rate $\frac{1}{T}$ is $f_{max} = \frac{1}{2T}$, as given by the Nyquist sampling theorem. This frequency is located in the center of the DFT points. The IDFT performs exactly the opposite operation to the DFT. It takes a signal defined by frequency components and converts them to a time domain signal. The time duration of the IDFT time signal is equal to $NT$. In essence, the IDFT and the DFT form a reversible pair. The IDFT need not be the one used at the transmitter: it is perfectly valid to use the DFT at the transmitter and the IDFT at the receiver.
Figure 1.4: Preliminary concept of DFT. Top: time-domain signal s(t) with sample period T and total duration NT. Bottom: spectrum S(f) with frequency axis marked 0, 1/NT, 2/NT, ..., (N-1)/NT.
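The frequency-grid relations stated for the DFT can be written out as a small sketch. The helper names are hypothetical, and the values N = 64 and T = 50 ns are chosen purely for illustration.

```python
def dft_bin_frequencies(n_points, sample_period):
    """Frequencies (Hz) of the DFT points: integer multiples of the
    fundamental frequency 1/(N*T)."""
    fundamental = 1.0 / (n_points * sample_period)
    return [k * fundamental for k in range(n_points)]

def nyquist_frequency(sample_period):
    """Maximum representable frequency 1/(2*T) for sampling rate 1/T."""
    return 1.0 / (2.0 * sample_period)

if __name__ == "__main__":
    n_points, sample_period = 64, 50e-9     # assumed illustrative values
    bins = dft_bin_frequencies(n_points, sample_period)
    print(bins[1])                          # fundamental frequency 1/(N*T)
    print(bins[n_points // 2])              # center point of the DFT grid
    print(nyquist_frequency(sample_period)) # agrees with the center point
```

The center point of the grid, index N/2, lands on 1/(2T), which matches the statement that the Nyquist frequency sits in the middle of the DFT points.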

After the graphical description of the basic principles of OFDM such as orthogonality, frequency modulation and multiplexing, and the use of the DFT in baseband processing, it is time to look in more detail at the signals flowing between the blocks of an OFDM system and their mathematical relations. At this point, we employ the following assumptions for the OFDM system we consider.

• a CP is used;

• the channel impulse response is shorter than the CP;

• there is perfect synchronization between the transmitter and the receiver;

• channel noise is additive, white and complex Gaussian;

• the fading is slow enough for the channel to be considered constant during the transmission of one OFDM symbol.

For a tractable analysis of OFDM systems, we follow common practice and use a simplified mathematical model. Though the first OFDM system was implemented with analog technology, here we choose to investigate a discrete-time model of OFDM step by step, since digital baseband synthesis is widely exploited in today’s OFDM systems. Figure 1.5 shows a block diagram of a baseband OFDM modem which is based on the PHY (physical layer) of IEEE standard 802.11a [37].

Before describing the mathematical model, we define the symbols and notations used in this dissertation. Capital and lower-case letters denote signals in the frequency domain and in the time domain respectively. An arrow bar indicates a vector and a boldface letter without an arrow bar represents a matrix. The notation is summarized in the table below.

Figure 1.5: Block diagram of a baseband OFDM transceiver. Transmit chain: binary input data, channel coding, interleaving, QAM mapping, pilot insertion, S/P, IFFT, P/S, add CP, DAC, RF TX. Receive chain: RF RX, ADC, remove CP, S/P, FFT, P/S, detection, demapping, deinterleaving, decoding, binary output data, supported by channel estimation and by timing and synchronization blocks. Signal labels shown: S(m), s(m), u(m), r(m), y(m), Y(m).

Symbol            Meaning
A_{p×q}           p × q matrix
\vec{a}           column vector
I_p               p × p identity matrix
0                 zero matrix
diag(\vec{a})     diagonal matrix with \vec{a}’s elements on the diagonal
A^T               transpose of A
A^*               complex conjugate of A
A^H               Hermitian of A
tr(A)             trace of A
rank(A)           rank of A
det(A)            determinant of A
A ⊗ B             Kronecker product of A and B

As shown in Figure 1.5, the input serial binary data is first processed by a data scrambler, and then channel coding is applied to improve the BER (bit error rate) performance of the system. The encoded data stream is further interleaved to reduce the burst symbol error rate. Depending on channel conditions such as fading, different base modulation modes such as BPSK (binary phase shift keying), QPSK (quadrature phase shift keying) and QAM are adaptively used to boost the data rate. The modulation mode can be changed even during the transmission of data frames. The resulting complex numbers are grouped into column vectors which have the same number of elements as the FFT size, $N$. For simplicity of presentation and ease of understanding, we use matrices and vectors to describe the mathematical model. Let $S(m)$ represent the $m$-th OFDM symbol in the frequency domain, i.e.,

$$S(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix}_{N \times 1},$$

where $m$ is the index of OFDM symbols. We assume that the complex-valued elements $\{S(mN), S(mN+1), \ldots, S(mN+N-1)\}$ of $S(m)$ are zero mean and uncorrelated random variables whose sample space is the signal constellation of the base modulation (BPSK, QPSK and QAM). To achieve the same average power for all different mappings, each element of $S(m)$ is multiplied by a normalization factor $K_{MOD}$ [37] such that the average power of the different mappings is normalized to unity. To obtain the time domain samples, as shown by the IDFT block in Figure 1.5, an IFFT (inverse fast Fourier transform) operation is represented by a matrix multiplication. Let $F_N$ be the $N$-point DFT matrix whose $(p, q)$-th element is $e^{-j\frac{2\pi}{N}(p-1)(q-1)}$. The resulting time domain samples $s(m)$ can be described by

$$s(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}_{N \times 1} = \frac{1}{N} F_N^H S(m). \tag{1.1}$$
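As a hedged illustration of equation (1.1), the sketch below builds the N-point DFT matrix from its element definition and applies its Hermitian transpose scaled by 1/N. The helper names are hypothetical, and the naive matrix product stands in for the FFT that a practical system would use.

```python
import cmath

def dft_matrix(n):
    """N-point DFT matrix: (p, q)-th element exp(-j*2*pi*(p-1)*(q-1)/N),
    with p and q running from 1 to N."""
    return [[cmath.exp(-2j * cmath.pi * (p - 1) * (q - 1) / n)
             for q in range(1, n + 1)] for p in range(1, n + 1)]

def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def ofdm_modulate(S):
    """Time-domain samples s = (1/N) * F_N^H * S, as in equation (1.1)."""
    n = len(S)
    return [v / n for v in matvec(hermitian(dft_matrix(n)), S)]

def ofdm_demodulate(s):
    """Frequency-domain symbols S = F_N * s (inverse of ofdm_modulate)."""
    return matvec(dft_matrix(len(s)), s)

if __name__ == "__main__":
    S = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # one QPSK symbol per subcarrier
    s = ofdm_modulate(S)
    recovered = ofdm_demodulate(s)
    print(max(abs(a - b) for a, b in zip(S, recovered)))  # rounding-level error
```

Applying the DFT matrix to the output recovers the original symbols, which reflects the reversible-pair property of the DFT and the IDFT.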

Compared to the costly and complicated modulation and multiplexing of conventional FDM systems, OFDM systems implement them easily by using the FFT in baseband processing. To combat the multipath delay spread in wireless channels, the time-domain samples $s(m)$ are cyclically extended by copying the last $N_g$ samples and pasting them to the front, as shown in Figure 1.6(a) [6].

Figure 1.6: (a) Concept of CP; (b) OFDM symbol with cyclic extension. Panel (a) marks the symbol length N and the CP length Ng; panel (b) marks the guard time (CP) and the FFT integration time.

Let $u(m)$ denote the cyclically extended OFDM symbol as

$$u(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix}_{N_{tot} \times 1} = \begin{bmatrix} \mathrm{CP} \\ s(m) \end{bmatrix},$$

where $N_{tot} = N + N_g$ is the length of $u(m)$. In matrix form, the CP insertion can be readily expressed as the product of an $N_{tot} \times N$ matrix $A_{CP}$ and $s(m)$. By direct computation, it holds that

$$u(m) = A_{CP}\, s(m), \tag{1.2}$$

where

$$A_{CP} = \begin{bmatrix} 0 & I_{N_g} \\ I_{N-N_g} & 0 \\ 0 & I_{N_g} \end{bmatrix}_{(N+N_g) \times N}.$$
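The CP insertion matrix can be sketched as follows. The function names are hypothetical, and integer samples are used only to make the copied positions easy to see. The top Ng rows select the last Ng samples, and the remaining N rows form an identity, which is the same row pattern as the block matrix in the text.

```python
def cp_matrix(n, ng):
    """(n + ng) x n CP insertion matrix: the first ng rows pick the last
    ng input samples, and the remaining n rows are the n x n identity."""
    rows = []
    for i in range(ng):
        rows.append([1 if j == n - ng + i else 0 for j in range(n)])
    for i in range(n):
        rows.append([1 if j == i else 0 for j in range(n)])
    return rows

def add_cp(s, ng):
    """Cyclically extended symbol u = A_CP * s, as in equation (1.2)."""
    a_cp = cp_matrix(len(s), ng)
    return [sum(a_ij * s_j for a_ij, s_j in zip(row, s)) for row in a_cp]

if __name__ == "__main__":
    s = [10, 11, 12, 13, 14]                # stand-in time-domain samples
    print(add_cp(s, 2))                     # [13, 14, 10, 11, 12, 13, 14]
```

In practice the prefix is formed by copying samples directly rather than by a matrix product; the matrix form is convenient mainly for the analysis that follows.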

One of the challenges from the harsh wireless channels is the multipath delay spread. If the delay spread is relatively large compared to the symbol duration, then a delayed copy of a previous symbol will overlap the current one, which implies severe ISI. To eliminate the ISI almost completely, a CP is introduced for each OFDM symbol, and the length of the CP, $N_g$, must be chosen no shorter than the experienced delay spread, $L$, i.e., $N_g \ge L$. In addition, the CP is capable of maintaining the orthogonality among subcarriers, which implies zero ICI. This is because the OFDM symbol is cyclically extended, which ensures that the delayed replicas of the OFDM symbol always have an integer number of cycles within the FFT interval, as long as the delay is smaller than the CP. This is illustrated in Figure 1.6(b). No matter where the FFT window starts, provided that it is within the CP, there will always be one or two complete cycles within the FFT integration time for the top and bottom symbols, respectively. In the IEEE 802.11a standard [37], $N_g$ is at least 16. The obtained OFDM symbol (including the CP) $u(m)$, as shown in Figure 1.5, must be converted to the analog domain by a DAC (digital-to-analog converter) and then up-converted for RF transmission, since it is currently not practical to generate the OFDM symbol directly at RF rates. To remain in the discrete-time domain, the OFDM symbol could be up-sampled and added to a discrete carrier frequency. This carrier could be an IF (intermediate frequency) whose sample rate can be handled by current technology. It could then be converted to analog and increased to the final transmit frequency using analog frequency conversion methods. Alternatively, the OFDM modulation could be immediately converted to analog and directly increased to the desired RF transmit frequency. Either way has its advantages and disadvantages. Cost, power consumption and complexity must be taken into consideration when selecting a technique.

The RF signal is transmitted over the air. In this thesis, the wireless channel is assumed

to be a quasi-static frequency-selective Rayleigh fading channel [71]. This

means that the channel remains constant during the transmission of one OFDM

symbol. Suppose that the multipath channel can be modeled by a discrete-time

baseband equivalent (L−1)th-order FIR (finite impulse response) filter with filter taps

{h0 , h1 , . . . , hl , . . . , hL−1 }. It is further assumed that the channel impulse response, i.e.,

the equivalent FIR filter taps, are independent zero-mean complex Gaussian random

variables with variance $\tfrac{1}{2}P_l$ per dimension. The ensemble $\{P_0, \ldots, P_l, \ldots, P_{L-1}\}$

is the PDP (power delay profile) of the channel and usually the total power of the

PDP is normalized to be 1 as the unit average channel attenuation. Denote the CIR

(channel impulse response) vector hm as
\[
h_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix}_{L\times 1},
\]

where the subscript m is kept to imply that the channel may vary from one OFDM

symbol to the next one. Then the complex baseband equivalent received signal can

be represented by a discrete-time convolution as
\[
r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\, u(mN_{tot}+n-l) + v(mN_{tot}+n), \tag{1.3}
\]

where mNtot + n means the n-th received sample during the m-th OFDM symbol

and 0 ≤ n ≤ Ntot − 1. The term v(mNtot + n) represents the complex AWGN at

the (mNtot + n)-th time sample with zero mean and variance $\tfrac{1}{2}\sigma_v^2$ per dimension.

Hence, the expected SNR (signal-to-noise ratio) per received signal is $\rho = 1/\sigma_v^2$. To

enable parallel processing by the DFT block in Figure 1.5, we rewrite

equation (1.3) in matrix form. First we define
\[
r(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix}; \quad
v(m) = \begin{bmatrix} v(mN_{tot}) \\ \vdots \\ v(mN_{tot}+N_{tot}-1) \end{bmatrix}, \tag{1.4}
\]

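and
\[
h_{m,Toep} = \begin{bmatrix}
h_{0,m} & & & & \\
\vdots & \ddots & & & \\
h_{L-1,m} & \cdots & h_{0,m} & & \\
 & \ddots & & \ddots & \\
 & & h_{L-1,m} & \cdots & h_{0,m}
\end{bmatrix}; \quad
h^{(c)}_{m,Toep} = \begin{bmatrix}
0 & \cdots & 0 & h_{L-1,m} & \cdots & h_{1,m} \\
 & & & & \ddots & \vdots \\
 & & & & & h_{L-1,m} \\
 & & & & & 0 \\
 & & & & & \vdots \\
 & & & & & 0
\end{bmatrix}. \tag{1.5}
\]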

Then it is straight forward to have the following input-output relationship with regard

to the channel

\[
r(m) = h_{m,Toep}\, u(m) + h^{(c)}_{m,Toep}\, u(m-1) + v(m). \tag{1.6}
\]
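
The equivalence between the convolution (1.3) and the matrix form (1.6) can be checked numerically. The sketch below is illustrative only: the sizes, the exponentially decaying power delay profile and the helper name toeplitz_pair are assumptions, and noise is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
N, Ng, L = 8, 3, 3                 # illustrative sizes with Ng >= L (assumption)
Ntot = N + Ng

# Power delay profile normalized to unit total power (exponential shape is an assumption).
P = np.exp(-np.arange(L))
P = P / P.sum()
# Independent zero-mean complex Gaussian taps with variance P_l/2 per dimension.
h = np.sqrt(P / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))

def toeplitz_pair(h, Ntot):
    """Build h_Toep (lower triangular) and h_Toep^(c) (upper triangular) of (1.5)."""
    L = len(h)
    H = np.zeros((Ntot, Ntot), dtype=complex)
    Hc = np.zeros((Ntot, Ntot), dtype=complex)
    for n in range(Ntot):
        for l in range(L):
            if n - l >= 0:
                H[n, n - l] = h[l]           # tap l acts on the current symbol u(m)
            else:
                Hc[n, Ntot + n - l] = h[l]   # tap l reaches into the previous symbol u(m-1)
    return H, Hc

u_prev = rng.standard_normal(Ntot) + 1j * rng.standard_normal(Ntot)  # u(m-1)
u_cur = rng.standard_normal(Ntot) + 1j * rng.standard_normal(Ntot)   # u(m)

H, Hc = toeplitz_pair(h, Ntot)
r_matrix = H @ u_cur + Hc @ u_prev           # noise-free version of (1.6)

# Direct evaluation of the convolution (1.3) over the concatenated sample stream.
stream = np.concatenate([u_prev, u_cur])
r_conv = np.convolve(stream, h)[Ntot:2 * Ntot]
```

Only the first L − 1 rows of Hc are non-zero, which is why only the first L − 1 received samples of each symbol see the previous symbol.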

It is easy to see in (1.6) that the first L−1 terms of r(m), i.e., {r(mNtot ), . . . , r(mNtot + L − 2)},

will be affected by the ISI term $h^{(c)}_{m,Toep}\, u(m-1)$, since the upper-triangular Toeplitz

matrix $h^{(c)}_{m,Toep}$ has non-zero entries in its first L − 1 rows. In order to

remove the ISI term, we transform the Ntot × 1 vector r(m) into an N × 1 vector

y(m) by simply cutting off the first Ng possibly ISI affected elements. For complete

elimination of ISI, Ng ≥ L must be satisfied. It is a reverse operation of the cyclic

extension implemented at the transmitter side. Consistently, this transformation

can also be expressed as a matrix-vector product
\[
y(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = A_{DeCP}\, r(m), \tag{1.7}
\]
where
\[
A_{DeCP} = \begin{bmatrix} 0 & I_N \end{bmatrix}_{N\times N_{tot}}.
\]
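
A small numerical check of the ISI removal: applying A_DeCP to the ISI matrix of (1.5) gives the zero matrix, because the non-zero rows of that matrix are among the first Ng rows. The sizes and the placeholder tap values below are assumptions chosen for illustration.

```python
import numpy as np

N, Ng, L = 8, 3, 3                                   # illustrative sizes with Ng >= L
Ntot = N + Ng
A_decp = np.hstack([np.zeros((N, Ng)), np.eye(N)])   # A_DeCP = [0  I_N], size N x Ntot

# ISI matrix: upper-triangular Toeplitz, non-zero only in its first L-1 rows
h = np.arange(1, L + 1, dtype=float)                 # placeholder taps h_0..h_{L-1}
Hc = np.zeros((Ntot, Ntot))
for n in range(L - 1):
    for l in range(n + 1, L):
        Hc[n, Ntot + n - l] = h[l]

isi_after_decp = A_decp @ Hc   # contribution of u(m-1) that survives CP removal
```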

As shown in Figure 1.5, the ISI-free received signal y(m) is demodulated by FFT

and hence it is converted back to the frequency domain received signal Y (m). It is

described by
\[
Y(m) = \begin{bmatrix} Y(mN) \\ \vdots \\ Y(mN+N-1) \end{bmatrix} = F_N\, y(m). \tag{1.8}
\]
After obtaining the received signal Y (m), symbol detection can be implemented if the

channel state information is known or it can be estimated by some channel estimation

algorithms. The detected symbol will pass through a series of reverse operations to

retrieve the input binary information, corresponding to the encoding, interleaving

and mapping in the transmitter side. Following the signal ﬂow from the transmitted

signal S(m) to the receive signal Y (m), a simple relationship between them can be

expressed as

Y (m) = Hm,diag S(m) + V (m), (1.9)

where the diagonal matrix Hm,diag is
\[
H_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix}; \quad
H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\, e^{-j\frac{2\pi}{N}kl}, \quad 0 \le k \le N-1,
\]

and V (m) is the complex AWGN in the frequency domain. This simple transmitter-and-receiver

structure is well known in the literature [42, 46, 48, 49] and it is

an important reason for the wide application of OFDM systems. The transmitted

signal can be easily recovered by simply dividing by the channel frequency response at

the specific subcarrier. Hence it eliminates the need for a complicated equalizer at

the receive side. In this thesis, we do not jump directly to this known conclusion

for two reasons. First, following through the baseband block diagram in Figure 1.5,

we use a matrix form of presentation to describe all the input-output relationship

with respect to each block. This gives us a clear and thorough understanding of all

the signal processing within the OFDM system. It is a different view from that in the

literature, which can be summarized by the fact that the discrete Fourier transform

of a cyclic convolution (of IDFT(S(m)) and hm ) in the time domain leads to a product of

the frequency responses (S(m) and DFT(hm )) of the two convolved terms. Second,

this provides a base for our channel estimator design in the following chapter. Next,

the simple relation in (1.9) is shown by tracing the signal flow backwards from

Y (m) to S(m):
\[
\begin{aligned}
Y(m) &= F_N\, y(m) \\
&= F_N \left( A_{DeCP}\, r(m) \right) \\
&= F_N \left\{ A_{DeCP} \left[ h_{m,Toep}\, u(m) + h^{(c)}_{m,Toep}\, u(m-1) + v(m) \right] \right\} \\
&= F_N \left[ A_{DeCP}\, h_{m,Toep}\, u(m) + A_{DeCP}\, v(m) \right] \\
&= F_N \left[ A_{DeCP}\, h_{m,Toep}\, A_{CP}\, s(m) + A_{DeCP}\, v(m) \right] \\
&= F_N \left[ A_{DeCP}\, h_{m,Toep}\, A_{CP} \left( \tfrac{1}{N} \right) F_N^H S(m) + A_{DeCP}\, v(m) \right] \\
&= \tfrac{1}{N} F_N \left[ A_{DeCP}\, h_{m,Toep}\, A_{CP} \right] F_N^H S(m) + F_N \left( A_{DeCP}\, v(m) \right) \\
&= \tfrac{1}{N} \left[ F_N\, h_{m,Cir}\, F_N^H \right] S(m) + V(m),
\end{aligned} \tag{1.10}
\]
where $V(m) = F_N (A_{DeCP}\, v(m))$ and $h_{m,Cir} = A_{DeCP}\, h_{m,Toep}\, A_{CP}$ is an N × N circulant

matrix with some special properties. It is parameterized as
\[
h_{m,Cir} = \begin{bmatrix}
h_{0,m} & 0 & \cdots & \cdots & 0 & h_{L-1,m} & h_{L-2,m} & \cdots & h_{1,m} \\
h_{1,m} & h_{0,m} & 0 & \cdots & 0 & 0 & h_{L-1,m} & \cdots & h_{2,m} \\
\vdots & \vdots & \ddots & & \vdots & \vdots & & \ddots & \vdots \\
h_{L-2,m} & \cdots & \cdots & h_{0,m} & 0 & \cdots & \cdots & 0 & h_{L-1,m} \\
h_{L-1,m} & \cdots & \cdots & \cdots & h_{0,m} & 0 & \cdots & \cdots & 0 \\
0 & h_{L-1,m} & \cdots & \cdots & \cdots & h_{0,m} & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & & & & \ddots & \ddots & \vdots \\
0 & \cdots & \cdots & 0 & h_{L-1,m} & h_{L-2,m} & \cdots & \cdots & h_{0,m}
\end{bmatrix}_{N\times N}. \tag{1.11}
\]

As stated in [38], an N × N circulant matrix has some important properties:

• All N × N circulant matrices have the same eigenvectors, which are the

columns of $F_N^H$, where $F_N$ is the N -point FFT matrix;

• The corresponding eigenvalues {λ1 , · · · , λN } are the FFT of the first column of

the circulant matrix.

The first column of the circulant matrix hm,Cir is $[h_{0,m}, \ldots, h_{L-1,m}, 0, \ldots, 0]^T$. Hence,

the eigenvalues of hm,Cir are
\[
\begin{bmatrix} H_{0,m} \\ H_{1,m} \\ \vdots \\ H_{N-1,m} \end{bmatrix}
= F_N \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ 0_{(N-L)\times 1} \end{bmatrix}.
\]

Taking the eigenvalue decomposition of hm,Cir , we have
\[
h_{m,Cir} = \frac{1}{N} F_N^H \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix} F_N. \tag{1.12}
\]

Simply substituting (1.12) into (1.10) shows that (1.9) is true.
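
The chain from (1.10) to (1.12) can also be verified numerically. The following sketch is a hedged illustration, not part of the original derivation: the sizes, the random taps and the QPSK-like symbols are assumptions, noise is omitted, and F_N is taken as the unnormalized DFT matrix, matching the 1/N factors above.

```python
import numpy as np

rng = np.random.default_rng(1)
N, Ng, L = 16, 4, 4                                  # illustrative sizes with Ng >= L
h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2 * L)

# Circulant matrix h_Cir whose first column is [h_0, ..., h_{L-1}, 0, ..., 0]^T
c = np.concatenate([h, np.zeros(N - L)])
h_cir = np.column_stack([np.roll(c, k) for k in range(N)])

F = np.fft.fft(np.eye(N))              # unnormalized N-point DFT matrix F_N
H = np.fft.fft(c)                      # eigenvalues: DFT of the first column

# (1.12): h_Cir = (1/N) F^H diag(H) F
h_cir_from_eig = F.conj().T @ np.diag(H) @ F / N

# Noise-free end-to-end chain: IFFT, add CP, channel, remove CP, FFT
S = rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)   # QPSK-like symbols
s = F.conj().T @ S / N                                      # s = (1/N) F^H S
u = np.concatenate([s[-Ng:], s])                            # cyclic extension
r = np.convolve(u, h)[: N + Ng]                             # channel output for this symbol
Y = F @ r[Ng:]                                              # drop the CP, then FFT

S_hat = Y / H                                               # one-tap equalization per subcarrier
```

The last line is the per-subcarrier division mentioned above: with known H and no noise, it returns the transmitted symbols.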

The simple model in (1.9) is widely exploited for theoretical research. It is, however,

based on all of the assumptions made at the beginning of this section. In

practical OFDM systems, much research effort has been devoted to keeping the OFDM

system as close to this model as possible. Perfect synchronization in the time domain

and the frequency domain is the most challenging subject. The orthogonality could be

easily destroyed by a few factors such as the Doppler shift resulting from the relative

movement between the transmitter and the receiver, the frequency mismatch between

the oscillators at two ends, large timing errors and phase noise. Meanwhile, accurate

channel state information is critical for reducing the BER and improving the system

performance. Hence, joint channel estimation and synchronization with low complex-

ity is an active research area for current OFDM systems. As long as the orthogonality

is obtained, OFDM is a simple and efficient multicarrier data transmission technique.

1.2 Dissertation Contributions

In the first part, this dissertation addresses one of the most fundamental problems in

MIMO-OFDM communication system design, i.e., the fast and reliable channel esti-

mation. By using the pilot symbols, a MIMO-OFDM channel estimator is proposed

in this dissertation which is capable of estimating the time-dispersive and frequency-selective

fading channel. Our contributions in this dissertation are as follows.

• Great Simplicity:

For an Nt × Nr MIMO system (Nt : number of transmit antennas, Nr : number of receive

antennas), the complexity of any signal processing algorithm

at the physical layer usually increases by a factor of Nt Nr . Hence, simplicity

plays an important role in the system design. We propose a pilot tone design

for MIMO-OFDM channel estimation in which Nt disjoint sets of pilot tones are

placed on one OFDM block at each transmit antenna. Each pilot tone set

has L (L: channel length) pilot tones which are equally spaced and equally

powered. The pilot tones from different transmit antennas comprise a unitary

matrix, and a simple least squares estimation of the MIMO channel is then easily

implemented by taking advantage of the unitarity of the pilot tone matrix.

There is no need to compute the inverse of a large matrix, which is usually

required by the LS algorithm. In contrast to some other simplified channel estimation

methods, which assume that there are only a few dominant paths among the L paths

and neglect the remaining weaker paths in the channel, our method estimates

the full channel information with a reduced complexity.

• Estimation of Fast Time-varying Channel:

In a highly mobile environment, such as a mobile user in a vehicle traveling at more

than 100 km/h, the wireless channel may change within one or a small number

of symbols. But the information packet could contain hundreds of data symbols

or even more. In the literature [50] there are some preamble designs in which the

wireless channel is estimated only at the preamble part of a whole data packet

and is assumed to be constant during the transmission of the remaining data part.

In contrast to the preamble design, our scheme distributes

the pilot symbols in the preamble to each OFDM block for channel estimation.

Since the pilot tones are placed on each OFDM block, the channel state infor-

mation can be estimated accurately and quickly, no matter how fast the channel

condition is varying.

• Link to SFC (Space-frequency code):

Usually channel estimation and space-frequency code design of MIMO-OFDM

systems are treated as two independent subjects, especially for those algorithms

generalized from their counterparts in the SISO (single-input single-output)

case. Some researchers [48, 50] propose orthogonal structures for pilot

tone design and try to reduce the computational complexity. However, each

individual structure is isolated and it is not easy to generalize these structures to

a MIMO system with any number of transmit antennas and receive antennas.

In this dissertation, the orthogonal pilot tone matrix we propose is indeed a

space-frequency code. The row direction of the matrix stands for different pilot

tone sets in the frequency domain, and the column direction represents the

individual transmit antennas in spatial domain. And it can be readily extended

to an Nt × Nr MIMO system by constructing an Nt × Nt orthogonal matrix.

With this explicit relation to space-frequency codes, the design of the pilot-tone

matrix for MIMO-OFDM channel estimation can be conducted from a broader

perspective. The link sheds light on both problems.
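
The computational point made in the first contribution, that a unitary pilot matrix removes the matrix inversion from the LS estimate, can be illustrated generically. In the sketch below the matrix P is a scaled DFT matrix used only as a stand-in for a unitary pilot-tone matrix; it is an assumption for illustration and not the pilot design proposed in this dissertation. The size and noise level are also hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4                                             # problem size (illustrative)
P = np.fft.fft(np.eye(n)) / np.sqrt(n)            # unitary stand-in for the pilot-tone matrix
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)      # unknown channel vector
noise = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = P @ h + noise                                 # received pilot observations

# Textbook LS estimate: (P^H P)^{-1} P^H y, which requires solving a linear system
h_ls_general = np.linalg.solve(P.conj().T @ P, P.conj().T @ y)
# With P unitary, P^H P = I, so the LS estimate reduces to a single matrix product
h_ls_unitary = P.conj().T @ y
```

Both expressions give the same estimate; the second avoids the solve step, which is where the saving comes from when the matrix is large.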

In the second part of this dissertation, we contribute to the formulation of the lo-

cation estimation into a constrained LS-type optimization problem. As surveyed in

[53], there are different methods for location estimation based on measurements of

TOA, TDOA, AOA and amplitude. There are two problems which are not given full

attention and may increase the complexity of the algorithm. One problem is that only

an intermediate solution can be first obtained by solving the LS estimation problem.

It means that the intermediate solution is still a function of the unknown target loca-

tion. Extra constraints are needed to get the final target estimation. Though such a

constraint exists, solving the quadratic equation may end up with nonexistence of a

real positive root. Another problem is that it is unclear how the measurement noise

variance affects the estimation accuracy. Intuitively, a small variance is always preferred.

In our proposed algorithm, the constrained LS-type optimization problem is

solved by using the Lagrange multiplier method. It is also pointed out that the noise variance is

closely related to the equivalent SNR. For example, in the case of TDOA, the equiva-

lent SNR is the ratio of the time for a signal traveling from the target to the k-th base

station over the noise variance. A smaller noise variance then indicates a higher SNR

which leads to more accurate location estimation. The formulation of a constrained

LS-type optimization has its advantages. First it holds a performance which is close

to the ML algorithm, provided that the assumption about the measurement noise

variance is satisfied. Second it inherits the simplicity from the LS algorithm.

1.3 Organization of the Dissertation

This dissertation is organized as follows. In Chapter 1, the principle of OFDM is

illustrated through instructive figures and the signal model of OFDM systems is

described in detail by matrix representation. A review of research on channel

estimation for OFDM systems is also covered in Chapter 1. Chapter 2

focuses on the pilot-tone based channel estimation of MIMO-OFDM systems. It

concludes with intensive computer simulations of different estimation algorithms and of the

effects of some key OFDM parameters on estimator performance. Chapter 3 is devoted

to wireless location on WiMax networks. A constrained LS-type optimization problem

is formulated under a mild assumption and it is solved by using the Lagrange multiplier

method. Finally this dissertation is summarized in Chapter 5 by suggesting some

open research subjects on the way.

Chapter 2

MIMO-OFDM Channel Estimation

2.1 Introduction

With the ever increasing number of wireless subscribers and their seemingly “greedy”

demands for high-data-rate services, radio spectrum becomes an extremely rare and

invaluable resource for all the countries in the world. Eﬃcient use of radio spectrum

requires that modulated carriers be placed as close as possible without causing any

ICI and be capable of carrying as many bits as possible. Optimally, the bandwidth of

each carrier would be adjacent to its neighbors, so there would be no wasted bands.

In practice, a guard band must be placed between neighboring carriers to provide

a guard space where a shaping ﬁlter can attenuate a neighboring carrier’s signal.

These guard bands are waste of spectrum. In order to transmit high-rate data, short

symbol periods must be used. The symbol period Tsym is the inverse of the baseband

data rate R (R = 1/Tsym ), so as R increases, Tsym must decrease. In a multipath

environment, however, a shorter symbol period leads to an increased degree of ISI,

and thus performance loss. OFDM addresses both of the two problems with its

unique modulation and multiplexing technique. OFDM divides the high-rate stream

into parallel lower rate data and hence prolongs the symbol duration, thus helping

to eliminate ISI. It also allows the bandwidth of subcarriers to overlap without ICI

as long as the modulated carriers are orthogonal. OFDM is therefore considered

a good candidate modulation technique for broadband access in very dispersive

environments [42, 43].

However, relying solely on OFDM technology to improve the spectral eﬃciency

gives us only a partial solution. At the end of the 1990s, seminal work by Foschini and

Gans [21] and, independently, by Telatar [22] showed that there is another alternative

to accomplish high data rates over wireless channels: the use of multiple antennas

at both ends of the wireless link, often referred to as MA (multiple antenna) or

MIMO in the literature [21, 22, 17, 16, 25, 26]. The MIMO technique does not require

any bandwidth expansions or any extra transmission power. Therefore, it provides a

promising means to increase the spectral eﬃciency of a system. In his paper about

the capacity of multi-antenna Gaussian channels [22], Telatar showed that given a

wireless system employing Nt TX (transmit) antennas and Nr RX (receive) anten-

nas, the maximum data rate at which error-free transmission over a fading channel

is theoretically possible is proportional to the minimum of Nt and Nr (provided that

the Nt Nr transmission paths between the TX and RX antennas are statistically in-

dependent). Hence huge throughput gains may be achieved by adopting Nt × Nr

MIMO systems compared to conventional 1 × 1 systems that use single antenna at

both ends of the link with the same requirement of power and bandwidth. With

multiple antennas, a new domain, namely the spatial domain, is explored, as opposed

to the existing systems in which the time and frequency domains are utilized.

Now let's come back to the previous question: what can be done in order to enhance

the data rate of a wireless communication system? The combination of MIMO

systems with OFDM technology provides a promising candidate for next generation

ﬁxed and mobile wireless systems [42]. In practice for coherent detection, however,

accurate channel state information in terms of channel impulse response (CIR) or

channel frequency response (CFR) is critical to guarantee the diversity gains and the

projected increase in data rate.

The channel state information can be obtained through two types of methods.

One is called blind channel estimation [44, 45, 46], which explores the statistical in-

formation of the channel and certain properties of the transmitted signals. The other

is called training-based channel estimation, which is based on the training data sent

at the transmitter and known a priori at the receiver. Though the former has its

advantage in that it has no overhead loss, it is only applicable to slowly time-varying

channels due to its need for a long data record. Our work in this thesis focuses on

the training-based channel estimation method, since we aim at mobile wireless ap-

plications where the channels are fast time-varying. The conventional training-based

method [47, 48, 50] is used to estimate the channel by sending ﬁrst a sequence of

OFDM symbols, so-called preamble which is composed of known training symbols.

Then the channel state information is estimated based on the received signals cor-

responding to the known training OFDM symbols prior to any data transmission in

a packet. The channel is hence assumed to be constant before the next sequence of

training OFDM symbols. A drastic performance degradation then arises if applied to

fast time-varying channels. In [49], optimal pilot-tone selection and placement were

presented to aid channel estimation of single-input/single-output (SISO) systems. To

use a set of pilot-tones within each OFDM block, not a sequence of training blocks

ahead of a data packet to estimate the time-varying channel is the idea behind our

work. However direct generalization of the channel estimation algorithm in [49] to

MIMO-OFDM systems involves the inversion of a high-dimension matrix [47] due to

the increased number of transmit and receive antennas, and thus entails high complex-

ity and makes it infeasible for wireless communications over highly mobile channels.

This becomes a bottleneck for applications to broadband wireless communications.

To design a low-complexity channel estimator with comparable accuracy is the goal

of this chapter.

The bottleneck problem of complexity for channel estimation in MIMO-OFDM

systems has been studied by two different approaches. The first one shortens the

sequence of training symbols to the length of the MIMO channel, as described in [50],

leading to orthogonal structure for preamble design. Its drawback lies in the increase

of the overhead due to the extra training OFDM blocks. The second one is the simpli-

fied channel estimation algorithm, as proposed in [48], that achieves optimum channel

estimation and also avoids the matrix inversion. However its construction of the pilot-

tones is not explicit in terms of space-time codes (STC). We are motivated by both

approaches in searching for new pilot-tone design. Our contribution in this chapter

is the unification of the known results of [48, 50] in that the simplified channel esti-

mation algorithm is generalized to explicit orthogonal space-frequency codes (SFC)

that inherit the same computational advantage as in [48, 50], while eliminating their

respective drawbacks. In addition, the drastic performance degradation that occurs in

[48, 50] is avoided by our pilot-tone design since the channel is estimated at each block.

In fact we have formulated the channel estimation problem in frequency domain, and

the CFR is parameterized by the pilot-tones in a convenient form for design of SFC.

As a result a unitary matrix, composed of pilot-tones from each transmit antenna,

can be readily constructed. It is interesting to observe that the LS algorithm based

on SFC in this paper is parallel to that for conventional OFDM systems with single

transmit/receive antenna. The use of multiple transmit/receive antennas offers more

design freedom that provides further improvements on estimation performance.

2.2 System Description

The block diagram of a MIMO-OFDM system [27, 28] is shown in Figure 2.1. Ba-

sically, the MIMO-OFDM transmitter has Nt parallel transmission paths which are

very similar to the single antenna OFDM system, each branch performing serial-to-

parallel conversion, pilot insertion, N -point IFFT and cyclic extension before the

final TX signals are up-converted to RF and transmitted. It is worth noting that

the channel encoder and the digital modulation, in some spatial multiplexing systems

[28, 29], can also be done per branch, not necessarily implemented jointly over all the

Nt branches. The receiver first must estimate and correct the possible symbol timing

error and frequency offsets, e.g., by using some training symbols in the preamble as

standardized in [37]. Subsequently, the CP is removed and N -point FFT is performed

per receiver branch. In this thesis, the channel estimation algorithm we proposed is

based on single carrier processing that implies MIMO detection has to be done per

OFDM subcarrier. Therefore, the received signals of subcarrier k are routed to the k-

th MIMO detector to recover all the Nt data signals transmitted on that subcarrier.

Next, the transmitted symbol per TX antenna is combined and outputted for the

subsequent operations like digital demodulation and decoding. Finally all the input

binary data are recovered with certain BER.

As a MIMO signalling technique, Nt different signals are transmitted simultane-

ously over Nt × Nr transmission paths and each of those Nr received signals is a

combination of all the Nt transmitted signals and the distorting noise. It brings in

the diversity gain for enhanced system capacity as we desire. Meanwhile, compared

to the SISO system, it complicates the system design with regard to channel estimation

and symbol detection due to the hugely increased number of channel coefficients.

2.2.1 Signal Model

Figure 2.1: Nt × Nr MIMO-OFDM System model. (Block diagram. Transmitter: data source, channel encoder, digital modulator and MIMO encoder, followed by S/P, IFFT, CP and P/S blocks on each of the Nt branches. Receiver: timing and frequency synchronization, De-CP, S/P, FFT and P/S blocks on each of the Nr branches, channel estimation, MIMO decoder, digital demodulator, channel decoder and data sink.)

To find the signal model of MIMO-OFDM system, we can follow the same approach

as utilized in the SISO case. Because of the increased number of antennas, the signal

dimension is changed. For instance, the transmitted signal on the k-th subcarrier in

a MIMO system is an Nt × 1 vector, instead of a scalar in the SISO case. For brevity

of presentation, the same notations are used for both the SISO and MIMO cases. But

they are explicitly deﬁned in each case. There are Nt transmit antennas and hence

on each of the N subcarriers, Nt modulated signals are transmitted simultaneously.

Denote S(m) and S(mN + k) as the m-th modulated OFDM symbol in frequency

domain and the k-th modulated subcarrier, respectively, as
\[
S(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix}, \quad
S(mN+k) = \begin{bmatrix} S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k) \end{bmatrix}, \tag{2.1}
\]

where Sj (mN + k) represents the k-th modulated subcarrier for the m-th OFDM

symbol transmitted by the j-th antenna. And it is normalized by a normalization

factor KMOD so that there is a unit normalized average power for all the mappings.

Taking IFFT of S(m) as a baseband modulation, the resulting time-domain samples

can be expressed as
\[
s(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}
= \frac{1}{N} \left( F_N^H \otimes I_{N_t} \right) S(m), \quad
s(mN+n) = \begin{bmatrix} s_1(mN+n) \\ \vdots \\ s_{N_t}(mN+n) \end{bmatrix}. \tag{2.2}
\]

Here the IFFT is a block-wise operation since each modulated subcarrier is a column

vector, and the generalized N Nt -point IFFT matrix is a Kronecker product of $F_N^H$ and

INt . This is just a mathematical expression. In real OFDM systems, however,

the generalized IFFT operation is still performed by Nt parallel N -point IFFTs. To

eliminate the ISI and the ICI, a length-Ng (Ng ≥ L) CP is prepended to the time-domain

samples per branch. The resulting OFDM symbol u(m) is denoted as
\[
u(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix}, \quad
u(mN_{tot}+n) = \begin{bmatrix} u_1(mN_{tot}+n) \\ \vdots \\ u_{N_t}(mN_{tot}+n) \end{bmatrix}. \tag{2.3}
\]

In matrix form, there holds
\[
u(m) = A_{CP}\, s(m), \tag{2.4}
\]
where
\[
A_{CP} = \begin{bmatrix} 0 & I_{N_g} \\ I_{N-N_g} & 0 \\ 0 & I_{N_g} \end{bmatrix} \otimes I_{N_t}.
\]
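
To make the Kronecker notation in (2.2) and (2.4) concrete, the sketch below checks that the block-wise IFFT and CP insertion give the same samples as Nt independent per-antenna IFFTs, each followed by its own cyclic prefix. The sizes and the random symbols are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, Ng, Nt = 8, 2, 2                                  # illustrative sizes

# S_ant[j, k]: symbol on subcarrier k of TX antenna j (random placeholders)
S_ant = rng.standard_normal((Nt, N)) + 1j * rng.standard_normal((Nt, N))
S = S_ant.T.reshape(-1)            # stack subcarrier by subcarrier: [S(0); ...; S(N-1)]

F = np.fft.fft(np.eye(N))                            # unnormalized DFT matrix F_N
s = np.kron(F.conj().T, np.eye(Nt)) @ S / N          # block-wise IFFT, as in (2.2)

A_cp_siso = np.vstack([np.hstack([np.zeros((Ng, N - Ng)), np.eye(Ng)]), np.eye(N)])
u = np.kron(A_cp_siso, np.eye(Nt)) @ s               # block-wise CP insertion, as in (2.4)

# Reference: Nt parallel N-point IFFTs, each followed by its own cyclic prefix
s_ant = np.fft.ifft(S_ant, axis=1)
u_ant = np.hstack([s_ant[:, -Ng:], s_ant])
```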
The time-domain samples denoted by u(m) may be directly converted to RF for

transmission or be up-converted to IF first and then transmitted over the wireless

MIMO channel. For the MIMO channel, we assume in this thesis that the MIMO-

OFDM system is operating in a frequency-selective Rayleigh fading environment and

that the communication channel remains constant during a frame transmission, i.e.,

quasi-static fading. Suppose that the channel impulse response can be recorded with

L time instances, i.e., time samples, then the multipath fading channel between the

j-th TX and i-th RX antenna can be modeled by a discrete-time complex base-

band equivalent (L − 1)-th order FIR filter with filter coefficients hij (l, m), with

l ∈ {0, . . . , L − 1} and integer m > 0. As assumed in the SISO case, these CIR coefficients

{hij (0, m), . . . , hij (L − 1, m)} are independent complex zero-mean Gaussian

RVs with variance $\tfrac{1}{2}P_l$ per dimension. The total power of the channel power delay

profile {P0 , . . . , PL−1 } is normalized to be $\sigma_c^2 = 1$. Let hm be the CIR matrix and

denote hl,m as the l-th matrix-valued CIR coefficient:
\[
h_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix}; \quad
h_{l,m} = \begin{bmatrix}
h_{11}(l,m) & \cdots & h_{1N_t}(l,m) \\
\vdots & \ddots & \vdots \\
h_{N_r 1}(l,m) & \cdots & h_{N_r N_t}(l,m)
\end{bmatrix}. \tag{2.5}
\]

In addition, we assume that those Nt Nr geographically co-located multipath channels

are independent in an environment full of scattering. From an information-theoretic point

of view [21, 22], this guarantees the capacity gain of MIMO systems. For the practical

MIMO-OFDM systems, it enforces a lower limit on the shortest distance between

multiple antennas at a portable receiver unit. If the correlation between those chan-

nels exists, the diversity gain from MIMO system will be reduced and hence system

performance is degraded.

At the receive side, an Nr -dimensional complex baseband equivalent receive signal

can be obtained by a matrix-based discrete-time convolution as

\[
r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\, u(mN_{tot}+n-l) + v(mN_{tot}+n), \tag{2.6}
\]
where
\[
r(mN_{tot}+n) = \begin{bmatrix} r_1(mN_{tot}+n) \\ \vdots \\ r_{N_r}(mN_{tot}+n) \end{bmatrix}, \quad
v(mN_{tot}+n) = \begin{bmatrix} v_1(mN_{tot}+n) \\ \vdots \\ v_{N_r}(mN_{tot}+n) \end{bmatrix}.
\]

Note that vi (mNtot + n) is assumed to be complex AWGN with zero mean and variance

$\tfrac{1}{2}\sigma_v^2$ per dimension. Therefore, the expected signal-to-noise ratio (SNR) per receive

antenna is $N_t/\sigma_v^2$. In order to have a fair comparison with SISO systems, the power

per TX antenna should be scaled down by a factor of Nt . By stacking the received

samples at discrete time instances, r(m) can be described by
\[
r(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix}. \tag{2.7}
\]

To combat the ISI, the first Ng Nr elements of r(m) must be removed completely. The

resulting ISI-free OFDM symbol y(m) is
\[
y(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = A_{DeCP}\, r(m), \tag{2.8}
\]
where
\[
A_{DeCP} = \begin{bmatrix} 0 & I_N \end{bmatrix} \otimes I_{N_r}.
\]

By exploiting the property that u(m) is a cyclic extension of s(m) so that cyclic

discrete-time convolution is valid, the relation between s(m) and y(m) can be ex-

pressed as

y(m) = hm,Cir s(m) + ADeCP v(m), (2.9)

where hm,Cir is an N Nr × N Nt block circulant matrix. In general, an N Nr × N Nt

block circulant matrix is fully defined by its first block column, an N Nr × Nt matrix. In our

case, hm,Cir is determined by
\[
\begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ 0_{(N-L)N_r \times N_t} \end{bmatrix}.
\]

Finally taking FFT on the y(m) at the receiver, we obtain the frequency domain

MIMO-OFDM baseband signal model

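\[
\begin{aligned}
Y(m) &= (F_N \otimes I_{N_r})\, y(m) \\
&= (F_N \otimes I_{N_r}) \left( h_{m,Cir}\, s(m) + A_{DeCP}\, v(m) \right) \\
&= \left( \tfrac{1}{N} \right) (F_N \otimes I_{N_r})\, h_{m,Cir}\, (F_N^H \otimes I_{N_t})\, S(m) + (F_N \otimes I_{N_r})\, A_{DeCP}\, v(m) \\
&= H_{m,diag}\, S(m) + V(m).
\end{aligned} \tag{2.10}
\]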

In the above expression, V (m) represents the frequency domain noise, which is i.i.d.

(independent and identically distributed) zero-mean complex Gaussian

with variance $\tfrac{1}{2}\sigma_v^2$ per dimension, and Hm,diag is a block diagonal matrix

which is given by
\[
H_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix}.
\]
The k-th block diagonal element is the frequency response of the MIMO channel at

the k-th subcarrier and can be shown to be $H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\, e^{-j\frac{2\pi}{N}kl}$. So for that

subcarrier, we may write it in a simpler form

Y (mN + k) = Hk,m S(mN + k) + V (mN + k), (2.11)

where
\[
H_{k,m} = \begin{bmatrix}
H_{11}(k,m) & \cdots & H_{1N_t}(k,m) \\
\vdots & \ddots & \vdots \\
H_{N_r 1}(k,m) & \cdots & H_{N_r N_t}(k,m)
\end{bmatrix}.
\]
This leads to a ﬂat-fading signal model per subcarrier and it is similar to the SISO

signal model, except that Hk,m is an Nr × Nt matrix.
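
The per-subcarrier model (2.11) can be checked with a short noise-free simulation: run the matrix-valued convolution (2.6) on a CP-extended block, drop the CP, take the FFT per receive antenna and compare with H_k S(k). The sizes, the random taps and the random symbols below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, Ng, L, Nt, Nr = 8, 3, 3, 2, 2                     # illustrative sizes with Ng >= L

# h[l] is the Nr x Nt matrix-valued tap h_{l,m}; S[k] is the Nt-vector on subcarrier k
h = (rng.standard_normal((L, Nr, Nt)) + 1j * rng.standard_normal((L, Nr, Nt))) / np.sqrt(2 * L)
S = rng.standard_normal((N, Nt)) + 1j * rng.standard_normal((N, Nt))

s = np.fft.ifft(S, axis=0)                     # per-antenna IFFT
u = np.vstack([s[-Ng:], s])                    # per-antenna cyclic prefix

# Matrix-valued convolution (2.6), noise-free, for a single OFDM symbol
r = np.zeros((N + Ng, Nr), dtype=complex)
for n in range(N + Ng):
    for l in range(L):
        if n - l >= 0:
            r[n] += h[l] @ u[n - l]

Y = np.fft.fft(r[Ng:], axis=0)                 # remove the CP, per-antenna FFT

# H[k] = sum_l h[l] exp(-j 2 pi k l / N): zero-padded FFT along the tap axis
H = np.fft.fft(h, n=N, axis=0)                 # shape (N, Nr, Nt)
Y_model = np.einsum('kij,kj->ki', H, S)        # Y(k) = H_k S(k) for every subcarrier
```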

2.2.2 Preliminary Analysis

Based on those assumptions such as perfect synchronization and block fading, we end

up with a compact and simple signal model for both the single antenna OFDM and

MIMO-OFDM systems. Surely it is an ideal model that says, considering first a noise

free scenario, the received signal on the k-th subcarrier is just a product (or matrix

product for MIMO case) of the transmitted signal on the k-th subcarrier and the

discrete-time channel frequency response at the k-th subcarrier. Noise in frequency

domain can also be modeled as an additive term. When it comes to channel estimation

for OFDM systems, this model is still valid since there is no ICI as we assume.

For channel estimation of MIMO-OFDM systems, it is appropriate to estimate

the channel in time domain rather than in frequency domain because there are fewer

parameters in the impulse response (Nt Nr L coefficients) than in the frequency re-

sponse (Nt Nr N coefficients). Given the limited number of training data that can be

sent to estimate the fast time-varying channel, limiting the number of parameters to

be estimated would increase the accuracy of the estimation. This is the thrust of the

estimation technique in this thesis. The estimation algorithm we propose is based on

pilot tones, namely known data in the frequency domain. Since the signal model of

OFDM in (2.11) is in the frequency domain too, it is necessary to find the relations

between the CFR and the CIR. Discrete-time Fourier transform is a perfect tool we

41

can use to describe the relation. It is shown as
 
hm
H m = F N Nr 


,
0(N −L)Nr ×Nt

where  
 H0,m 
 
.
Hm = 
 .
.

; FN Nr = FN ⊗ INr .
 
 
HN −1,m
Since the channel length L is less than the FFT size N, only the first $LN_r$ columns of the FFT matrix $F_{NN_r}$ are involved in the calculation. This gives another form of the relation,

\[
H_m = F_{NN_r}(:, 1:N_rL)\, h_m, \tag{2.12}
\]

where $F_{NN_r}(:, 1:N_rL)$ is the $NN_r \times N_rL$ submatrix of $F_{NN_r}$ consisting of its first $N_rL$ columns. $F_{NN_r}(:, 1:N_rL)$ is a 'tall' matrix with full column rank, so its left inverse exists and (2.12) is an overdetermined system. To determine $h_m$, we can simply multiply both sides of the equation by the left inverse of $F_{NN_r}(:, 1:N_rL)$. This requires full knowledge of the channel frequency response matrix $H_m$, which is not actually necessary: if we know L of the N matrices $\{H_{0,m}, \ldots, H_{N-1,m}\}$, then $h_m$ can be calculated. For example, in the SISO case, if we know the channel frequency response at any L distinct subcarriers $\{H_{k_1,m}, \ldots, H_{k_L,m}\}$, then the channel impulse response h(m) can be uniquely determined. This is the basis for pilot-tone based channel estimation of OFDM systems. Pilot-tones are the selected subcarriers over which the training data are sent. The question then arises as to which tones should be used as pilot-tones and how the selection of pilot-tones affects the quality of estimation.
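The SISO statement above can be checked numerically. The sketch below (Python with NumPy; the function names `tone_matrix` and `cir_from_tones`, the sizes and the tone indices are hypothetical choices made for illustration) recovers an L-tap CIR from the CFR at L arbitrary distinct subcarriers by solving an L × L linear system, and also reports the condition number of that system for arbitrary and for equally-spaced tones.

```python
import numpy as np

def tone_matrix(tones, N, L):
    """L x L matrix A with A[i, l] = exp(-j 2 pi k_i l / N) for the chosen subcarriers k_i."""
    k = np.asarray(tones).reshape(-1, 1)
    l = np.arange(L).reshape(1, -1)
    return np.exp(-2j * np.pi * k * l / N)

def cir_from_tones(H_tones, tones, N, L):
    """Solve H[k_i] = sum_l h[l] exp(-j 2 pi k_i l / N) for the L taps h[l]."""
    return np.linalg.solve(tone_matrix(tones, N, L), H_tones)

rng = np.random.default_rng(1)
N, L = 64, 4                      # assumed FFT size and channel length
M = N // L
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)
H = np.fft.fft(h, n=N)            # full CFR, used here only to read off the probed tones

tones = [3, 10, 11, 40]           # any L distinct subcarriers (illustrative choice)
h_rec = cir_from_tones(H[tones], tones, N, L)
print(np.allclose(h_rec, h))

# Conditioning of the system: arbitrary tones versus the equally-spaced set p, p+M, ...
p = 1
equally_spaced = p + M * np.arange(L)
print(np.linalg.cond(tone_matrix(tones, N, L)))
print(np.linalg.cond(tone_matrix(equally_spaced, N, L)))
```

For equally-spaced tones the system matrix is a DFT matrix times a diagonal phase matrix, so its condition number is 1; this is one way to see why equally-spaced pilots avoid noise enhancement.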


Cioffi's paper [49] first addressed this issue and showed that one should choose sets of equally-spaced tones as pilot tones, to avoid the noise enhancement effect in interpolating the channel impulse response from the frequency response. Assume that N = ML with an integer M > 1. This is a realistic assumption since the OFDM block size N is often chosen to be 128, 256 or an even larger value, and the channel length of a MIMO-OFDM channel is usually not greater than 30. For the typical urban (TU) delay profile [47] with RMS delay $\tau_{rms} = 1.06\,\mu s$, the channel length is $L = \tau_{rms} \times 20\,\mathrm{MHz} + 1 \approx 23$ in an 802.11a system with a bandwidth of 20 MHz. In systems like DVB-T and WiMax [40, 41], N is a much bigger integer. Since N = ML, there could be M

equally-sized pilot-tone sets. Define

\[
H_m^{(p)} =
\begin{bmatrix}
H_{p,m} \\
\vdots \\
H_{p+(L-1)M,m}
\end{bmatrix},
\qquad
W_N^{(p)} =
\begin{bmatrix}
1 & & & \\
& W_N^{p} & & \\
& & \ddots & \\
& & & W_N^{p(L-1)}
\end{bmatrix}
\otimes I_{N_r},
\tag{2.13}
\]

where p is any integer such that $0 \le p \le M-1$ and $W_N = e^{-j\frac{2\pi}{N}}$. Clearly $H_m^{(p)}$ is the p-th down-sampled version of $H_m$, and $W_N^{(p)}$ simply acts as a shift operator of order p. The CFR matrix $H_m$ can be decomposed into M disjoint down-sampled submatrices $\{H_m^{(p)}\}_{p=0}^{M-1}$, each composed of L equally-spaced CFR sample matrices. It can be verified via straightforward calculation that

\[
H_m^{(p)} = F_{LN_r}\, W_N^{(p)}\, h_m, \qquad p = 0, 1, \cdots, M-1, \tag{2.14}
\]

where $F_{LN_r} = F_L \otimes I_{N_r}$ is an $LN_r \times LN_r$ DFT matrix. This indicates that the channel state information represented by $h_m$ can be obtained from a down-sampled version of $H_m$, i.e., $H_m^{(p)}$, which only requires us to probe the unknown channel frequency response with training data on the selected p-th pilot-tone set. The procedure of pilot-tone based channel estimation is illustrated in Figure 2.2.
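The "straightforward calculation" behind (2.14) can also be checked numerically. The following sketch (Python with NumPy; the helper names and the sizes N = 32, L = 4, Nt = Nr = 2 are assumptions made for illustration) builds the down-sampled CFR directly from an FFT of the CIR and compares it with $F_{LN_r} W_N^{(p)} h_m$ for every p.

```python
import numpy as np

rng = np.random.default_rng(2)
N, L, Nt, Nr = 32, 4, 2, 2        # assumed sizes with N = M * L
M = N // L

# CIR taps h_{l,m} (Nr x Nt each), stacked into h_m of size (L*Nr) x Nt.
h_taps = rng.standard_normal((L, Nr, Nt)) + 1j * rng.standard_normal((L, Nr, Nt))
h_m = h_taps.reshape(L * Nr, Nt)

# Full CFR: H_{k,m} for k = 0..N-1.
H_all = np.fft.fft(h_taps, n=N, axis=0)

# F_{L Nr} = F_L (x) I_{Nr}, with F_L the unnormalized L-point DFT matrix.
F_L = np.exp(-2j * np.pi * np.outer(np.arange(L), np.arange(L)) / L)
F_LNr = np.kron(F_L, np.eye(Nr))

def downsampled_cfr(p):
    """H_m^(p): the CFR matrices at subcarriers p, p+M, ..., p+(L-1)M, stacked."""
    return H_all[p::M].reshape(L * Nr, Nt)

def shift_operator(p):
    """W_N^(p) = diag(1, W_N^p, ..., W_N^{p(L-1)}) (x) I_{Nr}, with W_N = exp(-j 2 pi / N)."""
    w = np.exp(-2j * np.pi * p * np.arange(L) / N)
    return np.kron(np.diag(w), np.eye(Nr))

ok = all(
    np.allclose(downsampled_cfr(p), F_LNr @ shift_operator(p) @ h_m)
    for p in range(M)
)
print(ok)
```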

[Figure 2.2: The concept of pilot-based channel estimation. Block diagram labeled with S^(P)(m), Y^(P)(m), V^(P)(m) and h_C(m).]

It is also true that

\[
H_m^{(p)}(:, i) = F_{LN_r}\, W_N^{(p)}\, h_m(:, i), \tag{2.15}
\]

where $H_m^{(p)}(:, i)$ and $h_m(:, i)$ are the i-th columns of $H_m^{(p)}$ and $h_m$ respectively and $1 \le i \le N_t$. Having discussed the relation between the CIR $h_m$ and the p-th down-sampled CFR $H_m^{(p)}$, we return to the input-output relationship of the MIMO-OFDM system

\[
Y(mN+k) = H_{k,m}\, S(mN+k) + V(mN+k), \tag{2.16}
\]

where

\[
Y(mN+k) =
\begin{bmatrix}
Y_1(mN+k) \\ \vdots \\ Y_{N_r}(mN+k)
\end{bmatrix};
\quad
S(mN+k) =
\begin{bmatrix}
S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k)
\end{bmatrix};
\quad
V(mN+k) =
\begin{bmatrix}
V_1(mN+k) \\ \vdots \\ V_{N_r}(mN+k)
\end{bmatrix}
\]


are the received signal, the transmitted signal and the noise term respectively, as defined in the previous section. They are repeated here for convenience. To obtain a form that is useful for channel estimation based on pilot-tones, we manipulate the expression in (2.16) so that the transmitted signal and the CFR terms exchange their positions in the product. (2.16) can be equivalently rewritten as

\[
Y(mN+k) = S_1(mN+k)\, H_{k,m}(:, 1) + \cdots + S_{N_t}(mN+k)\, H_{k,m}(:, N_t) + V(mN+k). \tag{2.17}
\]

In essence, we transform the product of a matrix and a vector into a sum of products of a scalar and a vector. The noise term remains unchanged. This transformation applies to the k-th subcarrier. To consider all N subcarriers, we need to stack the $\{Y(mN+k)\}$'s and the $\{H_{k,m}(:, i)\}$'s and construct a block diagonal matrix from the $\{S_i(mN+k)\}$'s. It can be shown that

\[
Y(m) = S_{diag,1}(m)\, H_m(:, 1) + \cdots + S_{diag,N_t}(m)\, H_m(:, N_t) + V(m), \tag{2.18}
\]

where Y(m) and V(m) are the received signal and the noise term respectively, given by

\[
Y(m) =
\begin{bmatrix}
Y(mN) \\ \vdots \\ Y(mN+N-1)
\end{bmatrix};
\quad
V(m) =
\begin{bmatrix}
V(mN) \\ \vdots \\ V(mN+N-1)
\end{bmatrix};
\quad
H_m(:, i) =
\begin{bmatrix}
H_{0,m}(:, i) \\ \vdots \\ H_{N-1,m}(:, i)
\end{bmatrix},
\]

and

\[
S_{diag,i}(m) =
\begin{bmatrix}
S_i(mN) & & \\
& \ddots & \\
& & S_i(mN+N-1)
\end{bmatrix}
\otimes I_{N_r};
\qquad 1 \le i \le N_t.
\]

The dimensions of these column vectors and matrices are very large; for instance, Y(m) is an $NN_r \times 1$ column vector. The computational load, however, is unchanged compared to the expression in (2.10), since $S_{diag,i}(m)$ is a block diagonal matrix.

As proved in [49], pilot-tones should be equally-powered and equally-spaced to achieve the minimum mean squared error (MMSE) of channel estimation. Let $\{S_i(mN+p), S_i(mN+M+p), \cdots, S_i(mN+(L-1)M+p)\}$ represent a set of L pilot-tones with index p, which are transmitted simultaneously with the other N − L data signals at the m-th block from the i-th antenna. One pilot-tone is thus placed every M subcarriers in one OFDM block. Hence we can obtain a down-sampled version of equation (2.18) by selecting one element every M subcarriers. Since we assume that there is no ICI, we can neglect the data symbols that are transmitted together with the pilot symbols. We only consider the p-th set of pilot-tones on the p-th, the (p + M)-th, ..., and the (p + (L − 1)M)-th subcarriers, together with the corresponding received signals. This yields

\[
Y^{(p)}(m) = S^{(p)}_{diag,1}(m)\, H_m^{(p)}(:, 1) + \cdots + S^{(p)}_{diag,N_t}(m)\, H_m^{(p)}(:, N_t) + V^{(p)}(m), \tag{2.19}
\]

where

\[
Y^{(p)}(m) =
\begin{bmatrix}
Y(mN+p) \\ \vdots \\ Y(mN+(L-1)M+p)
\end{bmatrix};
\quad
V^{(p)}(m) =
\begin{bmatrix}
V(mN+p) \\ \vdots \\ V(mN+(L-1)M+p)
\end{bmatrix};
\quad
H_m^{(p)}(:, i) =
\begin{bmatrix}
H_{p,m}(:, i) \\ \vdots \\ H_{(L-1)M+p,m}(:, i)
\end{bmatrix},
\]

and

\[
S^{(p)}_{diag,i}(m) =
\begin{bmatrix}
S_i(mN+p) & & \\
& \ddots & \\
& & S_i(mN+(L-1)M+p)
\end{bmatrix}
\otimes I_{N_r};
\qquad 1 \le i \le N_t
\]

are all the p-th down-sampled versions. Equation (2.19) gives the relation between $Y^{(p)}(m)$ and $H_m^{(p)}(:, i)$. To estimate the channel in the time domain, we need to relate $Y^{(p)}(m)$ to $h_m$ explicitly. Plugging (2.15) into (2.19) yields

\[
Y^{(p)}(m) = S^{(p)}_{diag,1}(m)\, F_{LN_r} W_N^{(p)}\, h_m(:, 1) + \cdots + S^{(p)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p)}\, h_m(:, N_t) + V^{(p)}(m). \tag{2.20}
\]

One set of pilot-tones is not adequate to estimate the unknowns $\{h_m(:, 1), \cdots, h_m(:, N_t)\}$, since (2.20) provides only $LN_r$ equations for $N_tLN_r$ unknowns. This differs from the SISO case, in which any one of the M pilot-tone sets can be used to estimate the channel. For MIMO-OFDM channel estimation we need at least $N_t$ disjoint sets of pilot-tones, indexed by $\{p_1, p_2, \ldots, p_{N_t}\}$. Since N = ML is assumed, there are M = N/L different sets in total. This imposes a constraint on the selection of the FFT size N for MIMO systems, namely $N \ge N_tL$. This observation tallies with the result in [48]. In practice, the selection of N determines the number of subcarriers used in the system. For systems like WLAN and WiMax [39, 40], N is not very large because a larger N means narrower subcarrier spacing, which may cause severe ICI. Furthermore, those systems often operate in low-SNR environments.
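The need for at least $N_t$ pilot-tone sets can be illustrated by comparing the rank of the stacked observation matrix with the number of unknowns. The sketch below (Python with NumPy) is only an illustration under stated assumptions: the function name `observation_matrix`, the sizes, and the constant pilot values 1 and −1 within each set are hypothetical choices, not the design used in this work.

```python
import numpy as np

def observation_matrix(alpha, pilot_sets, N, L, Nr):
    """Stack the blocks alpha[i][j] * F_{L Nr} W_N^(p_i) of (2.20) for the given pilot sets.

    alpha[i][j] is an assumed constant pilot symbol sent on set pilot_sets[i] by antenna j.
    """
    F_L = np.exp(-2j * np.pi * np.outer(np.arange(L), np.arange(L)) / L)
    F_LNr = np.kron(F_L, np.eye(Nr))
    block_rows = []
    for i, p in enumerate(pilot_sets):
        shift = np.exp(-2j * np.pi * p * np.arange(L) / N)
        W_p = np.kron(np.diag(shift), np.eye(Nr))
        block_rows.append(np.hstack([a * (F_LNr @ W_p) for a in alpha[i]]))
    return np.vstack(block_rows)

N, L, Nt, Nr = 32, 4, 2, 2                      # assumed sizes, N >= Nt * L
alpha = np.array([[1.0, 1.0], [1.0, -1.0]])     # illustrative pilot values

one_set = observation_matrix(alpha[:1], [0], N, L, Nr)    # a single pilot-tone set
two_sets = observation_matrix(alpha, [0, 1], N, L, Nr)    # Nt disjoint sets

unknowns = Nt * Nr * L
rank_one = np.linalg.matrix_rank(one_set)
rank_two = np.linalg.matrix_rank(two_sets)
print(unknowns, rank_one, rank_two)
```

With a single set the matrix has only $LN_r$ rows, so its rank cannot reach the $N_tN_rL$ unknowns; with $N_t$ sets and suitably chosen pilots the stacked matrix becomes square and invertible.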

2.3 Channel Estimation and Pilot-tone Design
2.3.1 LS Channel Estimation

Assume that we have Nt disjoint sets of pilot-tones. Then we have the following

observation equations.
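\[
\begin{aligned}
Y^{(p_1)}(m) &= S^{(p_1)}_{diag,1}(m)\, F_{LN_r} W_N^{(p_1)}\, h_m(:, 1) + \cdots + S^{(p_1)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_1)}\, h_m(:, N_t) + V^{(p_1)}(m) \\
&\;\;\vdots \\
Y^{(p_{N_t})}(m) &= S^{(p_{N_t})}_{diag,1}(m)\, F_{LN_r} W_N^{(p_{N_t})}\, h_m(:, 1) + \cdots + S^{(p_{N_t})}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_{N_t})}\, h_m(:, N_t) + V^{(p_{N_t})}(m)
\end{aligned}
\tag{2.21}
\]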

To apply the least squares (LS) method for channel estimation, we put these observation equations into matrix form. LS is a well-known and widely used estimation method; we choose it over alternatives such as MMSE channel estimation for its simplicity of implementation. In matrix form, the observation equations become

\[
Y^{(P)}(m) = S^{(P)}(m)\, h_C(m) + V^{(P)}(m), \tag{2.22}
\]

where

\[
Y^{(P)}(m) =
\begin{bmatrix}
Y^{(p_1)}(m) \\ \vdots \\ Y^{(p_{N_t})}(m)
\end{bmatrix};
\quad
h_C(m) =
\begin{bmatrix}
h_m(:, 1) \\ \vdots \\ h_m(:, N_t)
\end{bmatrix};
\quad
V^{(P)}(m) =
\begin{bmatrix}
V^{(p_1)}(m) \\ \vdots \\ V^{(p_{N_t})}(m)
\end{bmatrix},
\]

and

\[
S^{(P)}(m) =
\begin{bmatrix}
S^{(p_1)}_{diag,1}(m)\, F_{LN_r} W_N^{(p_1)} & \cdots & S^{(p_1)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
S^{(p_{N_t})}_{diag,1}(m)\, F_{LN_r} W_N^{(p_{N_t})} & \cdots & S^{(p_{N_t})}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_{N_t})}
\end{bmatrix}.
\]

In the above expression, $S^{(P)}(m)$ is an $N_tN_rL \times N_tN_rL$ square matrix, composed of $N_t^2$ pilot-tone block matrices $\{S^{(p_i)}_{diag,j}(m)\}_{i,j=1}^{N_t}$. At each transmit antenna, $N_t$ sets of pilot-tones are transmitted with the same indices $\{p_1, p_2, \cdots, p_{N_t}\}$. Assume that $N_t \le M = \frac{N}{L}$. It can also be seen that the total number of unknown CIR parameters $N_tN_rL$ cannot be greater than the total number of received signals $NN_r$, i.e., $N_tN_rL \le NN_r \Leftrightarrow N_tL \le N \Leftrightarrow N_t \le \frac{N}{L}$.

The standard solution for the LS channel estimate [50] is

\[
\hat{h}_{C,LS}(m) = \big[(S^{(P)}(m))^H S^{(P)}(m)\big]^{-1} (S^{(P)}(m))^H\, Y^{(P)}(m). \tag{2.23}
\]

The matrix $S^{(P)}(m)$ is very large, with $N_t^2N_r^2L^2$ elements, and computing the inverse of such a large matrix is undesirable. An intuitive solution is therefore to design the square matrix $S^{(P)}(m)$ such that $(S^{(P)}(m))^H S^{(P)}(m) = S^{(P)}(m)(S^{(P)}(m))^H = aI_{N_tN_rL}$, $a \in R^+$, or equivalently such that $\frac{1}{\sqrt{a}} S^{(P)}(m)$ is a unitary matrix. Substituting (2.22) into (2.23), the LS channel estimate is then easily obtained as

\[
\hat{h}_{C,LS}(m) = \frac{1}{a}(S^{(P)}(m))^H\, Y^{(P)}(m) = h_C(m) + \frac{1}{a}(S^{(P)}(m))^H\, V^{(P)}(m). \tag{2.24}
\]

2.3.2 Pilot-tone Design

In order to have a simple and efficient LS algorithm for channel estimation, we have

to design the square matrix S(P ) (m) deliberately. In this section, the design will be

illustrated by a theorem and an example.

The preamble design discussed in [50] adopted Tarokh's approach [18] to space-time block code construction. It is related to orthogonal design, to which our pilot-tone design is also connected. In each of the first $N_t$ training blocks of a frame, a group of at least L pilot-tones is equally spaced and all the other tones are set to zero. The LS channel estimate is then obtained from the known pilot-tones, and the channel is assumed to be unchanged for the rest of the frame. In a mobile environment, however, we cannot guarantee that the channel state information estimated at the m-th block still holds at the $(m + N_t)$-th block. Hence the preamble design in [50] is not suitable for fast time-varying channels. In addition to this common disadvantage, the training sequences designed in [48] have to satisfy a condition called local orthogonality, which requires that the $N_t$ different training sequences of length N be orthogonal over the minimum set of elements for any starting position. The pilot design proposed here aims to remove both the disadvantage and the constraint mentioned above. It has its roots in Table I of [16], but it is implemented in the space and frequency domains rather than in the space and time domains. We explicitly connect pilot-tone design with space-frequency coding to gain more insight into the design. Denote $E_P$ as the fixed total power of all the pilot-tones at each transmit antenna. Then the power allocated to each pilot-tone is $\frac{E_P}{N_tL}$, since the pilot-tones are all equally spaced and equally powered. In some systems, the power of the pilot-tones may be larger than the power of the data symbols to obtain a better estimate of the wireless channel. We assume in our work that the pilot-tones and the data are normalized such that the average power is the same for all mappings. Our pilot-tone design is stated in the following theorem.

Theorem 2.1 Let $S^{(p_i)}_{diag,j}(m) = \alpha_{p_i,j}\, I_{LN_r}$, $|\alpha_{p_i,j}| = \sqrt{\frac{E_P}{N_tL}}$, $i, j = 1, 2, \cdots, N_t$. Then $\frac{1}{\sqrt{E_P}} S^{(P)}(m)$ is a unitary matrix if

\[
S^{(P)}_{SFC}(m) = \sqrt{\frac{L}{E_P}}
\begin{bmatrix}
S^{(p_1)}_{diag,1}(m) & \cdots & S^{(p_1)}_{diag,N_t}(m) \\
\vdots & \ddots & \vdots \\
S^{(p_{N_t})}_{diag,1}(m) & \cdots & S^{(p_{N_t})}_{diag,N_t}(m)
\end{bmatrix}
\]

is a unitary matrix.


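Proof.

\[
\begin{aligned}
S^{(P)}(m)
&=
\begin{bmatrix}
F_{LN_r} S^{(p_1)}_{diag,1}(m) W_N^{(p_1)} & \cdots & F_{LN_r} S^{(p_1)}_{diag,N_t}(m) W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
F_{LN_r} S^{(p_{N_t})}_{diag,1}(m) W_N^{(p_{N_t})} & \cdots & F_{LN_r} S^{(p_{N_t})}_{diag,N_t}(m) W_N^{(p_{N_t})}
\end{bmatrix} \\
&=
\begin{bmatrix}
F_{LN_r} W_N^{(p_1)} S^{(p_1)}_{diag,1}(m) & \cdots & F_{LN_r} W_N^{(p_1)} S^{(p_1)}_{diag,N_t}(m) \\
\vdots & \ddots & \vdots \\
F_{LN_r} W_N^{(p_{N_t})} S^{(p_{N_t})}_{diag,1}(m) & \cdots & F_{LN_r} W_N^{(p_{N_t})} S^{(p_{N_t})}_{diag,N_t}(m)
\end{bmatrix} \\
&= \mathbf{F}_{LN_r}\, W_N^{(P)} \sqrt{\frac{E_P}{L}}\, S^{(P)}_{SFC}(m),
\end{aligned}
\]

where

\[
\mathbf{F}_{LN_r} =
\begin{bmatrix}
F_{LN_r} & & \\
& \ddots & \\
& & F_{LN_r}
\end{bmatrix},
\qquad
W_N^{(P)} =
\begin{bmatrix}
W_N^{(p_1)} & & \\
& \ddots & \\
& & W_N^{(p_{N_t})}
\end{bmatrix}.
\]

It is easy to see that $\mathbf{F}_{LN_r}\mathbf{F}_{LN_r}^H = \mathbf{F}_{LN_r}^H\mathbf{F}_{LN_r} = LI_{N_tN_rL}$ and that $W_N^{(P)}$ is a unitary matrix. Hence $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_PI_{N_tN_rL}$. This completes the proof. $\Box$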

Note that Theorem 2.1 assumes that each of the $N_t$ pilot-tone sets consists of L identical pilot symbols, so that each $S^{(p_i)}_{diag,j}(m)$ is a scaled identity matrix. This is what allows the factors to be reordered in the proof: in general, the product AB of two square matrices is not equal to BA, but $A_{n \times n}B_{n \times n} = B_{n \times n}A_{n \times n}$ holds if B is a scalar multiple of $I_n$. This assumption greatly simplifies the pilot-tone design for a MIMO-OFDM system with a large number of transmit antennas, because the design reduces to that of a square orthogonal matrix. Hence we are more interested in the design of $S^{(P)}_{SFC}(m)$. First we consider a simple example with 2 transmit antennas and 2 receive antennas, i.e., $N_t = N_r = 2$ in the previous equations. Assume the channel length L = 4. By Theorem 2.1, we use Alamouti's structure [16]

\[
\begin{bmatrix}
x & y \\
-y^* & x^*
\end{bmatrix},
\qquad |x|^2 + |y|^2 = \frac{E_P}{4}, \qquad x, y \in C.
\]

The above leads to the design

\[
S^{(P)}_{SFC}(m) = \sqrt{\frac{4}{E_P}}
\begin{bmatrix}
S^{(p_1)}_{diag,1}(m) & S^{(p_1)}_{diag,2}(m) \\
S^{(p_2)}_{diag,1}(m) & S^{(p_2)}_{diag,2}(m)
\end{bmatrix},
\tag{2.25}
\]

where

\[
S^{(p_1)}_{diag,1}(m) = xI_8, \quad S^{(p_1)}_{diag,2}(m) = yI_8, \quad
S^{(p_2)}_{diag,1}(m) = -y^*I_8, \quad S^{(p_2)}_{diag,2}(m) = x^*I_8.
\]
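The example can be checked numerically. The sketch below (Python with NumPy) assembles $S^{(P)}(m)$ for $N_t = N_r = 2$ and L = 4 from Alamouti-structured pilot symbols and verifies that its Gram matrix is $E_PI$, so that the LS estimate needs no matrix inversion. The FFT size N = 32, the pilot-set indices 0 and 1, the value $E_P = 1$ and the phases of x and y are assumptions chosen only for illustration.

```python
import numpy as np

N, L, Nt, Nr = 32, 4, 2, 2        # assumed sizes (N = M * L with M = 8)
E_P = 1.0                         # assumed total pilot power per antenna
pilot_sets = [0, 1]               # assumed pilot-set indices p_1, p_2

# Alamouti-structured pilot symbols with |x|^2 + |y|^2 = E_P / 4 (phases are arbitrary).
x = np.sqrt(E_P / (Nt * L)) * np.exp(1j * np.pi / 4)
y = np.sqrt(E_P / (Nt * L)) * np.exp(-1j * np.pi / 3)
alpha = np.array([[x, y], [-np.conj(y), np.conj(x)]])

# F_{L Nr} = F_L (x) I_{Nr} and the shift operators W_N^(p).
F_L = np.exp(-2j * np.pi * np.outer(np.arange(L), np.arange(L)) / L)
F_LNr = np.kron(F_L, np.eye(Nr))

blocks = []
for i, p in enumerate(pilot_sets):
    W_p = np.kron(np.diag(np.exp(-2j * np.pi * p * np.arange(L) / N)), np.eye(Nr))
    # Block (i, j) of S^(P): S_diag = alpha[i, j] * I commutes with F_{L Nr} W_N^(p).
    blocks.append([alpha[i, j] * (F_LNr @ W_p) for j in range(Nt)])
S_P = np.block(blocks)

# Unitarity up to the scale E_P.
gram = S_P.conj().T @ S_P
print(np.allclose(gram, E_P * np.eye(Nt * Nr * L)))

# Noise-free LS estimate without inversion: h_hat = (1 / E_P) * S_P^H * Y.
rng = np.random.default_rng(3)
h_C = rng.standard_normal(Nt * Nr * L) + 1j * rng.standard_normal(Nt * Nr * L)
Y_P = S_P @ h_C
h_hat = S_P.conj().T @ Y_P / E_P
print(np.allclose(h_hat, h_C))
```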

The placement of pilot-tones in the example is shown in Figure 2.3. In the figure, the red and purple square boxes denote the first and the second pilot-tone sets for TX antenna 1 respectively, and the green and light blue boxes denote those for TX antenna 2. All pilot-tones are equally spaced, and boxes of the same color within a set carry the same pilot symbol. In this example there are 16 pilot-tones in total, and they are allocated to the two TX antennas by the proposed method. The square matrix $S^{(P)}_{SFC}(m)$ is in fact a space-frequency code: its columns correspond to the TX antennas, i.e., the spatial domain, and its rows correspond to the different pilot-tone sets, i.e., the frequency domain. Hence our design makes explicit the connection between conventional pilot-tone design and space-frequency code design [32, 33] aimed at performance enhancement.

[Figure 2.3: Pilot placement with Nt = Nr = 2. Subcarriers 1 to 32 of the m-th, (m+1)-th and (m+2)-th OFDM symbols are shown for Tx_1 and Tx_2; the legend distinguishes the 1st and 2nd pilot sets at Tx_1, the 1st and 2nd pilot sets at Tx_2, and data.]

When we have more than 2 transmit antennas, i.e., $N_t \ge 3$, it is also very easy to design an $N_tN_rL \times N_tN_rL$ unitary matrix $S^{(P)}_{SFC}(m)$. Under the assumption in Theorem 2.1 that all the pilot-tones within one set are the same, the design of $S^{(P)}_{SFC}(m)$ simplifies to the design of an $N_t \times N_t$ unitary matrix S, and the dimension of the design problem is reduced from $N_tN_rL$ to $N_t$:

\[
S = \sqrt{\frac{L}{E_P}}
\begin{bmatrix}
\alpha_{p_1,1} & \cdots & \alpha_{p_1,N_t} \\
\vdots & \ddots & \vdots \\
\alpha_{p_{N_t},1} & \cdots & \alpha_{p_{N_t},N_t}
\end{bmatrix}_{N_t \times N_t}.
\]

Choose $\alpha_{p_i,j} = \sqrt{\frac{E_P}{LN_t}}\, e^{-\tilde{j}\frac{2\pi}{N_t}ij}$, $\forall i, j \in \{1, 2, \ldots, N_t\}$, $\tilde{j} = \sqrt{-1}$. Then S can be shown to be a unitary matrix; it is essentially a normalized $N_t$-point FFT matrix. After obtaining the $\{\alpha_{p_i,j}\}_{i,j=1}^{N_t}$, $S^{(P)}_{SFC}(m)$ can be easily constructed from Theorem 2.1 by mapping each scalar to a diagonal matrix whose diagonal elements are all equal to that scalar.
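A short numerical check of this choice is sketched below (Python with NumPy). The function name `dft_pilot_symbols` and the values $N_t = 3$, L = 4 and $E_P = 2$ are illustrative assumptions.

```python
import numpy as np

def dft_pilot_symbols(Nt, L, E_P):
    """alpha[i-1, j-1] = sqrt(E_P / (L * Nt)) * exp(-j 2 pi i j / Nt) for i, j = 1..Nt."""
    i = np.arange(1, Nt + 1).reshape(-1, 1)
    j = np.arange(1, Nt + 1).reshape(1, -1)
    return np.sqrt(E_P / (L * Nt)) * np.exp(-2j * np.pi * i * j / Nt)

Nt, L, E_P = 3, 4, 2.0            # assumed values for illustration
alpha = dft_pilot_symbols(Nt, L, E_P)

# S = sqrt(L / E_P) * [alpha] should be unitary, and every pilot has the same magnitude.
S = np.sqrt(L / E_P) * alpha
print(np.allclose(S @ S.conj().T, np.eye(Nt)))
print(np.allclose(np.abs(alpha), np.sqrt(E_P / (L * Nt))))
```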

2.3.3 Performance Analysis

With the fixed total power $E_P$, the pilot-tones designed in the previous section can be shown to be optimal in the sense that they achieve the minimum mean squared error of the channel estimate. From (2.24), the MSE of the channel estimate $\hat{h}_{C,LS}(m)$ is given by

\[
\begin{aligned}
MSE_m &= \frac{1}{N_tN_rL}\, E\{\|\hat{h}_{C,LS}(m) - h_C(m)\|^2\} \\
&= \frac{1}{E_P^2N_tN_rL}\, E\{\|(S^{(P)}(m))^H V^{(P)}(m)\|^2\} \\
&= \frac{1}{E_P^2N_tN_rL}\, \mathrm{tr}\{(S^{(P)}(m))^H E[V^{(P)}(m) V^{(P)}(m)^H]\, S^{(P)}(m)\} \\
&= \frac{\sigma_n^2}{E_P^2N_tN_rL}\, \mathrm{tr}\{(S^{(P)}(m))^H I_{N_tN_rL}\, S^{(P)}(m)\}.
\end{aligned}
\tag{2.26}
\]

Since $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_PI_{N_tN_rL}$, the trace equals $E_PN_tN_rL$ and the MSE achieves its minimum $MSE_{min} = \frac{\sigma_n^2}{E_P}$. The unitary matrix design therefore not only reduces the complexity of the channel estimator, but also ensures the least estimation error when the pilot-tones have fixed transmit power.
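The value $\sigma_n^2 / E_P$ can be reproduced with a small Monte Carlo experiment. The sketch below (Python with NumPy) does not build the structured pilot matrix; as a stated simplification it uses a generic scaled unitary matrix with the same property $(S^{(P)})^H S^{(P)} = E_PI$, which is all that (2.26) relies on. The dimension, power, noise variance and number of trials are assumed values.

```python
import numpy as np

rng = np.random.default_rng(4)
n, E_P, sigma2, trials = 16, 4.0, 0.1, 5000   # n plays the role of Nt * Nr * L

# Generic scaled unitary matrix standing in for S^(P)(m): S_P^H S_P = E_P * I.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
S_P = np.sqrt(E_P) * Q

h_C = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Complex Gaussian noise with variance sigma2 / 2 per real dimension; one row per trial.
V = np.sqrt(sigma2 / 2) * (rng.standard_normal((trials, n)) + 1j * rng.standard_normal((trials, n)))

Y = h_C @ S_P.T + V                   # each row is (S_P h_C + v)^T
h_hat = Y @ S_P.conj() / E_P          # each row is ((1 / E_P) S_P^H y)^T, as in (2.24)

mse = np.mean(np.abs(h_hat - h_C) ** 2)
print(mse, sigma2 / E_P)
```

The empirical MSE should agree with $\sigma_n^2 / E_P$ up to Monte Carlo fluctuation.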

2.4 An Illustrative Example and Concluding Remarks
2.4.1 Comparison With Known Result

In this section, we demonstrate the performance of the proposed channel estimation based on our optimal pilot-tone design through computer simulations. To make the performance improvement clear, the channel estimation technique of [50] is also simulated. We consider a typical MIMO-OFDM system with 2 transmit antennas and 2 receive antennas. The OFDM block size is chosen as N = 128 and a CP of length 16 is prepended to each OFDM symbol. The four sub-channels, denoted by {h11, h12, h21, h22}, are assumed to be mutually independent, and each has a CIR of length L = 16. The CIR coefficients of each sub-channel are simulated by Jakes' model [51]. Our simulation is conducted in two ways:

• Method I: Place two sets of L = 16 pilot-tones into each OFDM block, with the pilot-tones equally spaced and equally powered as shown in Figure 2.3;

• Method II: Set the first two OFDM blocks of each data frame, which consists of ten OFDM blocks, as the preamble. Put L = 16 equally-spaced and equally-powered pilot-tones into each of the two preamble blocks and set all the other tones to zero (see [50] for details).

To represent mobile environments, different Doppler shifts are simulated: fd = 5, 20, 40, 100 and 200 Hz. The performance of the system is measured in terms of the MSE of the two channel estimation schemes described above and the symbol error rate (SER) versus SNR. For a reliable simulation, a total of 10,000 frames are transmitted for each test, and the average values of MSE and SER are taken as the measurements. In Figure 2.4, the Doppler shift is 5 Hz and the two curves marked “known channel” serve as the performance bound, since the channel state information is known exactly. This is unrealistic and serves only as a basis for comparison. The two curves corresponding to RX antenna 1 and RX antenna 2 nearly coincide, which matches our expectation since the two receive antennas are statistically identical. The two curves generated by channel estimation based on our optimal pilot-tone design are close to the performance bound, with only a narrow gap due to the unavoidable channel estimation error. In contrast, the two curves generated by channel estimation based on the technique in [50] are far from the performance limit, even at high SNR. This supports our point that a method based on a preamble at the beginning of a frame is not applicable to a fast-varying wireless channel. From Figure 2.4 to Figure 2.6, the performance of the system based on the proposed pilot-tone design does not change much, since it keeps tracking the channel through the pilot-tones in each OFDM block. The difference between the two estimation schemes is illustrated in the MSE plots. In Figure 2.7, for a fixed SNR value, the curves for different Doppler spreads change little, which implies that the proposed method is able to track the fast time-varying channel. In Figure 2.8, by contrast, the curves for a fixed SNR value do change with the Doppler shift: the estimation error at fd = 200 Hz is much larger than that at fd = 5 Hz. This indicates that the method based on preambles works poorly even when the Doppler spread is small, and does not work when the channel is changing quickly.

[Figure 2.4: Symbol error rate versus SNR with Doppler shift=5 Hz. Axes: symbol error rate versus SNR (in dB); legend: Rx1 known channel, Rx2 known channel, Rx1 pilot-tone based, Rx1 preamble based.]

[Plots: symbol error rate versus SNR (in dB) for fd=40 Hz and for fd=200 Hz; legend: Rx1 known channel, Rx2 known channel.]

[Figure 2.7: Normalized MSE of channel estimation based on optimal pilot-tone design. Surface plot of normalized MSE versus Doppler shift (in Hz) and SNR (in dB).]

[Figure 2.8: Normalized MSE of channel estimation based on preamble design. Surface plot of normalized MSE versus Doppler shift (in Hz) and SNR (in dB).]

2.4.2 Chapter Summary

We presented a new optimal pilot-tone design for MIMO-OFDM channel estimation. $N_t$ sets of L pilot-tones coded in $S^{(P)}_{SFC}(m)$ are transmitted simultaneously at each antenna, and the channel can be estimated optimally. The main advantages are its ability to handle fast time-varying systems, since the channel can be estimated at each OFDM block, and its simplicity, since the orthogonal design allows the MIMO system to be processed in parallel.

For an $N_t \times N_r$ MIMO system, the complexity of signal processing algorithms at the physical layer usually increases by a factor of $N_tN_r$. Channel estimation, carrier frequency offset estimation and correction, and IQ imbalance compensation, to name a few, all become very challenging in the MIMO case. In this chapter, we provide answers to the following “how” questions. How many pilot tones are needed? How are they placed in one OFDM block? Most importantly, how fast can channel estimation be accomplished? We propose a pilot tone design for MIMO-OFDM channel estimation in which $N_t$ disjoint sets of pilot tones are placed on one OFDM block at each transmit antenna. Each pilot tone set has L pilot tones, which are equally spaced and equally powered. The pilot tones from different transmit antennas form a unitary matrix, so that a simple least squares estimate of the MIMO channel is easily obtained by taking advantage of the unitarity of the pilot tone matrix. There is no need to compute the inverse of a large matrix, which the LS algorithm usually requires.


In a highly mobile environment, such as a mobile user in a vehicle traveling at more than 100 km/hr, the wireless channel may change within one or a small number of symbols. For example, as in [30], in the IEEE 802.16-2004 standard with N = 256 and G = 44 (N: FFT size; G: guard interval) and a 3.5 MHz full bandwidth, the symbol duration is about 73 microseconds. For a user in a vehicle traveling at 100 km/hr, the channel coherence time is about 1100 microseconds, which means that the wireless channel varies after around 15 symbols. In a real-time communication scenario, an information packet could contain hundreds of data symbols or more. The scheme proposed in this chapter therefore distributes the pilot symbols of the preamble over every OFDM block for channel estimation. Since the pilot tones are placed on each OFDM block, the channel state information can be re-estimated at every block and can thus follow a rapidly varying channel. It is fair to point out that this may incur a higher overhead than the methods in the literature. For a slowly time-varying channel, however, our pilot design can be applied by placing pilot tones only on every few OFDM blocks, which reduces the throughput loss.
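The symbol count quoted above can be reproduced with a few lines of arithmetic. The sketch below (Python) assumes that the symbol duration is approximated by the FFT size divided by the bandwidth, ignoring the guard interval and any sampling factor; this approximation is an assumption made here because it reproduces the quoted 73 microseconds, and the coherence time is simply taken as the quoted 1100 microseconds.

```python
# Quoted parameters: FFT size, full bandwidth and channel coherence time at 100 km/hr.
N_fft = 256
bandwidth_hz = 3.5e6
coherence_time_us = 1100.0

# Assumed approximation: useful symbol duration ~ N / bandwidth (guard interval ignored).
symbol_duration_us = N_fft / bandwidth_hz * 1e6

# Number of OFDM symbols that fit into one coherence time.
symbols_per_coherence_time = coherence_time_us / symbol_duration_us

print(round(symbol_duration_us, 1), round(symbols_per_coherence_time))
```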

The orthogonal pilot tone matrix is in fact a space-frequency code. The row direction of the matrix corresponds to the different pilot tone sets in the frequency domain, and the column direction corresponds to the individual transmit antennas in the spatial domain. It can be readily extended to an $N_t \times N_r$ MIMO system by constructing an $N_t \times N_t$ orthogonal matrix. With this explicit relation to space-frequency codes, the design of the pilot tone matrix for MIMO-OFDM channel estimation can be approached from a broader perspective.

Chapter 3

Wireless Location for OFDM-based Systems

3.1 Introduction

Wireless networks are primarily designed and deployed for voice and data communications. The widespread availability of wireless nodes, however, makes it feasible to use these networks for wireless location as an alternative to the GPS (global positioning system) location service. Location-based applications are expected to play an important role in future wireless markets. Commercially available location technology is implemented on cellular networks and WLANs, for example E911 (Enhanced 911) and indoor positioning with WiFi (wireless fidelity). In this dissertation, we investigate wireless location technology aimed at a different network, namely the WiMax system.

3.1.1 Overview of WiMax

WiMax is an acronym for Worldwide Interoperability for Microwave Access. It is not only a technical term for a new wireless broadband technology, but also refers to a series of new products working on this network. Real WiMax-based wireless equipment has not come to the market yet. People are, however, already very familiar with WiFi-based products such as notebook wireless cards and wireless routers from Linksys, D-Link and Belkin, which they use to check email or surf the Internet wirelessly on campus or at airports, hotels, bookstores and coffee shops. WiFi stands for Wireless Fidelity, and it was the first available technology for WLAN and wireless home networking. However, it is constrained by its limited coverage of about 50-100 meters and its relatively low data rate. Unlike WiFi, WiMax is a broadband wireless access technology that provides very high data throughput over long distances in point-to-multipoint, line of sight (LOS) or non-line of sight (NLOS) environments. In terms of coverage, WiMax can provide seamless wireless services up to 20 or 30 miles away from the base station. Its air interface is specified by the IEEE standard 802.16-2004.

WiMax Standards

Microwave access is not a new technology for broadband systems. Proprietary point-to-multipoint broadband access products from companies like Alcatel and Siemens have existed for decades, but they never became popular precisely because they were proprietary. Today's WiMax attempts to standardize the technology in order to reduce cost and to broaden the range of applications. The current standard for WiMax is IEEE Std 802.16-2004, which can be downloaded from the IEEE website. With its approval in June 2004, it rendered the previous standard IEEE Std 802.16-2001 and its two amendments 802.16a and 802.16c obsolete. At present IEEE 802.16-2004 addresses only fixed broadband systems. IEEE 802.16 Task Group e is working on an amendment that adds a mobility component to the standard; the new standard may be named IEEE 802.16e.

WiMax Applications

WiMax applications have been marketed heavily at conferences, exhibitions and in other media, and people are wondering whether it is a must-have technology for the near future. Consider the broadband services available today: we usually resort to a landline connection with T1, DSL or cable modems. WiMax, or 802.16, is proposed to address the first mile/last mile wireless connection in a metropolitan area network. It could change the last-mile connection as much as 802.11 changed the last hundred feet of the connection, not only in rural areas but anywhere the cost of laying or upgrading landlines to broadband capability is prohibitive. WiMax's primary use will most likely come in the form of metropolitan area networks. In terms of services and applications, it differs from the traditional WiFi standards, which include 802.11a, 802.11b and 802.11g. WiFi technology, with a maximum range of 800 feet outdoors, is mainly intended for local area networks that serve residential homes, public hot spots like airports, hotels and coffee shops, and small business buildings. With its much longer range, which in theory can reach a maximum of 31 miles, WiMax can provide broadband services to thousands of homes in a metropolitan area. Imagine that a broadband service provider can serve thousands of residential homes and small and large business buildings without the cost of laying physical lines and dispatching technicians to install and maintain them. The savings will push providers to choose WiMax and to reduce the fees they charge their customers. Another driving force for WiMax is its speed: it can transfer data at a rate of up to 70 Mbps, which is equivalent to about 45 T1 lines at 1.544 Mbps each. The combination of long range and high speed is why the potential applications of WiMax are so broad. Promising as all this sounds, real WiMax products are not yet commercially available. Only some pre-WiMax products based on the standard are emerging, but fully compliant products are expected soon. For example, Intel's PRO/Wireless 5116 is a highly integrated IEEE Std 802.16-2004 compliant system on chip for both licensed and license-exempt radio frequencies.

3.1.2 Overview to Wireless Location System

Wireless location refers to the determination of the geographic coordinates, or in a more general sense also the velocity and the heading, of a mobile user or device in a cellular, WLAN or GPS environment. Wireless location technologies usually fall into two main categories: handset-based and network-based. In handset-based location systems [55], the mobile station, equipped with extra electronics, determines its location from signals received from the base stations or from GPS satellites. In GPS-based estimation, the MS (mobile station) receives and measures signal parameters from at least four satellites of the existing constellation of 24 GPS satellites. The parameter that the MS measures is the time taken by each satellite signal to reach the MS. GPS systems have a relatively high degree of accuracy and provide global location information. However, embedding a GPS receiver into mobile devices increases cost, size and battery consumption. It also requires replacing the millions of mobile handsets already in the market with new GPS-equipped handsets. In addition, the accuracy of GPS measurements degrades in urban and indoor environments. For these reasons, some wireless carriers may be unwilling to embrace GPS fully as the only location technology.

On the other hand, network-based location technology relies on the existing
network infrastructure to determine the position of a mobile user by measuring
its signal parameters when received at the network BSs (base stations). This may
require some hardware upgrade or installation at the BSs, but the cost can be shared
by a huge number of mobile subscribers and it does not affect how users use
their mobile devices. In this technology, the BSs measure the signal transmitted
from an MS and relay the measurements to a central site for further processing and data fusion

to provide an estimate of the MS location. Network-based technologies have the

significant advantage that the MS is not involved in the location-finding process;

thus the technology does not require modifications to existing handsets. However,

unlike GPS location systems, many aspects of network-based location are not yet

fully studied. In Figure 3.1, network-based wireless location technology is illustrated.

[Figure: three base stations BS1 (x1, y1), BS2 (x2, y2) and BS3 (x3, y3), each with a T(D)OA/AOA estimator, at distances r1, r2 and r3 from the MS, all reporting to a data fusion center.]

Figure 3.1: Network-based wireless location technology (outdoor environments)

Network-based wireless location technology is gaining recognition with the
increasing number of wireless subscribers and the demand for location-oriented
services such as E911. It is estimated [56, 57] that location-based services will generate
annual revenues on the order of $15B worldwide. In the U.S. alone, about 170 million
mobile subscribers are expected to be covered by the FCC-mandated location
accuracy for emergency services. The following is a partial list of applications that
will be enhanced by wireless location information [58].

• E911. Nowadays a high percentage of E911 calls are made from mobile
phones; the percentage is estimated [59, 60] at one third of all 911 calls
(170,000 per day). These wireless 911 calls do not receive the same quality of
emergency assistance as fixed-network 911 calls because
the position of the wireless 911 caller is unknown. To fix this problem, the FCC
issued an order on July 12, 1996 [59], requiring all wireless service providers to
report accurate MS location to the E911 operator at the PSAP (public safety
answering point). The FCC order mandated that within five years
of its effective date, October 1, 1996, wireless service providers
must convey to the PSAP the location of the MS within 100 meters of its
actual position for at least 67 percent of all wireless E911 calls. This FCC order
has motivated considerable research effort towards developing accurate wireless
location algorithms for cellular networks and has led to significant enhancement
of wireless location technology.

• Mobile advertising. Location-specific advertising and marketing will benefit

once the location information is available. For example, stores would be able to

track customer locations and attract them by flashing customized coupons
on their wireless devices [61]. In addition, a cellular phone or a PDA (personal
digital assistant) could act as a handy, on-demand mobile yellow pages.

• Asset tracking (indoor/outdoor). Wireless location technology can also
assist in advanced public safety applications such as locating and retrieving
lost children, patients, or even pets. In addition, it can be used to track
personnel and assets in a hospital or a manufacturing site for more efficient
management of both. One could also consider applications such
as smart and interactive tour guides, smart shopping guides that lead shoppers
based on their location in a store, and smart traffic control in parking structures
that guides cars to free parking slots. Department stores, enterprises, hospitals,
manufacturing sites, malls, museums, and campuses are some of the potential
end users that would benefit from the technology.

• Fleet management. Many fleet operators, such as police force, emergency

vehicles, and other services including shuttle and taxi companies, can make

use of the wireless location technology to track and operate their vehicles in

an efficient way in order to minimize the response time. In addition, a large

number of drivers on roads and highways carry cellular phones while driving.

The wireless location technology can help track these phones, thus transforming

them into sources of real-time traffic information that can be used to enhance

transportation safety.

• Location-based wireless security. New location-based wireless security

schemes can be developed to add a level of security to wireless networks against

being intercepted or hacked into. By using location information, only people at

certain specific areas could access certain files or databases through a WLAN.

• Location sensitive billing. Using the location information of wireless users,

wireless service providers can offer variable-rate call plans or services that are

based on the caller location.

3.1.3 Review of Data Fusion Methods

We assume for simplicity that the location is specified by (x, y). As shown in Figure
3.1, the data fusion center determines the mobile user location by exploiting all the
estimated signal parameters from the BSs. The most common signal parameters are the
time, angle and amplitude of arrival of the MS signal, and different data
fusion algorithms have been proposed for each. The material in this section is mainly
based on the survey paper [53].

• Time. By combining the estimates of the TOA (time of arrival) of the MS
signal when received at the BSs, the MS location can be determined in a wireless
network with three or more BSs, as illustrated in Figure 3.2.

[Figure: BS1 at (0, 0), BS2 at (x2, y2) and BS3 at (x3, y3), at distances r1, r2 and r3 from the MS at (xT, yT).]

Figure 3.2: TOA/TDOA data fusion using three BSs

Without loss of generality, the coordinates of BS1 are assumed to be (0, 0), i.e., $x_1 = y_1 = 0$. The locations
of the other BSs are denoted by $(x_k, y_k)$, $k = 2, 3$. Since the
radio signal travels at the speed of light ($c = 3 \times 10^8$ m/s), the distance between
the MS and BS$_k$ is given by

\[ R_{k,T} = (t_k - t_o)\,c, \tag{3.1} \]

where $t_o$ is the time instant at which the MS starts transmitting and $t_k$ is the
time of arrival of the MS signal at BS$_k$. The distances $\{R_{k,T}\}_{k=1}^{3}$ can be used
to estimate the MS location $(x_T, y_T)$ by solving the following set of equations:

\[
\begin{aligned}
R_{1,T}^2 &= x_T^2 + y_T^2 \\
R_{2,T}^2 &= (x_2 - x_T)^2 + (y_2 - y_T)^2 \\
R_{3,T}^2 &= (x_3 - x_T)^2 + (y_3 - y_T)^2.
\end{aligned}
\tag{3.2}
\]
To solve the above overdetermined nonlinear system of equations, we can reformulate
(3.2) into an LS-type representation by subtracting the first equation from
the second and the third equations respectively. This gives

\[
\begin{aligned}
R_{2,T}^2 - R_{1,T}^2 &= x_2^2 + y_2^2 - 2(x_2 x_T + y_2 y_T) \\
R_{3,T}^2 - R_{1,T}^2 &= x_3^2 + y_3^2 - 2(x_3 x_T + y_3 y_T).
\end{aligned}
\tag{3.3}
\]

In matrix form, this can be rewritten as

\[
\begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\ R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \end{bmatrix},
\tag{3.4}
\]

where $R_k = \sqrt{x_k^2 + y_k^2}$ is the distance of the base station BS$_k$ from the origin
of the coordinate system, and clearly $R_1 = 0$. If we have more than three BSs, a
compact form can be obtained in a similar way as

\[ \mathbf{b} = \mathbf{A}\theta, \tag{3.5} \]

where

\[
\mathbf{b} = \frac{1}{2}
\begin{bmatrix}
R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\
R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \\
R_4^2 - (R_{4,T}^2 - R_{1,T}^2) \\
\vdots
\end{bmatrix};
\quad
\mathbf{A} =
\begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \\ x_4 & y_4 \\ \vdots & \vdots \end{bmatrix};
\quad
\theta = \begin{bmatrix} x_T \\ y_T \end{bmatrix}.
\]

A standard LS estimate of $\theta$ is given by

\[ \hat{\theta} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}. \tag{3.6} \]

Note that $R_{1,T}^2$ is a function of $x_T$ and $y_T$ as defined in (3.2). Hence (3.6) only
provides an intermediate solution, and the estimates $\hat{x}_T$ and $\hat{y}_T$ can be obtained
by solving the resulting quadratic equation. Clearly, the TOA data fusion

method requires perfect timing between the MS and the BSs, since an offset
of one microsecond between the MS clock and the BS clock already corresponds to
$c \times 1\,\mu\text{s} = 300$ m of range error, so a few microseconds translate into
hundreds of meters of location error. However, current wireless
network standards only mandate tight timing synchronization among BSs [62].
An alternative is to use the TDOA (time difference of
arrival), which avoids the MS clock synchronization problem. Define the
TDOA associated with the base station BS$_k$ as $\Delta t_{k,1} = t_k - t_1$, i.e., the difference
between the TOAs of the MS signal at BS$_k$ and BS$_1$. Then the difference
between $R_{k,T}$ and $R_{1,T}$ can be related to $\Delta t_{k,1}$ as

\[
\begin{aligned}
\Delta R_{k,1} &= R_{k,T} - R_{1,T} \\
&= (t_k - t_o)c - (t_1 - t_o)c \\
&= \Delta t_{k,1}\, c.
\end{aligned}
\tag{3.7}
\]

Clearly, the possible timing error on the MS clock $t_o$ is canceled
out. This insensitivity to $t_o$ gives the TDOA method its advantage over TOA. By
substituting $R_{k,T}^2 = (R_{1,T} + \Delta R_{k,1})^2$ in (3.2) and rearranging terms, we
can obtain the following LS expression for any number of base stations:

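\[ R_{1,T}\,\mathbf{c} + \mathbf{d} = \mathbf{A}\theta, \tag{3.8} \]

where

\[
\mathbf{c} = \begin{bmatrix} -\Delta R_{2,1} \\ -\Delta R_{3,1} \\ -\Delta R_{4,1} \\ \vdots \end{bmatrix};
\quad
\mathbf{d} = \frac{1}{2} \begin{bmatrix} R_2^2 - \Delta R_{2,1}^2 \\ R_3^2 - \Delta R_{3,1}^2 \\ R_4^2 - \Delta R_{4,1}^2 \\ \vdots \end{bmatrix}.
\]

Here the vector $\mathbf{c}$ should not be confused with the speed of light $c$. Notice that
$R_{1,T} = \sqrt{x_T^2 + y_T^2}$ is not known, and hence only an intermediate solution
can be obtained from the above LS formulation:

\[ \hat{\theta} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T (R_{1,T}\,\mathbf{c} + \mathbf{d}). \tag{3.9} \]

Since

\[ \|\hat{\theta}\|_2^2 = R_{1,T}^2, \tag{3.10} \]

we can substitute (3.9) into (3.10) and solve for $R_{1,T}$ from the resulting quadratic
equation. A final solution for $\hat{\theta}$ is then obtained by substituting the
positive root of the quadratic equation into (3.9).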

• Angle. The AOA (angle of arrival) can be obtained at a BS by using an antenna
array. The direction of arrival of the MS signal can be calculated by
measuring the phase difference between the antenna array elements or by measuring
the power spectral density across the antenna array in what is known
as beamforming [64]. Intuitively, the MS location can be estimated by combining
the AOA estimates from two BSs as shown in Figure 3.3.

[Figure: BS1 at (0, 0) and BS2 at (x2, y2), a distance R2 apart, at distances R1,T and R2,T from the MS at (xT, yT), with the angles of arrival marked at each BS.]

Figure 3.3: AOA data fusion with two BSs

Compared to TOA/TDOA methods, fewer BSs are needed for location
and there is no need for timing synchronization between the base station
and MS clocks. One disadvantage, however, is that an antenna array is required at the
BS, which is not available in 2G systems; it is planned for 3G cellular systems
such as UMTS and CDMA2000 [65, 66]. As indicated in Figure 3.3, we have

\[
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \begin{bmatrix} R_{1,T}\cos(\beta_1) \\ R_{1,T}\sin(\beta_1) \end{bmatrix};
\quad
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \begin{bmatrix} x_2 \\ y_2 \end{bmatrix}
+ \begin{bmatrix} R_{2,T}\cos(\beta_2) \\ R_{2,T}\sin(\beta_2) \end{bmatrix},
\tag{3.11}
\]

where

\[ R_{2,T} = \sqrt{R_{1,T}^2 + R_2^2 - 2R_{1,T}R_2\cos(\alpha_1 - \beta_1)} = f(\alpha_1, \beta_1, R_{1,T}, R_2). \]

Since $\alpha_1$, $\beta_1$ and $R_2$ are known, we simply denote $R_{2,T}$ as a function of $R_{1,T}$,
$R_{2,T} = f_2(R_{1,T})$. If there are more than two BSs, an LS formulation can be
obtained by collecting the relations in (3.11) into a single equation as

\[ \mathbf{b} = \mathbf{A}\theta, \tag{3.12} \]

where

\[
\mathbf{b} = \begin{bmatrix} R_{1,T}\cos(\beta_1) \\ R_{1,T}\sin(\beta_1) \\ x_2 + f_2(R_{1,T})\cos(\beta_2) \\ y_2 + f_2(R_{1,T})\sin(\beta_2) \\ \vdots \end{bmatrix};
\quad
\mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \end{bmatrix}.
\]

The LS solution for $\theta$ is then

\[ \hat{\theta} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b}. \tag{3.13} \]

Since this intermediate solution involves the unknown $R_{1,T}$, we have to use
the relation in (3.10) to find the positive root of the quadratic equation and then
substitute $R_{1,T}$ back into (3.13) for a final solution $\hat{\theta}$.

• Amplitude. Amplitude-based wireless location technology is mainly used in
indoor environments where WLAN standards such as 802.11a and 802.11g have
been widely adopted. WLAN connectivity has also become a standard
feature of laptop computers and PDAs. As such, there is increasing interest
in using these networks for location purposes to provide good coverage
in indoor scenarios. The 802.11b and 802.11g MAC layers provide information about
the signal strength and the signal-to-noise ratio. Hence, a software-level
location technique can be developed for WLANs based on the
amplitude of arrival of the MS signal at different access points [67, 68, 69].
Specifically, when an IEEE 802.11 network operates in infrastructure mode,
there are several APs (access points) and many end users within the network.
RF-based systems that use the signal strength for location purposes can monitor
the received signal strength from different APs and use the obtained statistics
to build a conditional probability distribution network in order to estimate the
location of the mobile client. These schemes usually work in two stages: an
offline training and data gathering phase, followed by a
location determination phase that uses online signal strength measurements.
In the training phase, signal strength measurements are used to build an a priori
probability distribution of the received signal strength at the mobile user from
all APs. Assume there are $N_a$ APs in the system and the radio map is created
from measurements at $N_u$ user locations, as illustrated in Figure 3.4.
The radio map model is described in [67]. Define $p(A_i \mid x_j, y_j)$ as the probability
density function of the received signal strength from the $i$-th AP at the $j$-th
measurement point $(x_j, y_j)$. After constructing a Bayesian network, the online
determination phase uses maximum likelihood estimation to locate the mobile
user. Assume that the mobile user measures the received signal strengths
from all APs, say $A_i$, $i = 1, 2, \ldots, N_a$. Then by Bayes’ rule, the probability of

having the mobile user at location $(x_j, y_j)$ given the received signal strengths
$\mathbf{A} = [A_1, \ldots, A_{N_a}]^T$ from all APs is given by

\[
\begin{aligned}
p(x_j, y_j \mid \mathbf{A}) &= \frac{p(\mathbf{A} \mid x_j, y_j)\, p(x_j, y_j)}{p(\mathbf{A})} \\
&= \frac{p(x_j, y_j) \prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)}{p(\mathbf{A})},
\end{aligned}
\tag{3.14}
\]

where $\prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)$, $1 \le j \le N_u$, is the approximation of the conditional
probability density function of the received signal strength when the location of
the mobile is given. Thus the location of the mobile user can be estimated as

\[ (\hat{x}_T, \hat{y}_T) = \arg\max_{(x_j, y_j)} p(x_j, y_j \mid \mathbf{A}), \quad 1 \le j \le N_u. \tag{3.15} \]

[Figure: radio map with access points AP1 to AP7 and ten measurement points, each labeled with its density p(Ai | xj, yj), j = 1, ..., 10.]

Figure 3.4: Magnitude-based data fusion in WLAN networks
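As an illustration only, the online phase in (3.14) and (3.15) can be sketched in a few lines of Python. The sketch assumes that each density p(Ai | xj, yj) is Gaussian with a mean learned offline and a common standard deviation; this Gaussian model, the function name `locate_by_fingerprint`, and the small radio map are assumptions made for the example and are not taken from [67].

```python
import numpy as np

def locate_by_fingerprint(rss, grid_xy, mean_rss, sigma_db=4.0, prior=None):
    """Pick the radio-map point maximizing p(x_j, y_j | A), cf. (3.14)-(3.15).

    rss      : (Na,) online RSS vector A measured from the Na APs
    grid_xy  : (Nu, 2) training locations (x_j, y_j)
    mean_rss : (Nu, Na) mean RSS learned offline at each location
    sigma_db : assumed common standard deviation of the Gaussian RSS model
    prior    : (Nu,) prior p(x_j, y_j); uniform if omitted
    """
    rss = np.asarray(rss, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    mean_rss = np.asarray(mean_rss, dtype=float)
    n_u = mean_rss.shape[0]
    if prior is None:
        prior = np.full(n_u, 1.0 / n_u)
    # log of prod_i p(A_i | x_j, y_j) for independent Gaussian models;
    # terms that are identical for every j are dropped.
    log_lik = -0.5 * np.sum(((rss - mean_rss) / sigma_db) ** 2, axis=1)
    log_post = log_lik + np.log(prior)
    # Dividing by p(A) amounts to subtracting the log-sum-exp over the grid.
    log_post -= np.logaddexp.reduce(log_post)
    j = int(np.argmax(log_post))
    return grid_xy[j], np.exp(log_post)

# Hypothetical radio map: Nu = 3 training points, Na = 2 APs (values in dBm).
grid = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
means = np.array([[-40.0, -60.0], [-60.0, -40.0], [-55.0, -55.0]])
xy_hat, posterior = locate_by_fingerprint([-58.0, -42.0], grid, means)
print(xy_hat, posterior)
```

Working with log-probabilities avoids underflow when the number of APs is large; the argmax is unchanged because the logarithm is monotonic.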

We note that the location problem has been tackled above by the LS approach;
see [53] for more details. However, several problems remain. The first is that
the physical meaning of these LS solutions is unclear, because statistical
information on the measurements of the TOAs, TDOAs, AOAs and
amplitudes is lacking, as is an analysis of the impact of transforming the nonlinear estimation for wireless
location into a quasi-linear one. This problem will be investigated in this thesis
for location based on TDOAs and AOAs. The second is the nuisance variable
$R_{k,T}$, the distance from the $k$-th BS to the MS, which is unknown. Although
a root-solving method can be used, it works only if the measurement
data are noise-free; with noisy data, often no positive real root exists. We will convert it into a constrained LS
problem and provide a solution algorithm in this thesis. The final problem is location
using more than one type of measurement. Because of the timing difficulty and the lack
of training, we will consider only measurements of TDOAs and AOAs for wireless
location.
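To make the root-solving step concrete, the TDOA fusion of (3.8) to (3.10) can be sketched as follows. Writing the intermediate solution (3.9) as $\hat{\theta} = R_{1,T}\mathbf{u} + \mathbf{v}$, the constraint (3.10) becomes a scalar quadratic in $R_{1,T}$. The function name `tdoa_ls`, the base station layout and the MS position are hypothetical, and the data are noise-free, which is the case in which the method is expected to work.

```python
import numpy as np

C_LIGHT = 3.0e8  # speed of light in m/s, the value used in the text

def tdoa_ls(bs_xy, dt):
    """LS TDOA fusion following (3.8)-(3.10).

    bs_xy : (Nb, 2) BS coordinates, with BS1 at the origin in row 0
    dt    : (Nb-1,) TDOAs Delta t_{k,1} for k = 2..Nb, in seconds
    Returns the candidate locations, one per positive real root R_{1,T}.
    """
    bs_xy = np.asarray(bs_xy, dtype=float)
    A = bs_xy[1:]                               # rows [x_k, y_k]
    dR = C_LIGHT * np.asarray(dt, dtype=float)  # Delta R_{k,1}
    Rk2 = np.sum(A ** 2, axis=1)                # R_k^2
    c_vec = -dR                                 # vector c of (3.8)
    d_vec = 0.5 * (Rk2 - dR ** 2)               # vector d of (3.8)
    # pinv(A) equals (A^T A)^{-1} A^T when A has full column rank.
    P = np.linalg.pinv(A)
    u, v = P @ c_vec, P @ d_vec                 # theta_hat = R * u + v
    # (3.10): ||R u + v||^2 = R^2  ->  (u.u - 1) R^2 + 2 (u.v) R + v.v = 0
    roots = np.roots([u @ u - 1.0, 2.0 * (u @ v), v @ v])
    candidates = []
    for r in roots:
        if abs(r.imag) < 1e-9 * max(1.0, abs(r)) and r.real > 0:
            candidates.append(r.real * u + v)
    return candidates

# Hypothetical layout: four BSs on a 3 km square, MS at (1000, 500) m.
bs = np.array([[0.0, 0.0], [3000.0, 0.0], [0.0, 3000.0], [3000.0, 3000.0]])
ms_true = np.array([1000.0, 500.0])
ranges = np.linalg.norm(ms_true - bs, axis=1)
dt = (ranges[1:] - ranges[0]) / C_LIGHT
candidates = tdoa_ls(bs, dt)
print(candidates)
```

With noisy TDOAs the quadratic may have no positive real root, in which case the function returns an empty list; this is the failure mode that motivates the constrained LS formulation.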

3.2 Least-square Location based on TDOA/AOA
Estimates
3.2.1 Mathematical Preparations

Simply speaking, an estimation problem is to guess what you do not know based on
what is given to you. Mathematically, it is to estimate
unknown parameters from observation data according to some criterion that
leads to an optimal estimator. The observation data are usually a function of the
unknown parameters, either linear or nonlinear. For simplicity, let us
begin with a generic linear model:

Z = Hθ + V . (3.16)

In this model, Z, of size N × 1, is called the measurement vector; θ, of size n × 1, is
called the parameter vector; H, of size N × n, is called the observation matrix; and V,
of the same size as Z, is called the measurement noise vector. Because V is random, Z is
random too. Both H and θ can be either deterministic or random, depending
on the specific application. Because of their simplicity, linear models are widely used
in practice. Even for nonlinear models, quasi-linear models that approximate
them are often pursued, as in this thesis.

A question follows from the linear model above: “How can we obtain the best estimate
of θ if we only know Z?” This can be viewed as having performed N
independent experiments in order to estimate θ, which is composed of n unknown
elements {θ1, θ2, . . . , θn}, where n < N. Inevitably, the experimental data are corrupted
by noise, which is usually assumed to be additive and Gaussian. To answer the
question, three criteria are commonly used to seek the best estimate of
θ in the field of statistical signal processing: weighted least-squares estimation
(WLSE), minimum mean-square error estimation (MMSE) and maximum-likelihood
estimation (MLE).

1. WLSE: It is the simplest method with the oldest history. The best estimate $\hat{\theta}$
can be obtained by minimizing the cost function

\[ J[\hat{\theta}] = [\mathbf{Z} - \mathbf{H}\hat{\theta}]^T \mathbf{W} [\mathbf{Z} - \mathbf{H}\hat{\theta}], \tag{3.17} \]

where $\mathbf{W} = \mathbf{W}^T > 0$ is the weighting matrix.

2. MMSE: The optimal estimate minimizes the error variance. Given the measurements
$\{z(i)\}_{i=1}^{N}$, we shall determine an estimate of $\theta$,

\[ \hat{\theta} = f[z(1), z(2), \ldots, z(N)], \tag{3.18} \]

such that the mean squared error

\[ J[\hat{\theta}] = E\{[\theta - \hat{\theta}]^T [\theta - \hat{\theta}]\} \tag{3.19} \]

is minimized.

3. MLE: It aims to maximize the likelihood function. Suppose that the measurement
data $\{z(i)\}_{i=1}^{N}$ are jointly distributed with a density function $p(\mathbf{Z}; \hat{\theta})$. The
optimal estimate is given by

\[ \hat{\theta}_{\text{opt}} = \arg\max_{\hat{\theta}} p(\mathbf{Z}; \hat{\theta}). \tag{3.20} \]

It is usually a nonlinear estimator since the likelihood function $p(\mathbf{Z}; \hat{\theta})$ is nonlinear
with respect to $\hat{\theta}$. Hence the computational load could be high.
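Of the three criteria, the WLSE has a closed form for the linear model (3.16): setting the gradient of (3.17) to zero gives the normal equations $\mathbf{H}^T\mathbf{W}\mathbf{H}\hat{\theta} = \mathbf{H}^T\mathbf{W}\mathbf{Z}$. The sketch below solves them numerically. The function name `wls_estimate`, the simulated data, and the choice of inverse noise variances as weights are assumptions made for illustration.

```python
import numpy as np

def wls_estimate(Z, H, W=None):
    """Minimize (Z - H theta)^T W (Z - H theta), i.e. the cost (3.17)."""
    Z = np.asarray(Z, dtype=float)
    H = np.asarray(H, dtype=float)
    if W is None:
        W = np.eye(Z.shape[0])   # ordinary (unweighted) least squares
    HtW = H.T @ W
    # Solve the normal equations H^T W H theta = H^T W Z.
    return np.linalg.solve(HtW @ H, HtW @ Z)

# Simulated example: N = 50 measurements, n = 2 parameters, and noise whose
# standard deviation differs from measurement to measurement.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
H = rng.normal(size=(50, 2))
sigma = np.linspace(0.1, 2.0, 50)
Z = H @ theta_true + sigma * rng.normal(size=50)

theta_ls = wls_estimate(Z, H)                              # W = I
theta_wls = wls_estimate(Z, H, W=np.diag(1.0 / sigma**2))  # inverse-variance weights
print(theta_ls, theta_wls)
```

Weighting each measurement by the inverse of its noise variance is a common choice because it discounts the less reliable measurements; with W equal to the identity the estimator reduces to ordinary least squares.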

How, then, do we know whether the result obtained from one particular method
is good, or why it is better than the others? To answer this
question, we must use the fact that all estimators are transformations
of random data; hence the estimate itself is random, and its properties must be
studied from a statistical viewpoint. In this section, we introduce some fundamental
concepts such as unbiased and efficient estimators, the Cramer-Rao bound and the
Fisher information matrix [72].

Definition 3.1 (Unbiasedness [72]) Suppose that the parameter vector $\theta$ is deterministic.
An estimator $\hat{\theta}$ is unbiased if $E\{\hat{\theta}\} = \theta$.

The mean of an unbiased estimate equals the true parameter
vector. Hence the average of the estimates obtained from an increasing number of independent
observation sets converges to the true parameter. However, unbiasedness alone is not adequate. We
must also study the dispersion about the mean, i.e., the variance of the estimator. Ideally,
we would like our estimator to be unbiased and to have the smallest possible error
variance.

Definition 3.2 (Efficiency [72]) An unbiased estimator $\hat{\theta}$ of the vector $\theta$ is said to be
more efficient than any other unbiased estimator $\tilde{\theta}$ of $\theta$ if

\[ E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} \le E\{[\theta - \tilde{\theta}][\theta - \tilde{\theta}]^T\}. \tag{3.21} \]

A more efficient estimator has the smallest error covariance among all the unbiased
estimators of $\theta$, “smallest” in the sense that $E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} - E\{[\theta - \tilde{\theta}][\theta - \tilde{\theta}]^T\}$
is negative semidefinite. Normally it is impractical to compare every
pair of unbiased estimators. Instead, a lower bound on
the error variance achievable by any unbiased estimator, called the CRB (Cramer-Rao bound), exists, and the
efficiency of an unbiased estimator can be measured by how close it comes to the
CRB. The following theorem presents the CRB.

Theorem 3.1 (Cramer-Rao Bound [72]) Let $\mathbf{Z}$ denote a set of $N$ observation data,
i.e., $\mathbf{Z} = [z(1), z(2), \ldots, z(N)]^T$, which is characterized by the probability density function
$p(\mathbf{Z}; \theta) = p(\mathbf{Z})$. If $\hat{\theta}$ is an unbiased estimate of the deterministic $\theta$, then the error
covariance matrix $E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\}$ is bounded from below by

\[ E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} \ge \mathbf{J}^{-1}, \tag{3.22} \]

where $\mathbf{J}$ is the Fisher information matrix, defined by

\[ \mathbf{J} = E\left\{ \left[\frac{\partial}{\partial\theta} \ln p(\mathbf{Z})\right] \left[\frac{\partial}{\partial\theta} \ln p(\mathbf{Z})\right]^T \right\}, \tag{3.23} \]

which can equivalently be expressed as

\[ \mathbf{J} = -E\left\{ \frac{\partial^2}{\partial\theta^2} \ln p(\mathbf{Z}) \right\}. \tag{3.24} \]

Note that, for the theorem to be applicable, the vector derivatives in (3.23) must
exist and the norm of $\partial p(\mathbf{Z})/\partial\theta$ must be absolutely integrable. Clearly, to compute
the Cramer-Rao lower bound, we need to know the probability density function $p(\mathbf{Z})$.
Often the exact $p(\mathbf{Z})$ is not available, in which case we cannot evaluate
this bound. However, in the case of a normal distribution, i.e.,

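\[ p(\mathbf{Z}; \theta) = \frac{1}{(2\pi)^{N/2} |\mathbf{C}|^{1/2}} \exp\left\{ -\frac{1}{2} [\mathbf{Z} - \mu]^T \mathbf{C}^{-1} [\mathbf{Z} - \mu] \right\}, \tag{3.25} \]

where $\mu$ and $\mathbf{C}$ are, respectively, the mean and the covariance matrix of $\mathbf{Z}$, we
can compute the Fisher information matrix, and hence the Cramer-Rao bound, for the Gaussian data distribution
by the Slepian-Bangs formula [74]

\[ [\mathbf{J}]_{ij} = \frac{1}{2} \operatorname{tr}\left( \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial\theta_i} \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial\theta_j} \right) + \left(\frac{\partial \mu}{\partial\theta_i}\right)^T \mathbf{C}^{-1} \frac{\partial \mu}{\partial\theta_j}. \tag{3.26} \]

Because of the central limit theorem, the Gaussian distribution holds approximately in
applications such as location estimation.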

3.2.2 Location based on TDOA

In this section, we investigate location estimation algorithms based on measurements
of TDOA and AOA. For simplicity, we assume that the mobile users travel
at low speed and can be treated approximately as stationary targets. Hence we do
not consider the estimation of the velocity of mobile users. We use all the
available measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$ (TDOA data) and $\{\beta_k\}_{k=1}^{N_b}$ (AOA data), where $N_b$
is the total number of base stations, to determine the location of the mobile user or
target, i.e., $(x_T, y_T)$. We consider only two-dimensional localization,
which is adequate if the terrain elevation is known a priori or can be neglected
compared to the heights of the antenna towers.

We start with stationary target estimation based on TDOA measurements.
As defined in Section 3.1.3,

\[
\begin{aligned}
R_{k,T} &= \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2}, \\
\Delta t_{k,1} &= (R_{k,T} - R_{1,T})/c \\
&= \left( \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} - \sqrt{x_T^2 + y_T^2} \right)/c.
\end{aligned}
\tag{3.27}
\]

Besides the measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$, the locations of all the base stations $\{(x_k, y_k)\}_{k=1}^{N_b}$
are also assumed to be known. Clearly $\Delta t_{k,1}$ is a nonlinear function of the unknown
$(x_T, y_T)$, i.e., $\Delta t_{k,1}(x_T, y_T)$. For brevity of notation, the argument $(x_T, y_T)$ is omitted
from the TDOAs unless it is needed for clarification.
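The forward model (3.27) is straightforward to evaluate numerically. The following sketch, with the hypothetical function name `tdoa_model` and a made-up two-BS geometry, returns the noise-free TDOAs for a given target position.

```python
import numpy as np

def tdoa_model(bs_xy, ms_xy, c=3.0e8):
    """Noise-free TDOAs Delta t_{k,1}, k = 2..Nb, as defined in (3.27).

    bs_xy : (Nb, 2) BS coordinates, with BS1 (the reference) in row 0
    ms_xy : (2,) target position (x_T, y_T)
    """
    bs = np.asarray(bs_xy, dtype=float)
    ranges = np.linalg.norm(np.asarray(ms_xy, dtype=float) - bs, axis=1)  # R_{k,T}
    return (ranges[1:] - ranges[0]) / c

# BS1 at the origin, BS2 at (3000, 0) m, target at (0, 4000) m:
# R_{1,T} = 4000 m and R_{2,T} = 5000 m, so the TDOA is 1000 m / c, about 3.33 microseconds.
dt = tdoa_model([[0.0, 0.0], [3000.0, 0.0]], [0.0, 4000.0])
print(dt)
```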

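For all the TDOA measurements $\{\Delta\hat{t}_{2,1}, \Delta\hat{t}_{3,1}, \ldots, \Delta\hat{t}_{N_b,1}\}$, it is unavoidable that
measurement noise is embedded in the data. Therefore, the measurement
data are described by

\[ \Delta\hat{t}_{k,1} = \Delta t_{k,1} + \delta t_k, \tag{3.28} \]

where $\{\delta t_k\}_{k=2}^{N_b}$ are assumed to be i.i.d. (independent and identically distributed) Gaussian
random variables with zero mean and variance $\sigma_t^2$. This is an important but fair
assumption, given that all the BSs are well synchronized and a large deviation from the mean is
unlikely. Since $\delta t_k$ is a Gaussian random
variable, so is $\Delta\hat{t}_{k,1}$. Based on the above assumption, we can define the $(N_b - 1) \times 1$
multivariate Gaussian random vector $\Delta\hat{\mathbf{t}}$ and the associated mean $\mathbf{m}_{\Delta\hat{t}}$ and
covariance matrix $\mathbf{M}_{\Delta\hat{t}}$ respectively as

\[
\Delta\hat{\mathbf{t}} = \begin{bmatrix} \Delta\hat{t}_{2,1} \\ \vdots \\ \Delta\hat{t}_{N_b,1} \end{bmatrix};
\quad
\mathbf{m}_{\Delta\hat{t}} = \begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix};
\quad
\mathbf{M}_{\Delta\hat{t}} = \sigma_t^2 \mathbf{I}_{N_b - 1}.
\tag{3.29}
\]

As shown in [71], the joint PDF of $\Delta\hat{\mathbf{t}}$ is given by

\[
\begin{aligned}
p(\Delta\hat{\mathbf{t}}) &= \frac{1}{(\sqrt{2\pi})^{N_b - 1} \sqrt{\det \mathbf{M}_{\Delta\hat{t}}}} \exp\left[ -\frac{1}{2} (\Delta\hat{\mathbf{t}} - \mathbf{m}_{\Delta\hat{t}})^T \mathbf{M}_{\Delta\hat{t}}^{-1} (\Delta\hat{\mathbf{t}} - \mathbf{m}_{\Delta\hat{t}}) \right] \\
&= \frac{1}{(\sqrt{2\pi})^{N_b - 1} \sigma_t^{N_b - 1}} \exp\left[ -\sum_{k=2}^{N_b} \frac{(\Delta\hat{t}_{k,1} - \Delta t_{k,1}(x_T, y_T))^2}{2\sigma_t^2} \right].
\end{aligned}
\tag{3.30}
\]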

This joint Gaussian PDF completely describes the statistical characteristics of the
measurement data and depends on the two unknowns $x_T$ and $y_T$. For a
fixed set of measurements, we seek the pair $(x_T, y_T)$ under which the
data set is most likely to occur. In light of estimation theory, the maximum-likelihood
(ML) method can be used to estimate the target location $(x_T, y_T)$.
Before providing the ML estimator, we would like to
compute the Fisher information matrix and the Cramer-Rao bound of Theorem 3.1 so that we
know how accurate the estimation can be. The Cramer-Rao bound is a benchmark
for evaluating different types of unbiased estimators. Let $\mathbf{P}$ and $\mathbf{J}_{\text{FIM}}$ denote the
estimation error covariance matrix and the Fisher information matrix. It holds for
any unbiased estimator [72] that

\[ \mathbf{P} \ge \mathbf{J}_{\text{FIM}}^{-1}. \tag{3.31} \]

According to the Slepian-Bangs formula, the Fisher information matrix based on
(3.30) can be calculated by

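\[
\mathbf{J}_{\text{tdoa}} = \left[ \frac{1}{2} \operatorname{tr}\left\{ \mathbf{M}_{\Delta\hat{t}}^{-1} \frac{\partial \mathbf{M}_{\Delta\hat{t}}}{\partial\chi_i} \mathbf{M}_{\Delta\hat{t}}^{-1} \frac{\partial \mathbf{M}_{\Delta\hat{t}}}{\partial\chi_j} \right\} + \left(\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_i}\right)^T \mathbf{M}_{\Delta\hat{t}}^{-1} \left(\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_j}\right) \right]_{i,j=1,1}^{2,2},
\tag{3.32}
\]

where $\chi_1 = x_T$ and $\chi_2 = y_T$. Since $\mathbf{M}_{\Delta\hat{t}} = \sigma_t^2 \mathbf{I}_{N_b-1}$ depends only on $\sigma_t^2$, the first
term in (3.32) is zero. By direct calculation, we have

\[
\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_1} = \frac{1}{c}
\begin{bmatrix}
\frac{x_T - x_2}{R_{2,T}} - \frac{x_T}{R_{1,T}} \\
\frac{x_T - x_3}{R_{3,T}} - \frac{x_T}{R_{1,T}} \\
\vdots \\
\frac{x_T - x_{N_b}}{R_{N_b,T}} - \frac{x_T}{R_{1,T}}
\end{bmatrix};
\quad
\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_2} = \frac{1}{c}
\begin{bmatrix}
\frac{y_T - y_2}{R_{2,T}} - \frac{y_T}{R_{1,T}} \\
\frac{y_T - y_3}{R_{3,T}} - \frac{y_T}{R_{1,T}} \\
\vdots \\
\frac{y_T - y_{N_b}}{R_{N_b,T}} - \frac{y_T}{R_{1,T}}
\end{bmatrix};
\quad
\mathbf{M}_{\Delta\hat{t}}^{-1} = \frac{1}{\sigma_t^2} \mathbf{I}_{N_b-1}.
\tag{3.33}
\]

Then $\mathbf{J}_{\text{tdoa}}$ is obtained by substituting (3.33) into (3.32):

\[
\begin{aligned}
\mathbf{J}_{\text{tdoa}} &= \left[ \left(\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_i}\right)^T \mathbf{M}_{\Delta\hat{t}}^{-1} \left(\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial\chi_j}\right) \right]_{i,j=1,1}^{2,2} \\
&= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2}
\begin{bmatrix} \frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}} \\ \frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}} \end{bmatrix}
\begin{bmatrix} \frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}}, & \frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}} \end{bmatrix} \\
&= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2}
\begin{bmatrix} \cos(\beta_k) - \cos(\beta_1) \\ \sin(\beta_k) - \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_k) - \cos(\beta_1), & \sin(\beta_k) - \sin(\beta_1) \end{bmatrix},
\end{aligned}
\tag{3.34}
\]

where $\{\beta_k\}_{k=1}^{N_b}$ are the angles shown in Figure 3.3, with $\tan(\beta_k) = (y_T - y_k)/(x_T - x_k)$. The
inverse of the Fisher information matrix $\mathbf{J}_{\text{tdoa}}$ is a lower
bound on the estimation error covariance of all unbiased estimators.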
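For a given geometry, (3.34) and its inverse can be evaluated directly. The sketch below is a minimal illustration; the function name `tdoa_crb`, the square BS layout and the 10 ns timing noise are assumptions chosen for the example.

```python
import numpy as np

def tdoa_crb(bs_xy, ms_xy, sigma_t, c=3.0e8):
    """Fisher information matrix (3.34) and its inverse, the CRB.

    bs_xy   : (Nb, 2) BS coordinates, with BS1 at the origin in row 0
    ms_xy   : (2,) target position (x_T, y_T), not coinciding with any BS
    sigma_t : standard deviation of the TDOA noise in seconds
    """
    bs = np.asarray(bs_xy, dtype=float)
    diff = np.asarray(ms_xy, dtype=float) - bs   # rows (x_T - x_k, y_T - y_k)
    R = np.linalg.norm(diff, axis=1)             # R_{k,T}
    unit = diff / R[:, None]                     # rows (cos beta_k, sin beta_k)
    G = unit[1:] - unit[0]                       # rows of the outer products in (3.34)
    J = G.T @ G / (c * sigma_t) ** 2
    return J, np.linalg.inv(J)

# Hypothetical layout: four BSs on a 3 km square, MS at (1000, 500) m, 10 ns noise.
bs = np.array([[0.0, 0.0], [3000.0, 0.0], [0.0, 3000.0], [3000.0, 3000.0]])
J, crb = tdoa_crb(bs, [1000.0, 500.0], sigma_t=10e-9)
rmse_bound = np.sqrt(np.trace(crb))   # lower bound on the position RMSE in meters
print(rmse_bound)
```

The bound scales with $(c\sigma_t)^2$, so doubling the timing noise quadruples the CRB; the geometry enters only through the differences of the unit vectors pointing from each BS to the target.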

In terms of large-sample properties, the ML estimate approaches the Cramer-Rao
bound asymptotically, i.e., as the number of measurements goes to infinity. From
(3.30), the ML location estimator seeks the $(x_T, y_T)$ that minimizes the negative
log-likelihood function, which takes the form

\[ L_{\Delta t}(x_T, y_T) = \sum_{k=2}^{N_b} \left[ c\,\Delta\hat{t}_{k,1} - \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} + \sqrt{x_T^2 + y_T^2} \right]^2. \tag{3.35} \]

This is obtained by using the fact that $e^{-x}$ is a monotonically decreasing function and that
scaling by the constant $c^2\sigma_t^2$ does not change the location of the optimum. There are two
unknowns in $L_{\Delta t}(x_T, y_T)$, namely $x_T$ and $y_T$. Differentiating $L_{\Delta t}(x_T, y_T)$ with respect
to each and equating the resulting partial derivatives to zero gives the following
necessary condition for the optimal solution $(x_T^*, y_T^*)$:

\[
\sum_{k=2}^{N_b}
\begin{bmatrix}
\frac{x_k - x_T^*}{R_{k,T}^*} + \frac{x_T^*}{R_{1,T}^*} \\
\frac{y_k - y_T^*}{R_{k,T}^*} + \frac{y_T^*}{R_{1,T}^*}
\end{bmatrix}
\left( c\,\Delta\hat{t}_{k,1} - R_{k,T}^* + R_{1,T}^* \right)
= \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\tag{3.36}
\]

where $R_{k,T}^*$ denotes $R_{k,T}$ evaluated at $(x_T^*, y_T^*)$.
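As an illustration of how (3.35) can be minimized numerically, the sketch below uses a plain Gauss-Newton iteration. This is one common Newton-type choice, shown here only as an example and not as the method developed in this thesis; the function name `tdoa_ml_gauss_newton`, the geometry and the initial guess are hypothetical, and no step-size control is included, so convergence depends on a reasonable starting point.

```python
import numpy as np

def tdoa_ml_gauss_newton(bs_xy, dt_hat, xy0, c=3.0e8, iters=20):
    """Minimize the cost (3.35) by Gauss-Newton iterations.

    bs_xy  : (Nb, 2) BS coordinates, with BS1 at the origin in row 0
    dt_hat : (Nb-1,) measured TDOAs in seconds
    xy0    : (2,) initial guess for (x_T, y_T)
    """
    bs = np.asarray(bs_xy, dtype=float)
    meas = c * np.asarray(dt_hat, dtype=float)   # c * measured TDOAs
    xy = np.asarray(xy0, dtype=float).copy()
    for _ in range(iters):
        diff = xy - bs
        R = np.linalg.norm(diff, axis=1)         # R_{k,T} at the current point
        unit = diff / R[:, None]
        resid = meas - (R[1:] - R[0])            # bracketed terms of (3.35)
        jac = unit[1:] - unit[0]                 # d(R_{k,T} - R_{1,T}) / d(x_T, y_T)
        # Linearized LS problem: jac @ step ~= resid.
        step, *_ = np.linalg.lstsq(jac, resid, rcond=None)
        xy = xy + step
        if np.linalg.norm(step) < 1e-9:
            break
    return xy

# Hypothetical check with noise-free TDOAs: four BSs on a 3 km square.
bs = np.array([[0.0, 0.0], [3000.0, 0.0], [0.0, 3000.0], [3000.0, 3000.0]])
ms_true = np.array([1000.0, 500.0])
ranges = np.linalg.norm(ms_true - bs, axis=1)
dt_hat = (ranges[1:] - ranges[0]) / 3.0e8
xy_ml = tdoa_ml_gauss_newton(bs, dt_hat, xy0=[1200.0, 700.0])
print(xy_ml)
```

Because the cost is not convex, a poor initial guess may lead to a local minimum, which is why a good starting point matters for iterations of this kind.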

The ML estimator is well studied and widely used in practice, especially in
applications that require high estimation accuracy and can afford the computational complexity
with commercially available hardware and software. It has several
statistical properties, holding asymptotically for large samples, that make it attractive in applications:

• It is unbiased: the expectation of the estimate is equal to the true value;

• It is the most efficient: it achieves the minimum error variance;

• It is consistent: it converges to the true value in probability.

Hence it is natural to apply ML to our estimation problem for the highest possible
accuracy of localization. However, solving for the optimal solution $(x_T^*, y_T^*)$ from
(3.35) and (3.36) is not easy and involves nonlinear procedures such as Newton-type
algorithms, which are not discussed in this dissertation. The maximization of the
likelihood function can be done by hand only for some PDFs, and even commercial
software is not guaranteed to reach the ML solution because of the possible existence
of local minima. In this thesis we take a quasi-linear approach as in [54]
to convert the nonlinear optimization problem into a linear one, leading to an LS-type
problem with a simpler solution algorithm. Alternatively, the LS-type
solution can serve as the initial candidate in a Newton-type iterative algorithm to
speed up convergence to the true ML solution $(x_T^*, y_T^*)$. To
bypass the difficulty and complexity of the original ML estimator, we notice that
the second equation in (3.27) leads to

\[ (x_T - x_k)^2 + (y_T - y_k)^2 = \left( \sqrt{x_T^2 + y_T^2} + c\,\Delta t_{k,1} \right)^2. \tag{3.37} \]

By expanding and rearranging the terms, the above can be written as

\[ \frac{1}{c^2} R_k^2 = \frac{2}{c^2} \begin{bmatrix} x_k & y_k \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \Delta t_{k,1}^2 + \frac{2}{c} R_{1,T}\, \Delta t_{k,1}. \tag{3.38} \]

Packing all the equations in (3.38) for $k = 2, 3, \ldots, N_b$ yields

\[
\frac{1}{c^2}
\begin{bmatrix} R_2^2 \\ \vdots \\ R_{N_b}^2 \end{bmatrix}
= \frac{2}{c^2}
\begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \begin{bmatrix} \Delta t_{2,1}^2 \\ \vdots \\ \Delta t_{N_b,1}^2 \end{bmatrix}
+ \frac{2}{c}
\begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix} R_{1,T}.
\tag{3.39}
\]

If we had perfect TDOA information, the target $(x_T, y_T)$ could be uniquely located
from any 2 of the $N_b - 1$ sets of data, since the problem is over-determined.
To estimate the target location $(x_T, y_T)$ from (3.39), however, we have to replace the
perfect time difference $\Delta t_{k,1}$ with the available TDOA measurement $\Delta\hat{t}_{k,1}$. Since
$\Delta\hat{t}_{k,1} = \Delta t_{k,1} + \delta t_k$, this introduces the following noise vector:

\[
\begin{bmatrix} \eta_2 \\ \vdots \\ \eta_{N_b} \end{bmatrix}
= -\frac{2}{c} \begin{bmatrix} \delta t_2 \\ \vdots \\ \delta t_{N_b} \end{bmatrix} R_{1,T}
- 2 \begin{bmatrix} \Delta\hat{t}_{2,1}\, \delta t_2 \\ \vdots \\ \Delta\hat{t}_{N_b,1}\, \delta t_{N_b} \end{bmatrix}
+ \begin{bmatrix} \delta t_2^2 \\ \vdots \\ \delta t_{N_b}^2 \end{bmatrix}.
\tag{3.40}
\]

Each element of the noise vector is composed of the TDOA measurement noise $\delta t_k$
and the corresponding squared term. Taking expectations on both sides of (3.40), we
find that each element of the noise vector has mean $\sigma_t^2$. In an effort to obtain an
LS-type formulation, we define

\[ a_k = R_k^2/c^2 - \Delta\hat{t}_{k,1}^2 - \sigma_t^2, \qquad b_k = 2\Delta\hat{t}_{k,1}/c. \tag{3.41} \]

We can regard $\{a_k\}$ and $\{b_k\}$ as pseudo-measurements that lead to a constrained
linear model:

\[
\begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix}
- \begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix} R_{1,T}
= \frac{2}{c^2}
\begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix},
\tag{3.42}
\]

where the constraint is $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. It is worth noting that the composite noise terms

{ηk }Nb are not Gaussian random variables or to say, not in normal distribution. But
k=2

if {ηk }Nb are Gaussian then the ML algorithm for location estimation is equivalent
k=2

to a weighted LS problem involving a constraint. As stated in Corollary 11-1 of [72],


ML, LS and BLUE (Best Linear Unbiased Estimator) algorithms are all equivalent for a generic linear model with additive Gaussian noise term. By defining

\[
\mathbf{a} = \begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix}; \quad
\mathbf{b} = \begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix}; \quad
\mathbf{H}_1 = \frac{2}{c^2} \begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}; \quad
\boldsymbol{\eta}_1 = \begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix} \tag{3.43}
\]

we can rewrite (3.42) into a more compact quasi-linear form:

a − bR1,T = H1 θ + η1 . (3.44)

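The above expression is very similar to a generic linear model of the standard LS algorithm except that the pseudo-measurement vector $\mathbf{a} - \mathbf{b}R_{1,T}$ involves one unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. Fortunately we have an extra condition that helps to solve $R_{1,T}$. $\mathbf{H}_1$ is deterministic and $\boldsymbol{\eta}_1$ is a non-Gaussian vector whose elements all have zero mean. Let $\mathbf{W}_1(R_{1,T})$ be a diagonal matrix with elements $E\{|\eta_k - \sigma_t^2|^2\}$. Set

\[
J_1 = \frac{1}{2} \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right) \tag{3.45}
\]

as the objective function to be minimized. Then it is well known that the minimizer is the ML solution provided that the noise vector is Gaussian with $\mathbf{W}_1(R_{1,T})$ as the covariance matrix. The weighted LS solution can be easily obtained as

\[
\hat{\boldsymbol{\theta}} = \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1 \right)^{-1} \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} \right) = \boldsymbol{\Phi}_1(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} \right), \tag{3.46}
\]

where $\boldsymbol{\Phi}_1(R_{1,T}) = \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1 \right)^{-1} \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T})$. Here $\hat{\boldsymbol{\theta}}$ is an intermediate solution since $R_{1,T}$ is unknown. By taking norm square on both sides, it yields

\[
R_{1,T}^2 = \left\| \boldsymbol{\Phi}_1(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} \right) \right\|^2 . \tag{3.47}
\]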


Among the real, positive roots of this nonlinear equation, the one yielding the smallest $J_1$ is the optimal solution to the constrained LS problem. It is

commented that we convert the ML estimation problem into an LS-type estimation by

replacing the perfect TDOA information with measurement data and the equivalence

between the LS-type solution and ML estimator can be further established based on

the assumption that the composite noise vector is Gaussian. If the noise vector in

(3.40) is not exactly Gaussian, the constrained LS solution is not the ML solution

either. It seems that we overemphasized the simplicity that LS-type algorithm may

have and sacriﬁced the accuracy of estimation. However it is not too far away from

the true ML solution under some mild conditions as shown below.

Let $X$ be a Gaussian random variable with zero mean and variance $\sigma^2$. Then the high-order moments of $X$ are given by [73]

\[
E\{X^{2n}\} = 1 \times 3 \times 5 \times \cdots \times (2n-1)\,\sigma^{2n}; \qquad E\{X^{2n-1}\} = 0,
\]

where $n > 0$ is an integer. Let $Y = \alpha X + (X^2 - \sigma^2)$. Then $E\{Y\} = 0$ and

\[
\sigma_Y^2 = E\{Y^2\} = \alpha^2\sigma^2 - \sigma^4 + E\{X^4\} = \alpha^2\sigma^2 + 2\sigma^4 = \sigma^2(\alpha^2 + 2\sigma^2). \tag{3.48}
\]
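The variance in (3.48) can be cross-checked numerically. The sketch below is an illustration only: it recomputes $E\{Y^2\}$ from the even-moment formula and also estimates it by Monte Carlo. The function names, the sample size and the seed are assumptions introduced here.

```python
import random


def gaussian_even_moment(n, sigma):
    """E{X^(2n)} = 1*3*5*...*(2n-1) * sigma^(2n) for a zero-mean Gaussian X."""
    coeff = 1
    for odd in range(1, 2 * n, 2):
        coeff *= odd
    return coeff * sigma ** (2 * n)


def var_y_closed_form(alpha, sigma):
    """Right-hand side of (3.48)."""
    return sigma**2 * (alpha**2 + 2.0 * sigma**2)


def var_y_from_moments(alpha, sigma):
    """E{Y^2} expanded term by term; the odd moments of X vanish."""
    m2 = gaussian_even_moment(1, sigma)
    m4 = gaussian_even_moment(2, sigma)
    return alpha**2 * m2 + m4 - 2.0 * sigma**2 * m2 + sigma**4


def var_y_monte_carlo(alpha, sigma, n_samples=200_000, seed=0):
    """Sample estimate of E{Y^2} with Y = alpha*X + (X^2 - sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        y = alpha * x + (x * x - sigma**2)
        total += y * y
    return total / n_samples
```

The first two functions should agree exactly up to rounding, while the Monte Carlo estimate should only agree to within sampling error.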

Gaussian random variables (GRVs) have the convenient property that any linear combination of independent GRVs is again a GRV [73]. But we cannot conclude that $Y$ is a GRV since it includes the $X^2$ term. We are interested in under what condition $Y$ is close to a GRV. By noting that $Y = (X + \alpha/2)^2 - (\sigma^2 + \alpha^2/4)$, we have

\[
X = -\alpha/2 \pm \sqrt{Y + (\sigma^2 + \alpha^2/4)}, \qquad Y \ge -(\sigma^2 + \alpha^2/4). \tag{3.49}
\]

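Since $Y$ is a function of the GRV $X$, its PDF is thus given by

\[
p_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \left\{
\frac{\exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 \right]}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}}
+ \frac{\exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} + \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 \right]}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}}
\right\}, \quad y \ge -(\sigma^2 + \alpha^2/4). \tag{3.50}
\]

From the PDF's property, there holds $\int_{-(\sigma^2+\alpha^2/4)}^{\infty} p_Y(y)\,dy = 1$. Interestingly, the integral of the first term in $p_Y(y)$ is

\[
\begin{aligned}
I_Y &= \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2+\alpha^2/4)}^{\infty} \frac{\exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 \right]}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}}\, dy \\
&= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2+\alpha^2/4)}^{\infty} \exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 \right] d\left[ \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right] \\
&= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{\alpha/2}^{-\infty} e^{-\frac{z^2}{2\sigma^2}}\, dz \qquad \left( \text{let } z = \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right) \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\alpha/(2\sigma)}^{\infty} e^{-\frac{\tilde{z}^2}{2}}\, d\tilde{z} \qquad \left( \text{let } \tilde{z} = -\frac{z}{\sigma} \right) \\
&= 1 - Q\left( \frac{\alpha}{2\sigma} \right),
\end{aligned}
\]

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ is the Gaussian tail probability (Q-function). Hence it is concluded that if $\alpha/\sigma$ is sufficiently large, then $I_Y \approx 1$ and thus $p_Y(y)$ is dominated by the first term. Intuitively, it can be seen that the second term $(X^2 - \sigma^2)$ in $Y$ will fade out since its mean is zero and its variance $E\{(X^2 - \sigma^2)^2\} = 2\sigma^4$ is small relative to that of $\alpha X$ when $\alpha/\sigma$ is sufficiently large. It is also easy to see that $\sigma_Y^2$ is dominated by $\alpha^2\sigma^2$ based on the same assumption. Therefore, the random variable $Y = \alpha X + (X^2 - \sigma^2)$ behaves approximately like a normally distributed variable, provided that $\alpha/\sigma$ is sufficiently large. Translating this result to the random variables as in (3.40) with $\delta\tilde{t}_k = -\delta t_k$ leads to

\[
Y_k = \alpha_k\, \delta\tilde{t}_k + (\delta\tilde{t}_k^{\,2} - \sigma_t^2), \qquad \alpha_k = 2(R_{1,T} + c\,\Delta\hat{t}_{k,1})/c . \tag{3.51}
\]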

Then $\boldsymbol{\eta}_1 = [Y_2, Y_3, \ldots, Y_{N_b}]^T$ is approximately a normally distributed vector, as $\delta t_k$ is Gaussian with mean zero and variance $\sigma_t^2$. More precisely, $Y_k$ is close to Gaussian provided that $\alpha_k/\sigma_t = 2(R_{1,T} + c\,\Delta\hat{t}_{k,1})/(c\sigma_t)$ is sufficiently large for all $k \ge 2$. It is worth noting that

\[
\left( \frac{\alpha_k}{2\sigma_t} \right)^2 = \left( R_{1,T}/c + \Delta\hat{t}_{k,1} \right)^2 / \sigma_t^2 . \tag{3.52}
\]

The right-hand side of the above equation indicates an approximation to the SNR, since its numerator represents the recorded signal of the traveling time from the target to the $k$-th BS and its denominator, $\sigma_t^2$, is the noise variance. If $\alpha_k/\sigma_t$ is sufficiently large, the variance of $Y_k$ is, by (3.48),

\[
\sigma_{Y_k}^2 = E\{Y_k^2\} = \alpha_k^2\sigma_t^2 + 2\sigma_t^4 = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2) \tag{3.53}
\]

that is dominated by $\alpha_k^2\sigma_t^2$. It is emphasized that $\alpha_k = \alpha_k(R_{1,T})$ is a function of $R_{1,T}$.

Recall the question raised earlier: how far is the LS-type solution obtained in (3.46) and (3.47) from the true ML solution? A clear answer is that the LS-type algorithm approximates the ML solution well as long as $\alpha_k/\sigma_t$ is very large for $2 \le k \le N_b$. Therefore the properties of the ML algorithm hold approximately.
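To get a feel for how large $\alpha/\sigma$ has to be, the following sketch evaluates the first-term mass $I_Y = 1 - Q(\alpha/(2\sigma))$ and the share of the variance in (3.48) that comes from the quadratic term. It relies only on `math.erfc` from the Python standard library; the function names are illustrative and not taken from the original text.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), written with the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def first_term_mass(alpha, sigma):
    """I_Y = 1 - Q(alpha / (2*sigma)): probability mass carried by the first term of p_Y."""
    return 1.0 - q_function(alpha / (2.0 * sigma))


def quadratic_variance_share(alpha, sigma):
    """Fraction 2*sigma^4 / (sigma^2 * (alpha^2 + 2*sigma^2)) of Var(Y) due to X^2 - sigma^2."""
    return 2.0 * sigma**2 / (alpha**2 + 2.0 * sigma**2)
```

With $\alpha = 0$ the first term carries exactly half of the mass, whereas for $\alpha/\sigma = 10$ the mass is within $10^{-6}$ of one and the quadratic term contributes under 2% of the variance, which is consistent with the near-Gaussian behavior argued above.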

Before ending this subsection, we would like to compute the Cramér-Rao bound associated with the weighted LS solution by assuming that $\{Y_k\}$ are normally distributed, which holds true approximately under the condition discussed earlier. Recall that $\mathbf{W}_1(R_{1,T})$ in the weighted LS problem is the associated covariance matrix. Thus $E\{Y_k^2\}$ is its element and the joint probability density function (PDF) for the pseudo-measurement data $\{a_k\}$ and $\{b_k\}$ in (3.42) is

\[
\mathrm{PDF} = \frac{1}{\sqrt{(2\pi)^{n-1} \det[\mathbf{W}_1(R_{1,T})]}} \exp\left\{ -\frac{1}{2} \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right) \right\} . \tag{3.54}
\]

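Note that inside the exponent is exactly $J_1$ with a minus sign. The Fisher information matrix for the PDF in (3.54) can be computed by using the Slepian-Bangs formula in (3.32). Here we take the pseudo-measurement vector $\mathbf{a}$ as the data vector whose mean vector and covariance matrix are $\mathbf{m}_a = \mathbf{b}R_{1,T} + \mathbf{H}_1\boldsymbol{\theta}$ and $\mathbf{M}_a = E\{\boldsymbol{\eta}_1\boldsymbol{\eta}_1^T\}$ respectively. Hence both mean and covariance are functions of $(x_T, y_T)$. By some direct calculations, with $\cos(\beta_1) = x_T/R_{1,T}$ and $\sin(\beta_1) = y_T/R_{1,T}$, we have

\[
\begin{aligned}
\frac{\partial \mathbf{m}_a}{\partial x_T} &=
\begin{bmatrix} \frac{\partial}{\partial x_T} \left( b_2 \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\ \vdots \\ \frac{\partial}{\partial x_T} \left( b_{N_b} \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right) \end{bmatrix}
= \begin{bmatrix} \frac{2}{c^2} x_2 + b_2 \cos(\beta_1) \\ \vdots \\ \frac{2}{c^2} x_{N_b} + b_{N_b} \cos(\beta_1) \end{bmatrix}; \\
\frac{\partial \mathbf{m}_a}{\partial y_T} &=
\begin{bmatrix} \frac{\partial}{\partial y_T} \left( b_2 \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\ \vdots \\ \frac{\partial}{\partial y_T} \left( b_{N_b} \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right) \end{bmatrix}
= \begin{bmatrix} \frac{2}{c^2} y_2 + b_2 \sin(\beta_1) \\ \vdots \\ \frac{2}{c^2} y_{N_b} + b_{N_b} \sin(\beta_1) \end{bmatrix}.
\end{aligned} \tag{3.55}
\]

And since $\mathbf{M}_a$ is a diagonal matrix whose $k$-th diagonal element is $E\{Y_k^2\} = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2)$ with $\alpha_k = \frac{2}{c}\sqrt{x_T^2 + y_T^2} + 2\Delta\hat{t}_{k,1}$, taking the partial derivative of $E\{Y_k^2\}$ with respect to $x_T$ and $y_T$ gives

\[
\frac{\partial}{\partial x_T} E\{Y_k^2\} = \frac{4\sigma_t^2 \alpha_k}{c} \cos(\beta_1); \qquad
\frac{\partial}{\partial y_T} E\{Y_k^2\} = \frac{4\sigma_t^2 \alpha_k}{c} \sin(\beta_1).
\]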


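It is then straightforward to show that

\[
\begin{aligned}
\frac{\partial \mathbf{M}_a}{\partial x_T} &= \mathrm{diag}\left\{ \frac{4\sigma_t^2 \cos(\beta_1)}{c} \left[ \alpha_2, \alpha_3, \ldots, \alpha_{N_b} \right] \right\}, \\
\frac{\partial \mathbf{M}_a}{\partial y_T} &= \mathrm{diag}\left\{ \frac{4\sigma_t^2 \sin(\beta_1)}{c} \left[ \alpha_2, \alpha_3, \ldots, \alpha_{N_b} \right] \right\}, \\
\mathbf{M}_a^{-1} &= \mathrm{diag}\left\{ \frac{1}{\sigma_t^2} \left[ \frac{1}{\alpha_2^2 + 2\sigma_t^2}, \frac{1}{\alpha_3^2 + 2\sigma_t^2}, \ldots, \frac{1}{\alpha_{N_b}^2 + 2\sigma_t^2} \right] \right\}.
\end{aligned} \tag{3.56}
\]

Now we can calculate the Fisher information matrix via the Slepian-Bangs method in (3.32) as

\[
\mathbf{J}_{tdoa,LS} = \left[ \frac{1}{2} \mathrm{tr}\left\{ \mathbf{M}_a^{-1} \frac{\partial \mathbf{M}_a}{\partial \chi_i} \mathbf{M}_a^{-1} \frac{\partial \mathbf{M}_a}{\partial \chi_j} \right\} + \left( \frac{\partial \mathbf{m}_a}{\partial \chi_i} \right)^T \mathbf{M}_a^{-1} \left( \frac{\partial \mathbf{m}_a}{\partial \chi_j} \right) \right]_{i,j=1,1}^{2,2}, \tag{3.57}
\]

where $\chi_1 = x_T$ and $\chi_2 = y_T$. By substituting (3.55) and (3.56) into (3.57), the Fisher information matrix is given by

\[
\begin{aligned}
\mathbf{J}_{tdoa,LS} &= \sum_{k=2}^{n} \frac{1}{\sigma_t^2(\alpha_k^2 + 2\sigma_t^2)}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}^T \\
&\quad + \sum_{k=2}^{n} \frac{8\alpha_k^2}{c^2(\alpha_k^2 + 2\sigma_t^2)^2}
\begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix}.
\end{aligned} \tag{3.58}
\]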

The above expression is different from $\mathbf{J}_{tdoa}$ in (3.34) no matter how large $\alpha_k/\sigma_t$ is and how small $\sigma_t$ is. Such a discrepancy is caused by the omission of the second term in $p_Y(y)$ in computing the Fisher information matrix. The omitted term in $p_Y(y)$ may have negligible value in computing the probability but its derivative can be significant. Moreover, no matter how small $\sigma_t$ is, it cannot be zero, which also contributes to this discrepancy.

3.2.3 Location based on AOA

The angle of arrival (AOA) of MS signals at a BS can be obtained by antenna arrays. Unlike TOA/TDOA based location methods, we do not need to consider timing synchronization problems for an AOA based location algorithm. But there is something in common with TOA/TDOA: we have to fuse either TOA/TDOA or AOA measurements into the triangular relations between the BSs and the mobile user, i.e., the target. Suppose that the AOA measurement data are of the form $\hat{\beta}_k = \beta_k + \delta\beta_k$. Recall that $\tan(\beta_k) = (y_T - y_k)/(x_T - x_k)$. That is, $\beta_k = \beta_k(x_T, y_T)$. We again assume that $\{\delta\beta_k\}$ are uncorrelated with Gaussian distribution of mean zero and variance $\sigma_\beta^2$. Their joint PDF is given by

\[
p_{\Delta\beta}(\delta\beta) = \frac{1}{\sqrt{(2\pi)^{N_b}}\, \sigma_\beta^{N_b}} \exp\left\{ -\sum_{k=1}^{N_b} \frac{1}{2\sigma_\beta^2} \left( \hat{\beta}_k - \beta_k(x_T, y_T) \right)^2 \right\} . \tag{3.59}
\]

Since the AOA measurements are associated with additive Gaussian noise, it is easy to compute the Fisher information matrix whose inverse is the Cramér-Rao bound for the covariance matrix of the estimation error. Simply speaking, the larger the Fisher information matrix, the smaller the estimation error variance. And that translates into a better estimator in terms of accuracy, provided that it is unbiased. The Fisher information matrix contains the relative rate (derivative) at which the probability density function changes with respect to the unknown parameters. Note that the greater the expectation of a change is at a given value, say $(\hat{x}_T, \hat{y}_T)$, the easier it is to distinguish $(\hat{x}_T, \hat{y}_T)$ from neighboring values (locations), and hence the more precisely $(x_T, y_T)$ can be estimated at $(x_T, y_T) = (\hat{x}_T, \hat{y}_T)$. To calculate the Fisher information matrix, we still have to use the Slepian-Bangs formula as in (3.32). First, some primary computations are carried out as

\[
\begin{aligned}
\frac{\partial \beta_k(x_T, y_T)}{\partial x_T} &= \frac{\partial}{\partial x_T} \tan^{-1}\left( \frac{y_T - y_k}{x_T - x_k} \right) = -\frac{y_T - y_k}{R_{k,T}^2}; \\
\frac{\partial \beta_k(x_T, y_T)}{\partial y_T} &= \frac{\partial}{\partial y_T} \tan^{-1}\left( \frac{y_T - y_k}{x_T - x_k} \right) = \frac{x_T - x_k}{R_{k,T}^2}.
\end{aligned} \tag{3.60}
\]

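And we know that the mean vector is $\mathbf{m}_\beta = [\beta_1, \beta_2, \ldots, \beta_{N_b}]^T$ and the covariance matrix is $\mathbf{M}_\beta = \sigma_\beta^2 \mathbf{I}_{N_b}$. With these preliminary calculations and results, the Fisher information matrix of AOA measurements is given by

\[
\begin{aligned}
\mathbf{J}_{aoa} &= \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2}
\begin{bmatrix} -\dfrac{y_T - y_k}{R_{k,T}(x_T, y_T)} \\[2mm] \dfrac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix}
\begin{bmatrix} -\dfrac{y_T - y_k}{R_{k,T}(x_T, y_T)} & \dfrac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix} \\
&= \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2}
\begin{bmatrix} -\sin(\beta_k) \\ \cos(\beta_k) \end{bmatrix}
\begin{bmatrix} -\sin(\beta_k) & \cos(\beta_k) \end{bmatrix}.
\end{aligned} \tag{3.61}
\]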
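A small numerical sketch of (3.61) may help to see how the bound scales. It accumulates the $2 \times 2$ Fisher information matrix for a given geometry and inverts it to obtain the CRB. The helper names are hypothetical and the geometry used in any call is the caller's choice.

```python
import math


def fisher_aoa(base_stations, target, sigma_beta):
    """Accumulate J_aoa of (3.61) as a 2x2 nested list."""
    xt, yt = target
    j11 = j12 = j22 = 0.0
    for xk, yk in base_stations:
        dx, dy = xt - xk, yt - yk
        r2 = dx * dx + dy * dy            # R_{k,T}^2
        r = math.sqrt(r2)
        s, c = dy / r, dx / r             # sin(beta_k), cos(beta_k)
        w = 1.0 / (sigma_beta**2 * r2)
        # outer product of [-s, c] with itself, scaled by w
        j11 += w * s * s
        j12 += -w * s * c
        j22 += w * c * c
    return [[j11, j12], [j12, j22]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix; for a Fisher matrix this gives the CRB."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]
```

Because $\sigma_\beta^2$ enters (3.61) only as a common factor, doubling $\sigma_\beta$ should quadruple every entry of the resulting bound, while targets far from all base stations (large $R_{k,T}$) should yield a looser bound.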

With the information matrix above, we can calculate the Cramér-Rao bound (CRB) easily. In terms of the CRB, the ML estimator is the closest one among all the unbiased estimators. The ML algorithm is to minimize the likelihood-type function of the following form

\[
L_{\Delta\beta}(x_T, y_T) = \sum_{k=1}^{N_b} \left( \hat{\beta}_k - \beta_k(x_T, y_T) \right)^2 . \tag{3.62}
\]

Then the necessary condition for $(x_T^*, y_T^*)$ to be the ML solution is

\[
\sum_{k=1}^{N_b} \frac{1}{R_{k,T}} \begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix} \left( \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} . \tag{3.63}
\]

No matter how many minimum points the nonlinear likelihood function may have, the true ML solution $(x_T^*, y_T^*)$ must be one of them such that the partial derivatives of $L_{\Delta\beta}(x_T, y_T)$ with respect to $x_T$ and $y_T$ at the location $(x_T^*, y_T^*)$ are zero. Again this is a difficult nonlinear optimization to solve and multiple solutions may exist. Thus we turn our attention to the LS-type algorithm before solving for the ML solution.


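Recall that the AOA measurements are given by $\hat{\beta}_k = \beta_k + \delta\beta_k$, or $\delta\beta_k = \hat{\beta}_k - \beta_k$. Hence $R_{k,T}\sin(\delta\beta_k) = R_{k,T}\sin(\hat{\beta}_k - \beta_k)$ and thus

\[
\begin{aligned}
R_{k,T}\sin(\delta\beta_k) &= R_{k,T}\sin(\hat{\beta}_k)\cos(\beta_k) - R_{k,T}\cos(\hat{\beta}_k)\sin(\beta_k) \\
&= \Delta x_k \sin(\hat{\beta}_k) - \Delta y_k \cos(\hat{\beta}_k),
\end{aligned} \tag{3.64}
\]

where $\Delta x_k = x_T - x_k$, $\Delta y_k = y_T - y_k$, and $R_{k,T} = \sqrt{\Delta x_k^2 + \Delta y_k^2}$. It follows that

\[
\varphi_k = -x_k \sin(\hat{\beta}_k) + y_k \cos(\hat{\beta}_k) = -x_T \sin(\hat{\beta}_k) + y_T \cos(\hat{\beta}_k) + R_{k,T}\sin(\delta\beta_k). \tag{3.65}
\]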

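We can regard $\varphi_k$ as a pseudo-measurement consisting of the real measurement data $\hat{\beta}_k$ and the known BS location $(x_k, y_k)$. For the term $R_{k,T}\sin(\delta\beta_k)$ at the right side of equation (3.65), we argue that even though $\{\sin(\delta\beta_k)\}$ are not Gaussian, they are close to Gaussian distributed provided that the variance $\sigma_\beta^2$ is adequately small by the fact that with $z = \sin(\delta\beta)$ [73],

\[
p_Z(z) = \sum_{k=-\infty}^{\infty} \frac{ \exp\left[ -\frac{1}{2\sigma_\beta^2} \left( \sin^{-1}(z) + 2k\pi \right)^2 \right] + \exp\left[ -\frac{1}{2\sigma_\beta^2} \left( \sin^{-1}(z) + (2k+1)\pi \right)^2 \right] }{ \sqrt{2\pi\sigma_\beta^2 (1 - z^2)} } \tag{3.66}
\]

for $|z| \le 1$ and $p_Z(z) = 0$ for $|z| > 1$. Since $\sigma_\beta^2$ is sufficiently small, there holds

\[
p_Z(z) \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}} \exp\left[ -\frac{1}{2\sigma_\beta^2} \left( \sin^{-1}(z) \right)^2 \right] \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}}\, e^{-\frac{z^2}{2\sigma_\beta^2}} \tag{3.67}
\]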

for $z \approx 0$. The above implies $R_{k,T}\sin(\delta\beta_k)$ will behave like a GRV under the condition that $\delta\beta_k$ is very small. This can also be seen in an approximate way: $\sin(\delta\beta_k) \approx \delta\beta_k$ if $\delta\beta_k$ is very small. Hence $\sin(\delta\beta_k)$ and $\delta\beta_k$ will almost have the same PDF. We also would like to argue that the probability for $|\delta\beta| \ge \pi/2$ is zero generically. Otherwise it would imply the wrong direction of the angle of arrival completely. Hence the PDF of $\delta\beta$ has a shape similar to the normal function but tends to zero for $|\delta\beta| = \pi/2$ and beyond, which implies that $p_Z(z)$ behaves closely to a Gaussian distribution. When $\delta\beta$ is normal, the exact variance of $\sin(\delta\beta_k)$ can be computed as

\[
E\{\sin^2(\delta\beta_k)\} = \frac{1}{2} E\{1 - \cos(2\delta\beta_k)\} = \frac{1}{2} - \frac{1}{4} E\{e^{j2\delta\beta_k} + e^{-j2\delta\beta_k}\} = \frac{1}{2}\left( 1 - e^{-2\sigma_\beta^2} \right) \approx \sigma_\beta^2 \tag{3.68}
\]

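for the case when $\sigma_\beta^2$ is sufficiently small. Now the linear equations in (3.65) are of the form

\[
\begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \vdots \\ \varphi_{N_b} \end{bmatrix}
= \begin{bmatrix} -\sin(\hat{\beta}_1) & \cos(\hat{\beta}_1) \\ -\sin(\hat{\beta}_2) & \cos(\hat{\beta}_2) \\ \vdots & \vdots \\ -\sin(\hat{\beta}_{N_b}) & \cos(\hat{\beta}_{N_b}) \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \begin{bmatrix} R_{1,T}\sin(\delta\beta_1) \\ R_{2,T}\sin(\delta\beta_2) \\ \vdots \\ R_{N_b,T}\sin(\delta\beta_{N_b}) \end{bmatrix} . \tag{3.69}
\]

The noise vector on the right hand side is denoted by

\[
\boldsymbol{\eta}_2 = \begin{bmatrix} R_{1,T}\sin(\delta\beta_1) & R_{2,T}\sin(\delta\beta_2) & \cdots & R_{N_b,T}\sin(\delta\beta_{N_b}) \end{bmatrix}^T .
\]

It has mean zero and covariance matrix $\mathbf{W}_2(R_{1,T})$ that is diagonal with the $k$-th element

\[
R_{k,T}^2 \left( 1 - e^{-2\sigma_\beta^2} \right)/2 \approx R_{k,T}^2 \sigma_\beta^2 = \left[ (x_T - x_k)^2 + (y_T - y_k)^2 \right] \sigma_\beta^2 . \tag{3.70}
\]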

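With the Gaussian assumption on the noise vector and $\{\varphi_k\}$ as pseudo-measurements, (3.69) has the form

\[
\boldsymbol{\varphi} = \mathbf{H}_2\boldsymbol{\theta} + \boldsymbol{\eta}_2 \implies J_2 = \frac{1}{2} \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right)^T \mathbf{W}_2^{-1}(R_{1,T}) \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right), \tag{3.71}
\]

where $J_2$ is the objective function. Minimization of $J_2$ corresponds to the ML algorithm. The ML solution is given by

\[
\hat{\boldsymbol{\theta}} = \left( \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}) \mathbf{H}_2 \right)^{-1} \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T})\, \boldsymbol{\varphi}. \tag{3.72}
\]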
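The structure of (3.65), (3.69) and (3.72) can be illustrated with a deliberately simplified sketch. It assumes unit weights, that is, $\mathbf{W}_2$ is replaced by the identity so that no knowledge of $R_{k,T}$ is needed, and it solves the $2 \times 2$ normal equations directly. The function name is illustrative, and this is not the weighted estimator of (3.72) itself.

```python
import math


def aoa_least_squares(base_stations, bearings):
    """Unweighted LS estimate of (x_T, y_T) from bearings beta_k measured at known BSs.

    Each row of H_2 is [-sin(beta_k), cos(beta_k)] and the pseudo-measurement is
    phi_k = -x_k*sin(beta_k) + y_k*cos(beta_k), following (3.65) and (3.69).
    Unit weights are an assumption made here for simplicity.
    """
    h11 = h12 = h22 = g1 = g2 = 0.0
    for (xk, yk), beta in zip(base_stations, bearings):
        s, c = math.sin(beta), math.cos(beta)
        phi = -xk * s + yk * c
        h11 += s * s
        h12 += -s * c
        h22 += c * c
        g1 += -s * phi
        g2 += c * phi
    det = h11 * h22 - h12 * h12   # nonzero unless all bearings are parallel
    x = (h22 * g1 - h12 * g2) / det
    y = (h11 * g2 - h12 * g1) / det
    return x, y
```

With noise-free bearings the pseudo-measurement noise $R_{k,T}\sin(\delta\beta_k)$ vanishes, so the sketch should return the true position; with noisy bearings the unit weighting ignores that distant base stations contribute larger errors, which is what the weighting by $\mathbf{W}_2^{-1}$ accounts for.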

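However, since $\mathbf{W}_2^{-1}(R_{1,T})$ involves the unknown $(x_T, y_T)$ and $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, the above does not give the ML solution explicitly. It is interesting to notice that the weighted LS problem in this subsection is again a constrained LS-type problem. Indeed by noting that

\[
R_{k,T}^2 = x_k^2 + y_k^2 + x_T^2 + y_T^2 - 2(x_k x_T + y_k y_T) = R_k^2 + R_T^2 - 2(x_k x_T + y_k y_T), \tag{3.73}
\]

we can multiply (3.72) from left by $\begin{bmatrix} x_k & y_k \end{bmatrix}$ for $k = 2, \cdots, N_b$ to arrive at

\[
\rho_{k,T} := x_k x_T + y_k y_T = \begin{bmatrix} x_k & y_k \end{bmatrix} \left( \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}) \mathbf{H}_2 \right)^{-1} \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T})\, \boldsymbol{\varphi}. \tag{3.74}
\]

In addition $R_{k,T}^2 = R_k^2 + R_T^2 - 2\rho_{k,T}$. Thus taking norm square on both sides of (3.72) yields

\[
R_{1,T}^2 = \left\| \boldsymbol{\Phi}_2(R_{1,T})\, \boldsymbol{\varphi} \right\|^2, \qquad
\boldsymbol{\Phi}_2(R_{1,T}) = \left( \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}) \mathbf{H}_2 \right)^{-1} \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}). \tag{3.75}
\]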

Consequently we have $N_b$ equations with $N_b$ unknowns $\{R_{k,T}\}_{k=2}^{N_b}$ and $R_{1,T}$. Although these are nonlinear equations, they can be manipulated to solve at least one set of solutions for these $N_b$ unknowns. These solutions can be substituted back to (3.72) to yield the ML solution $(x_T, y_T)$. It is commented that for large $N_b$, the complexity for quasi-linear localization based on AOAs is much higher than the corresponding localization based on TDOAs. But if we have additional information on TDOAs, then the complexity can be reduced tremendously, as will be studied in the next subsection.

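Before ending this subsection we present the Fisher information matrix associated with the LS-type problem as posed in (3.69). With the assumption on Gaussian distribution for the noise vector $\boldsymbol{\eta}_2$, the joint PDF has the expression

\[
\mathrm{PDF} = \frac{1}{\sqrt{(2\pi)^{n} \det[\mathbf{W}_2(R_{1,T})]}} \exp\left\{ -\frac{1}{2} \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right)^T \mathbf{W}_2^{-1}(R_{1,T}) \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right) \right\} \tag{3.76}
\]

where $\boldsymbol{\varphi}$ can be regarded as the pseudo-measurement vector. Hence $\mathbf{H}_2\boldsymbol{\theta}$ is the mean vector and $\mathbf{W}_2(R_{1,T})$ is the covariance matrix. An application of the Slepian-Bangs formula gives the corresponding Fisher information matrix:

\[
\mathbf{J}_{aoa,LS} = \sum_{k=1}^{n} \frac{2}{R_{k,T}^2} \begin{bmatrix} \cos(\beta_k) \\ \sin(\beta_k) \end{bmatrix} \begin{bmatrix} \cos(\beta_k) & \sin(\beta_k) \end{bmatrix}
+ \sum_{k=1}^{n} \frac{1}{R_{k,T}^2 \sigma_\beta^2} \begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix} \begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix} \tag{3.77}
\]

\[
\phantom{\mathbf{J}_{aoa,LS}} \approx \sum_{k=1}^{n} \frac{1}{R_{k,T}^2 \sigma_\beta^2} \begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix} \begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix} \tag{3.78}
\]

where sufficiently small $\sigma_\beta^2$ is assumed. It is interesting to observe that the above approximate expression is the same as $\mathbf{J}_{aoa}$ in (3.61) except that $\{\beta_k\}$ are replaced by $\{\hat{\beta}_k\}$.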

3.2.4 Location based on both TDOA and AOA

After discussing the location techniques based on either TDOA or AOA measurements in the previous two sections, we now assume that both AOAs and TDOAs are available for target localization. Though this requires more information and data, and consequently increases the cost of a location system, the improved accuracy may justify the expense. Hence it is meaningful to study the location method based on a combination of TDOA/AOA in the case where redundant information is available and high location resolution is mandated. Assuming the independence of the noises ($\delta t_k$ and $\delta\beta_k$) in measuring the TDOAs and AOAs, the joint PDF is

\[
\begin{aligned}
p_{\Delta}(\delta t, \delta\beta) &= \frac{ \exp\left\{ -\sum_{k=2}^{N_b} \dfrac{\left( \Delta\hat{t}_{k,1} - \Delta t_{k,1}(x_T, y_T) \right)^2}{2\sigma_t^2} - \sum_{k=1}^{N_b} \dfrac{\left( \hat{\beta}_k - \beta_k(x_T, y_T) \right)^2}{2\sigma_\beta^2} \right\} }{ \sqrt{(2\pi)^{N_b - 1}}\, \sigma_t^{N_b - 1} \sqrt{(2\pi)^{N_b}}\, \sigma_\beta^{N_b} } \\
&= p_{\Delta t}(\delta t)\, p_{\Delta\beta}(\delta\beta).
\end{aligned} \tag{3.79}
\]

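Because of the independence between $\{\delta t_k\}_{k=2}^{N_b}$ and $\{\delta\beta_k\}_{k=1}^{N_b}$, the Fisher information matrix has the expression

\[
\mathbf{J}_{tdoa/aoa} = \mathbf{J}_{tdoa} + \mathbf{J}_{aoa}, \tag{3.80}
\]

where $\mathbf{J}_{tdoa}$ and $\mathbf{J}_{aoa}$ are the same as in (3.34) and (3.61), respectively. This can be easily shown [74] by

\[
\begin{aligned}
\mathbf{J}_{tdoa/aoa} &= E\left\{ \frac{\partial[\ln p_{\Delta}(\delta t, \delta\beta)]}{\partial \mathbf{x}} \left( \frac{\partial[\ln p_{\Delta}(\delta t, \delta\beta)]}{\partial \mathbf{x}} \right)^T \right\} \\
&= E\left\{ \left( \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial \mathbf{x}} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial \mathbf{x}} \right) \left( \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial \mathbf{x}} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial \mathbf{x}} \right)^T \right\} \\
&= E\left\{ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial \mathbf{x}} \left( \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial \mathbf{x}} \right)^T \right\} + E\left\{ \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial \mathbf{x}} \left( \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial \mathbf{x}} \right)^T \right\} \\
&= \mathbf{J}_{tdoa} + \mathbf{J}_{aoa}.
\end{aligned} \tag{3.81}
\]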


With respect to the joint PDF in (3.79), the corresponding likelihood-type function in this case has the form

\[
L(x_T, y_T) = \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \left( c\,\Delta\hat{t}_{k,1} - R_{k,T}(x_T, y_T) + R_{1,T} \right)^2 + \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2} \left( \hat{\beta}_k - \beta_k(x_T, y_T) \right)^2 . \tag{3.82}
\]

The ML algorithm seeks the minimum of the above likelihood-type function, which corresponds to the maximum of the joint PDF in (3.79). The necessary condition for it to achieve this minimum at $(x_T^*, y_T^*)$ is:

\[
\begin{aligned}
\begin{bmatrix} 0 \\ 0 \end{bmatrix} &= \sum_{k=1}^{N_b} \frac{1}{R_{k,T}} \begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix} \left( \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right) \\
&\quad + \sum_{k=2}^{N_b} \begin{bmatrix} \dfrac{x_k - x_T^*}{R_{k,T}} + \dfrac{x_T^*}{R_{1,T}} \\[2mm] \dfrac{y_k - y_T^*}{R_{k,T}} + \dfrac{y_T^*}{R_{1,T}} \end{bmatrix} \left( c\,\Delta\hat{t}_{k,1} - R_{k,T} + R_{1,T} \right).
\end{aligned} \tag{3.83}
\]

Newton-type algorithms can be applied to solve for the ML solution. Clearly the ML solution to the above nonlinear equations is hard to compute, and a computed solution may not be a global minimum of $L(x_T, y_T)$. An alternative method is the use of an LS-type algorithm as in the previous two subsections. One possible way is to compute the constrained LS solutions $(\hat{x}_T^{(TDOA)}, \hat{y}_T^{(TDOA)})$ and $(\hat{x}_T^{(AOA)}, \hat{y}_T^{(AOA)})$ based on TDOAs and AOAs separately as in the previous subsections and then combine the two as [53]

\[
\hat{x}_T = \gamma\, \hat{x}_T^{(AOA)} + (1 - \gamma)\, \hat{x}_T^{(TDOA)}, \qquad \hat{y}_T = \gamma\, \hat{y}_T^{(AOA)} + (1 - \gamma)\, \hat{y}_T^{(TDOA)} \tag{3.84}
\]

where $0 < \gamma < 1$. Note that $R_{k,T} = R_{1,T} + c\,\Delta t_{k,1}$ can be used in (3.69) to avoid computing $N_b$ unknowns with $N_b$ equations. Indeed the noise terms in (3.69) have zero mean and variance

\[
E\{R_{k,T}^2 \sin^2(\delta\beta_k)\} = E\{[R_{1,T} + c\,\Delta\hat{t}_{k,1} - c\,\delta t_k]^2 \sin^2(\delta\beta_k)\} \approx \left[ (R_{1,T} + c\,\Delta\hat{t}_{k,1})^2 + c^2\sigma_t^2 \right] \sigma_\beta^2 \tag{3.85}
\]

if $\sigma_\beta^2$ is sufficiently small. Hence only one unknown $R_{1,T}$ is involved and $\rho_{k,T}$ are all eliminated, which helps to simplify the computation of the LS-type solution to the target localization problem based on measurements of AOAs. However the determination of the optimal value of $\gamma$ is not easy. Hence we opt to compute the LS-type solution directly.

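Since both AOAs and TDOAs are available, we have the following linear equations:

\[
\begin{bmatrix} \mathbf{a} - \mathbf{b}R_{1,T} \\ \boldsymbol{\varphi} \end{bmatrix} = \begin{bmatrix} \mathbf{H}_1 \\ \mathbf{H}_2 \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \begin{bmatrix} \boldsymbol{\eta}_1 \\ \boldsymbol{\eta}_2 \end{bmatrix} . \tag{3.86}
\]

Under the independence assumption for the noises $\boldsymbol{\eta}_1$ and $\boldsymbol{\eta}_2$, we have

\[
E\left\{ \begin{bmatrix} \boldsymbol{\eta}_1 \\ \boldsymbol{\eta}_2 \end{bmatrix} \right\} = \mathbf{0}, \qquad
\mathbf{W} = E\left\{ \begin{bmatrix} \boldsymbol{\eta}_1 \\ \boldsymbol{\eta}_2 \end{bmatrix} \begin{bmatrix} \boldsymbol{\eta}_1^T & \boldsymbol{\eta}_2^T \end{bmatrix} \right\} = \begin{bmatrix} \mathbf{W}_1(R_{1,T}) & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_2(R_{1,T}) \end{bmatrix}, \tag{3.87}
\]

where the $k$-th diagonal element of $\mathbf{W}_2(R_{1,T})$ is the same as in (3.85). By assuming uncorrelated Gaussian for $\boldsymbol{\eta}_1$ and $\boldsymbol{\eta}_2$, the ML solution to estimation of $(x_T, y_T)$ can be computed through minimization of the following objective function:

\[
\begin{aligned}
J_{1,2} &= \frac{1}{2} \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right) \\
&\quad + \frac{1}{2} \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right)^T \mathbf{W}_2^{-1}(R_{1,T}) \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right) = J_1 + J_2 .
\end{aligned} \tag{3.88}
\]

Taking derivative of the cost function $J_{1,2}$ with respect to $\boldsymbol{\theta}$, we have

\[
\frac{\partial J_{1,2}}{\partial \boldsymbol{\theta}} = -\left( \mathbf{a} - \mathbf{b}R_{1,T} - \mathbf{H}_1\boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1} \mathbf{H}_1 - \left( \boldsymbol{\varphi} - \mathbf{H}_2\boldsymbol{\theta} \right)^T \mathbf{W}_2^{-1} \mathbf{H}_2 . \tag{3.89}
\]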


By letting $\frac{\partial J_{1,2}}{\partial \boldsymbol{\theta}} = 0$, it can be easily shown that the minimizer to the cost function $J_{1,2}$ is given by

\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix} = \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1 + \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}) \mathbf{H}_2 \right)^{-1} \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} \right) + \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T})\, \boldsymbol{\varphi} \right). \tag{3.90}
\]

Because the above solution involves an unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, we can again take norm square on both sides to obtain an equation for $R_{1,T}$ first, and after computing its solution, the value of $R_{1,T}$ can be substituted into (3.90) to obtain the solution to the weighted LS problem. Note that $R_{1,T}$ is a positive real root to some nonlinear equation. One of the positive real roots corresponds to the constrained LS solution, which provides an initial guess for the true (nonlinear) ML solution.

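It is commented that the optimal solution in (3.90) is not in the form of the convex combination of the two separate LS-type solutions as in (3.84). Rather it is in the form

\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix} = \boldsymbol{\Gamma} \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix} + (\mathbf{I} - \boldsymbol{\Gamma}) \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix} \tag{3.91}
\]

where $\boldsymbol{\Gamma}$ is a matrix. Specifically the solution in (3.90) can be written as

\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix} = [\mathbf{A}_1 + \mathbf{A}_2]^{-1} [\mathbf{B}_1 + \mathbf{B}_2]
= \left( \mathbf{I} + \mathbf{A}_1^{-1}\mathbf{A}_2 \right)^{-1} \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix} + \left( \mathbf{I} + \mathbf{A}_2^{-1}\mathbf{A}_1 \right)^{-1} \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix} \tag{3.92}
\]

where

\[
\begin{aligned}
\mathbf{A}_1 &= \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1; & \mathbf{A}_2 &= \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T}) \mathbf{H}_2; \\
\mathbf{B}_1 &= \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b}R_{1,T} \right); & \mathbf{B}_2 &= \mathbf{H}_2^T \mathbf{W}_2^{-1}(R_{1,T})\, \boldsymbol{\varphi}.
\end{aligned}
\]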


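Hence $\mathbf{A}_1^{-1}\mathbf{B}_1$ and $\mathbf{A}_2^{-1}\mathbf{B}_2$ are the LS-type solutions based on TDOAs and AOAs, respectively. Now it is straightforward to show that

\[
\left( \mathbf{I} + \mathbf{A}_1^{-1}\mathbf{A}_2 \right)^{-1} + \left( \mathbf{I} + \mathbf{A}_2^{-1}\mathbf{A}_1 \right)^{-1} = [\mathbf{A}_1 + \mathbf{A}_2]^{-1}\mathbf{A}_1 + [\mathbf{A}_1 + \mathbf{A}_2]^{-1}\mathbf{A}_2 = \boldsymbol{\Gamma} + [\mathbf{I} - \boldsymbol{\Gamma}] = \mathbf{I}. \tag{3.93}
\]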

Even though the LS solution in (3.90) is some kind of combination of the two separate

LS solutions in (3.46) and (3.72), the unknown R1,T has to be computed based on

(3.90).
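The matrix-weighted combination in (3.91) to (3.93) can be checked on small numbers. The sketch below uses hand-written $2 \times 2$ helpers and arbitrary illustrative matrices; `fuse` and the other names are hypothetical. It shows that solving the joint normal equations gives the same result as combining the two separate solutions with $\boldsymbol{\Gamma} = [\mathbf{A}_1 + \mathbf{A}_2]^{-1}\mathbf{A}_1$.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_vec(a, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]


def fuse(a1, b1, a2, b2):
    """Return (joint, combined): the joint solution and the Gamma-weighted combination."""
    a_sum = [[a1[i][j] + a2[i][j] for j in range(2)] for i in range(2)]
    a_sum_inv = inv2(a_sum)
    joint = mat_vec(a_sum_inv, [b1[0] + b2[0], b1[1] + b2[1]])
    theta_tdoa = mat_vec(inv2(a1), b1)      # A1^{-1} B1
    theta_aoa = mat_vec(inv2(a2), b2)       # A2^{-1} B2
    gamma = mat_mul(a_sum_inv, a1)          # Gamma = (A1 + A2)^{-1} A1
    eye_minus_gamma = [[(1.0 if i == j else 0.0) - gamma[i][j] for j in range(2)]
                       for i in range(2)]
    part1 = mat_vec(gamma, theta_tdoa)
    part2 = mat_vec(eye_minus_gamma, theta_aoa)
    combined = [part1[0] + part2[0], part1[1] + part2[1]]
    return joint, combined
```

Since $\boldsymbol{\Gamma}$ is a full matrix rather than a scalar, the fused $x$ coordinate generally depends on both coordinates of each separate solution, which is the distinction from the scalar rule (3.84).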

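Finally the Fisher information matrix associated with the linear model in (3.86) is

\[
\begin{aligned}
\mathbf{P}_{tdoa/aoa-fim,LS} &= \sum_{k=2}^{N_b} \frac{1}{\sigma_t^2(\alpha_k^2 + 2\sigma_t^2)}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}^T \\
&\quad + \sum_{k=2}^{N_b} \frac{8\alpha_k^2}{c^2(\alpha_k^2 + 2\sigma_t^2)^2}
\begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix} \\
&\quad + \sum_{k=1}^{N_b} \frac{1}{R_{k,T}^2 \sigma_\beta^2}
\begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix}
\begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}
\end{aligned} \tag{3.94}
\]

under the uncorrelated Gaussian assumption and sufficiently small $\sigma_t^2$ and $\sigma_\beta^2$.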

3.3 Constrained Least-square Optimization
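As shown in (3.46), the weighted LS solution $\hat{\boldsymbol{\theta}}$ is constrained by

\[
R_{1,T}^2 = \left\| \boldsymbol{\Phi}_1 [\mathbf{a} - \mathbf{b}R_{1,T}] \right\|^2, \tag{3.95}
\]

from which some solutions $R_{1,T}$ can be solved. If there exist real solutions $R_{1,T}$, they can be substituted back into $J_1$ in (3.45) to select the optimal $R_{1,T}$, based on which the optimal solution $\hat{\boldsymbol{\theta}}$ can be obtained. However, (3.95) may not admit a real solution $R_{1,T}$ due to the existence of noise in the observations. More specifically, with $\boldsymbol{\Phi} = \boldsymbol{\Phi}_1$, (3.95) is equivalent to the quadratic equation

\[
(\mathbf{b}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{b} - 1) R_{1,T}^2 - 2\mathbf{a}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{b}\, R_{1,T} + \mathbf{a}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{a} = 0, \tag{3.96}
\]

which admits a real solution if and only if

\[
(\mathbf{a}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{b})^2 + \mathbf{a}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{a} - (\mathbf{a}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{a})(\mathbf{b}^T \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{b}) \ge 0. \tag{3.97}
\]

That is, (3.95) admits a real solution $R_{1,T}$ if and only if (3.97) holds. Simulation in [54] shows that the location estimate in (3.46) is very accurate if the condition (3.97) holds; otherwise the location estimate is far away from the true location. The question is what to do if (3.97) does not hold, which can happen generically due to the existence of noise in the TDOA and AOA measurements.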
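The root selection described by (3.95) to (3.97) can be sketched as follows. Two simplifications are assumptions of this illustration rather than the text's formulation: unit weights replace $\mathbf{W}_1^{-1}$, and the equations are written in range differences $d_k = c\,\Delta t_{k,1}$, i.e. (3.38) multiplied by $c^2$, to keep the numbers well scaled. The function name is hypothetical.

```python
import math


def solve_tdoa_quadratic(base_stations, range_diffs):
    """Constrained LS via the quadratic (3.96), with unit weights (an assumption).

    base_stations: BS_2..BS_Nb as (x_k, y_k); BS_1 is assumed to be at the origin.
    range_diffs:   d_k = c * dt_{k,1} = R_{k,T} - R_{1,T} for the same stations.
    Returns ((x_T, y_T), R_{1,T}) or None if no positive real root exists.
    """
    # (3.38) times c^2:  R_k^2 - d_k^2 - 2 d_k R_{1,T} = 2 [x_k  y_k] theta
    rows = [(2.0 * xk, 2.0 * yk) for xk, yk in base_stations]
    a = [xk * xk + yk * yk - d * d for (xk, yk), d in zip(base_stations, range_diffs)]
    b = [2.0 * d for d in range_diffs]

    h11 = sum(r[0] * r[0] for r in rows)
    h12 = sum(r[0] * r[1] for r in rows)
    h22 = sum(r[1] * r[1] for r in rows)
    det = h11 * h22 - h12 * h12

    def apply_phi(vec):
        """Phi vec with Phi = (H^T H)^{-1} H^T."""
        g1 = sum(r[0] * v for r, v in zip(rows, vec))
        g2 = sum(r[1] * v for r, v in zip(rows, vec))
        return ((h22 * g1 - h12 * g2) / det, (h11 * g2 - h12 * g1) / det)

    u, v = apply_phi(a), apply_phi(b)      # theta = u - v * R_{1,T}
    qa = v[0] ** 2 + v[1] ** 2 - 1.0       # b^T Phi^T Phi b - 1
    qb = -2.0 * (u[0] * v[0] + u[1] * v[1])
    qc = u[0] ** 2 + u[1] ** 2
    if qa == 0.0:
        roots = [-qc / qb] if qb != 0.0 else []
    else:
        disc = qb * qb - 4.0 * qa * qc     # four times the left side of (3.97)
        if disc < 0.0:
            return None                    # condition (3.97) fails
        roots = [(-qb + s * math.sqrt(disc)) / (2.0 * qa) for s in (1.0, -1.0)]

    candidates = []
    for r in roots:
        if r <= 0.0:
            continue
        theta = (u[0] - v[0] * r, u[1] - v[1] * r)
        resid = sum((ak - bk * r - (row[0] * theta[0] + row[1] * theta[1])) ** 2
                    for ak, bk, row in zip(a, b, rows))
        candidates.append((resid, theta, r))
    if not candidates:
        return None
    _, theta, r = min(candidates)          # positive root with the smallest residual
    return theta, r
```

On noise-free range differences the true $R_{1,T}$ is an exact root, so the sketch should recover the target. With noisy data the function can return `None`, which is exactly the situation that motivates the Lagrange multiplier formulation developed next.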

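Let us examine (3.45) again by rewriting $J_1$ into

\[
J_1 = \frac{1}{2} \left( \mathbf{a} - \begin{bmatrix} \mathbf{H}_1 & \mathbf{b} \end{bmatrix} \begin{bmatrix} \mathbf{p}_T \\ R_{1,T} \end{bmatrix} \right)^T \mathbf{W}_1^{-1} \left( \mathbf{a} - \begin{bmatrix} \mathbf{H}_1 & \mathbf{b} \end{bmatrix} \begin{bmatrix} \mathbf{p}_T \\ R_{1,T} \end{bmatrix} \right), \tag{3.98}
\]

where $\mathbf{p}_T = [\,\hat{x}_T \ \ \hat{y}_T\,]^T$. The nonlinear estimation problem aims to search $\mathbf{p}_T$ and $R_{1,T}$ such that $J_1$ is minimized, subject to the constraint $R_{1,T} = \|\mathbf{p}_T\|$. Denote $\boldsymbol{\Sigma} = \mathbf{W}_1$ and

\[
\mathbf{A} = \begin{bmatrix} \mathbf{H}_1 & \mathbf{b} \end{bmatrix}, \quad \boldsymbol{\varphi} = \mathbf{a}, \quad \boldsymbol{\theta} = \begin{bmatrix} \mathbf{p}_T \\ R_{1,T} \end{bmatrix}, \quad \mathbf{Q} = \begin{bmatrix} -\mathbf{I}_2 & \mathbf{0} \\ \mathbf{0} & 1 \end{bmatrix} .
\]

Then we have the following more general constrained LS optimization problem:

\[
\min_{\boldsymbol{\theta}^T \mathbf{Q} \boldsymbol{\theta} = 0} J_1, \qquad J_1 = \frac{1}{2} (\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi})^T \boldsymbol{\Sigma}^{-1} (\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi}). \tag{3.99}
\]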


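We will develop a solution algorithm to such a constrained LS optimization problem in the following. Assume that $\boldsymbol{\Sigma}$ is positive definite, $\mathbf{A}$ has full column rank and $\mathbf{Q}$ is nonsingular with both positive and negative eigenvalues, i.e., $\mathbf{Q}$ is indefinite. We employ a Lagrange multiplier to develop the solution algorithm. Let $\lambda$ be real and consider

\[
J = \frac{1}{2} \left[ (\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi})^T \boldsymbol{\Sigma}^{-1} (\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi}) + \lambda\, \boldsymbol{\theta}^T \mathbf{Q} \boldsymbol{\theta} \right] . \tag{3.100}
\]

Then the necessary condition for optimality yields the condition

\[
\mathbf{A}^T \boldsymbol{\Sigma}^{-1} [\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi}] + \lambda \mathbf{Q} \boldsymbol{\theta} = 0 \iff \boldsymbol{\theta} = [\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} + \lambda \mathbf{Q}]^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi}. \tag{3.101}
\]

An optimal solution needs to satisfy the constraint $\boldsymbol{\theta}^T \mathbf{Q} \boldsymbol{\theta} = 0$ leading to

\[
\boldsymbol{\varphi}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} [\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} + \lambda \mathbf{Q}]^{-1} \mathbf{Q} [\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} + \lambda \mathbf{Q}]^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi} = 0. \tag{3.102}
\]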

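The solution algorithm hinges on the computation of the real root $\lambda$ from the above equation and there can be more than one such real root. We employ the result of simultaneous diagonalization. Because $\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A}$ is symmetric positive definite and $\mathbf{Q} = \mathbf{Q}^T$, there exists a nonsingular matrix $\mathbf{S}$ such that $\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} = \mathbf{S} \mathbf{D}_\Sigma \mathbf{S}^T$ and $\mathbf{Q} = \mathbf{S} \mathbf{D}_Q \mathbf{S}^T$ where $\mathbf{D}_\Sigma$ and $\mathbf{D}_Q$ are both diagonal. It is noted that $\mathbf{D}_\Sigma$ and $\mathbf{D}_Q$ have the same inertia as $\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A}$ and $\mathbf{Q}$, respectively. It follows that (3.102) is equivalent to

\[
(\mathbf{S}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi})^T (\lambda \mathbf{I} + \mathbf{D}_\Sigma \mathbf{D}_Q^{-1})^{-1} \mathbf{D}_Q^{-1} (\lambda \mathbf{I} + \mathbf{D}_\Sigma \mathbf{D}_Q^{-1})^{-1} (\mathbf{S}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi}) = 0. \tag{3.103}
\]

Let $\mathbf{D}_Q^{-1} = \mathrm{diag}(q_1, q_2, \ldots, q_l)$ with $l \times l$ the size of $\mathbf{Q}$. Then it has the same number of negative and positive elements as $\mathbf{D} = \mathbf{D}_\Sigma \mathbf{D}_Q^{-1} = \mathrm{diag}(d_1, d_2, \ldots, d_l)$ by the positivity of $\boldsymbol{\Sigma}$ and $\mathbf{D}_\Sigma$. In fact, $q_i d_i > 0$. The matrices $\mathbf{S}$ and $\mathbf{D}$ can be obtained by eigenvalue decomposition of $\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} \mathbf{Q}^{-1} = \mathbf{S} \mathbf{D} \mathbf{S}^{-1}$. Let $v_i$ be the $i$-th element of $\mathbf{S}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi}$. Then (3.103) is converted into the following:

\[
(\mathbf{S}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi})^T (\lambda \mathbf{I} + \mathbf{D}_\Sigma \mathbf{D}_Q^{-1})^{-1} \mathbf{D}_Q^{-1} (\lambda \mathbf{I} + \mathbf{D}_\Sigma \mathbf{D}_Q^{-1})^{-1} (\mathbf{S}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi}) = \sum_{i=1}^{l} \frac{q_i v_i^2}{(\lambda + d_i)^2} = 0. \tag{3.104}
\]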

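We comment that the above has real roots by examining the summation at $\lambda \approx -d_i$ and by the fact that $\{q_i\}$ have both positive and negative values but not zero (recall the assumption on $\mathbf{Q}$). However there are only finitely many real $\lambda$ values satisfying (3.104), which are denoted by $\{\lambda_k\}$. Now by (3.101),

\[
\begin{aligned}
\mathbf{A}\boldsymbol{\theta} - \boldsymbol{\varphi} &= \left[ \mathbf{A} (\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} + \lambda_k \mathbf{Q})^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} - \mathbf{I} \right] \boldsymbol{\varphi} \\
&= \left[ \mathbf{A}\mathbf{Q}^{-1} (\lambda_k \mathbf{I} + \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A}\mathbf{Q}^{-1})^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} - \mathbf{I} \right] \boldsymbol{\varphi} \\
&= \left[ (\lambda_k \mathbf{I} + \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1})^{-1} \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} - \mathbf{I} \right] \boldsymbol{\varphi} \\
&= -\lambda_k (\lambda_k \mathbf{I} + \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1})^{-1} \boldsymbol{\varphi} \\
&= -\lambda_k \boldsymbol{\Sigma} (\lambda_k \boldsymbol{\Sigma} + \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T)^{-1} \boldsymbol{\varphi}.
\end{aligned} \tag{3.105}
\]

Substituting the above into the performance index $J$ in (3.100) leads to

\[
2J = \lambda_k^2\, \boldsymbol{\varphi}^T (\lambda_k \boldsymbol{\Sigma} + \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T)^{-1} \boldsymbol{\Sigma} (\lambda_k \boldsymbol{\Sigma} + \mathbf{A}\mathbf{Q}^{-1} \mathbf{A}^T)^{-1} \boldsymbol{\varphi}. \tag{3.106}
\]

Let $\lambda_k^{opt}$ be the value that minimizes $J$ over $\{\lambda_k\}$. Then in light of (3.101), the optimal $\boldsymbol{\theta}$ is obtained as

\[
\boldsymbol{\theta}_{opt} = [\mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} + \lambda_k^{opt} \mathbf{Q}]^{-1} \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\varphi}. \tag{3.107}
\]

To facilitate the MATLAB programming in simulation for roots computation we can convert (3.104) to

\[
\sum_{i=1}^{l} q_i v_i^2 \prod_{k \ne i} (\lambda + d_k)^2 = 0. \tag{3.108}
\]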
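The conversion from the rational form (3.104) to the polynomial form (3.108) can be sketched as below. The code only builds the polynomial coefficients and evaluates both forms; the function names are illustrative, and the input values $q_i$, $v_i$, $d_i$ are assumed to come from the simultaneous diagonalization described above. Finding the real roots would then be left to any standard polynomial root finder.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def multiplier_polynomial(q_vals, v_vals, d_vals):
    """Coefficients of sum_i q_i v_i^2 prod_{k != i} (lam + d_k)^2, as in (3.108)."""
    size = len(d_vals)
    total = [0.0] * (2 * (size - 1) + 1)
    for i in range(size):
        term = [q_vals[i] * v_vals[i] ** 2]
        for k in range(size):
            if k != i:
                factor = [d_vals[k], 1.0]            # (lam + d_k)
                term = poly_mul(poly_mul(term, factor), factor)
        for n, coeff in enumerate(term):
            total[n] += coeff
    return total


def poly_eval(coeffs, lam):
    """Horner evaluation; coefficients are ordered lowest degree first."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * lam + c
    return result


def rational_form(q_vals, v_vals, d_vals, lam):
    """Left-hand side of (3.104): sum_i q_i v_i^2 / (lam + d_i)^2."""
    return sum(q * v * v / (lam + d) ** 2 for q, v, d in zip(q_vals, v_vals, d_vals))
```

Away from the poles $\lambda = -d_i$, the polynomial equals the rational form multiplied by $\prod_k (\lambda + d_k)^2$, so both have the same roots there. The polynomial has degree $2(l-1)$, i.e. degree 4 for the $3 \times 3$ matrix $\mathbf{Q}$ of the TDOA problem.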


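The solution algorithm above is developed for location estimation with only TDOA measurements available. If both TDOA and AOA measurements are collected, as discussed in the previous section, the extra redundancy indicates an improved accuracy. According to (3.86), we can formulate it into a similar constrained LS optimization problem. Denote

\[
\boldsymbol{\Sigma} = \begin{bmatrix} \mathbf{W}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_2 \end{bmatrix}; \quad
\mathbf{A} = \begin{bmatrix} \mathbf{H}_1 & \mathbf{b} \\ \mathbf{H}_2 & \mathbf{0} \end{bmatrix}; \quad
\boldsymbol{\theta} = \begin{bmatrix} \mathbf{p}_T \\ R_{1,T} \end{bmatrix}; \quad
\boldsymbol{\varphi} = \begin{bmatrix} \mathbf{a} \\ \boldsymbol{\phi} \end{bmatrix}; \quad
\mathbf{Q} = \begin{bmatrix} -\mathbf{I}_2 & \mathbf{0} \\ \mathbf{0} & 1 \end{bmatrix}, \tag{3.109}
\]

where $\boldsymbol{\phi}$ denotes the AOA pseudo-measurement vector of (3.69).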

Then we can use the same Lagrange multiplier method to give a solution.


3.4 Simulations

In this section, we present a set of simulation results that demonstrate the perfor-

mance of our proposed estimation algorithm.

In the simulation, there are nine base stations: BS1 at the origin and the other eight equally spaced around a circle. In a real WiMax system, the base stations may not be located exactly on a circle. This layout is chosen simply for ease of presentation and is not required by our algorithm, which is applicable to any geographical distribution of any number of base stations. To test the accuracy of our location method, ten positions for the mobile user are chosen, distributed around a smaller circle. This assumption about the MS route is likewise made for ease of demonstration. The configuration is shown in Figure 3.5.

Figure 3.5: Base stations and mobile user locations (plot of BS1 to BS9 and MS1 to MS10; x and y in meters)


The base stations are at BS1 = $[0, 0]^T$, BS2 = $[32000, 0]^T$, BS3 = $[22627, 22627]^T$, BS4 = $[0, 32000]^T$, BS5 = $[-22627, 22627]^T$, BS6 = $[-32000, 0]^T$, BS7 = $[-22627, -22627]^T$, BS8 = $[0, -32000]^T$, BS9 = $[22627, -22627]^T$. The unit is meters. For each MS position, a total of 2000 different data sets are run and the MS location is obtained by averaging over all 2000 estimates.
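The listed coordinates are consistent with one station at the origin and eight stations spaced 45 degrees apart on a circle of radius 32000 m, rounded to the nearest meter. The sketch below regenerates them under that reading; the function name and its parameters are illustrative and not taken from the simulation code.

```python
import math


def base_station_layout(radius=32000.0, n_ring=8):
    """BS1 at the origin plus n_ring stations equally spaced on a circle (meters)."""
    stations = [(0, 0)]
    for i in range(n_ring):
        angle = 2.0 * math.pi * i / n_ring
        stations.append((round(radius * math.cos(angle)),
                         round(radius * math.sin(angle))))
    return stations
```

Calling `base_station_layout()` with the defaults should reproduce BS1 through BS9 in the order given above.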

In the experiments, our location algorithm is simulated for TDOA data only and for a combination of AOA and TDOA data, respectively. In Figure 3.6, the green line is the result from a combination of AOA and TDOA data when the SNRs are SNRtdoa = 20 dB and SNRaoa = 20 dB, respectively. It almost merges with the blue line, which represents the real MS positions and is therefore hidden in the figure. This shows the high accuracy of the estimation algorithm we propose in this thesis. With the same SNRtdoa = 20 dB, the cyan line is the estimation result from the TDOA data only. It can be seen that there is a small deviation from the real position. Intuitively, with the extra information from the AOA measurements, the result in the green line is expected to be closer to the real positions. From the Fisher information matrices we calculated in the previous sections, the Cramér-Rao bound for the combined TDOA and AOA data should be smaller than that of the TDOA data only.

Figure 3.6: Location estimation with TDOA-only and AOA+TDOA data (x and y axes in
meters, scaled by 10^4; legend: BS, Known, TDOA, AOA+TDOA)

To take a closer look at the performance of the proposed algorithm, we calculate
the approximate mean and standard deviation of the estimation error, i.e., the
distance between the real MS position and the estimated position, from a sample of
2000 data points. In Figure 3.7, the average estimation error is less than 4 meters
for all ten MS locations when the TDOA data has a high SNR. To study the effect of
SNR on the performance of the proposed location algorithm, the MS position MS2 is
randomly selected, and the mean and the standard deviation of its estimation error
are plotted against SNR in Figure 3.8. At low SNR the estimate is not accurate
enough, because our assumption of a small measurement noise variance is no longer
valid.
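The error statistics described above can be sketched as follows. This is only a minimal illustration, not the simulation code used for the figures: the function name `location_error_stats`, the made-up MS position and the synthetic Gaussian scatter standing in for the 2000 estimates are all assumptions, and the standard deviation is the population form (`ddof=0`).

```python
import numpy as np

def location_error_stats(true_position, estimates):
    """Mean and standard deviation of the Euclidean location error.

    true_position: array-like of shape (2,), the real MS position in meters.
    estimates: array-like of shape (n_trials, 2), one estimate per data set.
    """
    true_position = np.asarray(true_position, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    # Distance between each estimated position and the real position.
    errors = np.linalg.norm(estimates - true_position, axis=1)
    return errors.mean(), errors.std()

# Hypothetical example: 2000 synthetic "estimates" scattered around a made-up
# MS position; they stand in for the output of a location algorithm.
rng = np.random.default_rng(0)
ms_true = np.array([1000.0, -2000.0])
ms_estimates = ms_true + rng.normal(scale=3.0, size=(2000, 2))
mean_error, std_error = location_error_stats(ms_true, ms_estimates)
print(f"mean error = {mean_error:.2f} m, std = {std_error:.2f} m")
```

Repeating the call for each MS position, or for each SNR value at a fixed position, would give curves of the kind plotted in Figures 3.7 and 3.8.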

FCC regulations require that, for 67% of E911 calls, wireless service providers
provide an estimated location with a location error below 100 m. As shown in
Figure 3.9, the location error is below 100 m for 98% of the time with
SNRtdoa = 40 dB, which is well above the FCC requirement.
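The comparison against the FCC requirement amounts to reading one point off the empirical distribution of the location error. A minimal sketch is given below, assuming the per-trial errors are available as an array; the function names and the sample values are hypothetical, and only the 67% / 100 m rule quoted above is encoded.

```python
import numpy as np

def fraction_within(errors, threshold_m):
    """Fraction of trials whose location error is below threshold_m (meters)."""
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors < threshold_m))

def meets_requirement(errors, threshold_m=100.0, required_fraction=0.67):
    """True if at least required_fraction of the errors are below threshold_m."""
    return fraction_within(errors, threshold_m) >= required_fraction

# Hypothetical error samples in meters, for illustration only.
sample_errors = [10.0, 50.0, 150.0, 90.0]
print(fraction_within(sample_errors, 100.0), meets_requirement(sample_errors))
```

Evaluating `fraction_within` over a grid of thresholds from 0 to 100 m would trace a curve of the kind shown in Figure 3.9.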

Figure 3.7: Location estimation performance (TDOA; mean and standard deviation of
the estimation error in meters versus mobile station position 1 to 10)

Figure 3.8: Effect of SNR on estimation accuracy (TDOA; mean and standard deviation
of the estimation error in meters versus SNR in dB)

Figure 3.9: Outage curve for location accuracy (AOA+TDOA; 1 − outage in % versus
location error in meters, with the FCC requirement marked)

The above figures demonstrate that the proposed algorithm provides an accurate
estimate of the MS location and meets the FCC requirement for outdoor
network-based wireless location.

3.5 Chapter Summary

This chapter gives an introduction to WiMax networks, the evolution of the IEEE
standard and its applications, and reviews the outdoor and indoor wireless location
technologies based on measurements of TOAs, TDOAs, AOAs and amplitudes.

With measurements of TDOA and AOA available, we present a constrained LS-type
algorithm to estimate the target location. The proposed method differs from the
commonly used ML algorithm, although the latter is heavily preferred in some
applications for its superior performance. Because of the large number of
observation data and the additive measurement noise, maximizing the likelihood
function involves a heavy computational load, and it does not even guarantee that
the optimal estimate is obtained, because the likelihood function may have local
extrema. Under the assumption of zero-mean additive Gaussian noise with a very small
variance, the location estimation problem is formulated into a quasi-linear form,
which is solvable by the LS algorithm. This assumption usually holds in practice, as
in [54]. Therefore, our method retains the desirable property of the ML algorithm
that it approaches the Cramer-Rao bound with a large sample of observation data.
More importantly, the LS algorithm reduces the computational complexity. As shown
in this chapter, the LS algorithm also involves a constraint, θ̂ = R1,T. Without
further treatment, the target location can only be obtained by substituting the
intermediate LS solution into the constraint and solving the resulting quadratic
equation, which brings complexity back to the solution. Hence the Lagrange
multiplier method is used to solve the constrained LS optimization problem. The
simulation results show that our scheme is effective in location estimation.
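To make the idea of a quasi-linear form concrete, the sketch below solves an unconstrained LS problem for TDOA data only. It assumes the common algebraic formulation in which the unknown vector stacks the target position and the range R1 to a reference station, so that 2(s_i − s_1)^T p + 2 r_i1 R1 = ||s_i||^2 − ||s_1||^2 − r_i1^2 for each range difference r_i1 = R_i − R1. It is not the constrained algorithm of this chapter: the relation between R1 and the position is ignored, AOA data are not used, and the station layout and function name are hypothetical.

```python
import numpy as np

def tdoa_quasilinear_ls(stations, range_diffs):
    """Unconstrained LS solution of the quasi-linear TDOA equations.

    stations: array of shape (M, 2); stations[0] is the reference station.
    range_diffs: array of shape (M - 1,), r_i1 = R_i - R_1 (TDOA times the
        propagation speed), for i = 2..M.
    Returns the position estimate (2,) and the estimate of R_1.
    """
    stations = np.asarray(stations, dtype=float)
    range_diffs = np.asarray(range_diffs, dtype=float)
    s1 = stations[0]
    # Each row: [2 (s_i - s_1)^T, 2 r_i1] acting on the unknown [x, y, R_1].
    A = np.column_stack((2.0 * (stations[1:] - s1), 2.0 * range_diffs))
    k = np.sum(stations**2, axis=1)
    b = k[1:] - k[0] - range_diffs**2
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta[:2], theta[2]

# Hypothetical layout (meters) and noise-free range differences.
bs = np.array([[0, 0], [8000, 0], [0, 8000], [-8000, 0], [0, -8000]], dtype=float)
target = np.array([1500.0, -2300.0])
ranges = np.linalg.norm(bs - target, axis=1)
position, r1 = tdoa_quasilinear_ls(bs, ranges[1:] - ranges[0])
print(position, r1)
```

With noisy range differences the third component of the solution is in general no longer consistent with the estimated position, which is where a constraint of the kind discussed above comes in.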

Chapter 4

Conclusions

The first part of this dissertation addresses the problem of channel estimation for
MIMO-OFDM systems. It starts from a matrix representation of the MIMO-OFDM signal
model, which clearly describes the relation between signals in the frequency domain
and in the time domain and expresses operations such as adding and removing the CP
as matrix products. From the resulting MIMO-OFDM signal model, a pilot-tone based
channel estimator is proposed to estimate the fast time-varying and
frequency-selective fading channel via the least-squares (LS) method. The LS method
is selected for its low complexity, though other methods such as MMSE and ML may
produce better estimation performance. To further reduce the computational
complexity, the pilot tone matrix is designed as a unitary matrix, which avoids the
matrix inversion in the standard LS solution. The pilot tone matrix is designed in a
simple way: Nt disjoint pilot tone sets are placed in one OFDM block on each
transmit antenna, and each pilot tone set has L pilot tones which are equally spaced
and equally powered. With the pilot tones chosen according to our design, they
comprise a unitary matrix. For the simple 2 × 2 case, Alamouti’s orthogonal
structure is exploited, and the design can be readily extended to a configurable
MIMO-OFDM system with any number of transmit and receive antennas. For a fixed
pilot-tone power, our design can also be proved optimal in the sense of achieving
the minimum MSE of channel estimation. Compared with related pilot tone designs in
the literature, our channel estimation method differs in its ability to estimate
fast time-varying wireless channels, since pilot tones are inserted into each OFDM
block, and in its explicit relation with space-frequency code design, which can
benefit the channel estimation in return. In seeking a robust channel estimator with
lower complexity for MIMO-OFDM systems, we plan to look at the following aspects
in the future.

• Less overhead loss: The use of pilot symbols for channel estimation decreases
the spectral efficiency of a wireless communication system, so there is a
trade-off between data throughput and estimation accuracy. It is of interest to
investigate a scheme with even fewer pilot tones in each OFDM block by exploiting
statistical properties of the wireless channel itself. Intuitively, the best
balance between overhead loss and estimation reliability is achieved if the number
of pilot tones can be changed adaptively, depending on the channel condition,
through some feedback information.

• Joint channel estimation and CFO correction: A channel estimator is usually
designed under the assumption that the OFDM system is perfectly synchronized and
there is no carrier frequency offset (CFO) at all, while some CFO compensation
algorithms are in turn based on the assumption that the channel is known at the
receiver. It would be beneficial to combine channel estimation and CFO
compensation into an integrated algorithm, since the performance of either
individual algorithm can be degraded when its assumption does not hold in
real-world OFDM systems. There is already some research work in this area
[34, 35], but more intensive investigation is still needed.

We still have to consider the data rate loss caused by the pilot-tone overhead
within each OFDM block. We are currently working on this issue, with the goal of
using a pilot-tone sequence shorter than the channel length by exploiting diversity
in the time domain.
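The benefit of equally spaced, equally powered pilot tones can be illustrated with a small numerical sketch. It assumes a single transmit antenna, a noise-free channel with L taps, unit-modulus pilot symbols and a subcarrier count that is a multiple of L; the function names and parameter values are hypothetical and are not taken from the design in this dissertation. Under these assumptions the L × L matrix F that maps the channel taps to the pilot tones satisfies F^H F = L I, so the LS estimate needs only a conjugate-transpose product and no matrix inversion.

```python
import numpy as np

def pilot_tone_matrix(n_subcarriers, n_taps, offset=0):
    """Equally spaced pilot tone indices and the matrix mapping taps to tones."""
    if n_subcarriers % n_taps != 0:
        raise ValueError("n_subcarriers must be a multiple of n_taps")
    spacing = n_subcarriers // n_taps
    if not 0 <= offset < spacing:
        raise ValueError("offset must satisfy 0 <= offset < spacing")
    tones = offset + spacing * np.arange(n_taps)
    taps = np.arange(n_taps)
    # F[i, l] = exp(-j 2 pi k_i l / N): frequency response of tap l at tone k_i.
    F = np.exp(-2j * np.pi * np.outer(tones, taps) / n_subcarriers)
    return tones, F

def ls_estimate(received, pilots, F):
    """LS channel estimate using F^H F = L I instead of a matrix inverse."""
    n_taps = F.shape[1]
    return F.conj().T @ (received / pilots) / n_taps

# Hypothetical example: 64 subcarriers, 4 channel taps, pilot set at offset 3.
tones, F = pilot_tone_matrix(64, 4, offset=3)
h_true = np.array([1.0, 0.5j, -0.25, 0.1 + 0.1j])
pilots = np.exp(1j * np.pi / 4) * np.ones(4)   # equal-power pilot symbols
received = pilots * (F @ h_true)               # noise-free pilot observations
print(np.allclose(ls_estimate(received, pilots, F), h_true))
```

Different values of `offset` give disjoint pilot sets, which is one way such sets could be assigned to different transmit antennas.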

The second part of this dissertation studies wireless location in WiMax networks.
Similar to the location technology applied in cellular networks, we consider the
application scenario of locating a mobile user from signal parameters received at
the antenna towers. Location estimation methods based on TDOA, on AOA, and on a
combination of TDOA and AOA are presented. Under the assumption that the
measurement noise is zero-mean additive Gaussian noise with very small variance, the
location estimation problem is formulated into a quasi-linear form. The simple LS
algorithm can then be used to solve the estimation problem, provided that the noise
term in the quasi-linear form is Gaussian. In theory, the ML algorithm can be
applied directly to estimate the target location, since the probability density
function of the observation data is known under our assumption. However, direct use
of the ML algorithm proves infeasible because of the difficulty of finding the real
roots of a quadratic equation. An alternative is required that drastically reduces
the complexity of the ML algorithm while providing comparable performance. Our
proposed method is such an alternative; it is essentially a constrained LS-type
optimization technique. This thesis also proves that, under the above assumption,
the noise term in the quasi-linear form is approximately a Gaussian random variable.
Hence it is concluded that the proposed method can estimate the target location very
accurately, provided that the observation data set is large enough and the
equivalent SNR is high. The constrained LS-type optimization problem is solved with
the Lagrange multiplier method, because direct use of the constraint may lead to the
same level of complexity and, moreover, the quadratic equation obtained by
substituting the intermediate LS solution into the constraint may have no positive
real roots. Finally, extensive simulation studies have demonstrated the
effectiveness of the proposed algorithm.

For future work on the wireless location problem, the following aspects are open
for research.

• Large variance: The approximation of the ML algorithm by the constrained LS-type
optimization depends on the assumption that the measurement noise variance is very
small, which is usually true. Further research on the case of measurement noise
with relatively large variance would improve the robustness of the proposed
algorithm.

• Velocity estimation: In this thesis, the target is treated as stationary by
assuming that it moves at a low speed. If the FDOA (frequency difference of
arrival) of the received signal is available, the velocity of the target can be
estimated as well, which would extend the range of applications of the proposed
algorithm.

Bibliography

[1] Richard Van Nee and Ramjee Prasad, OFDM For Wireless Multimedia Commu-
nications, Artech House Publishers, Norwood MA, 2000.
[2] R. W. Chang, “Synthesis of band-limited orthogonal signals for multichannel
data transmission,” BSTJ, pp. 1775-1797, Dec. 1966.
[3] B. R. Saltzberg, “Performance of an efficient parallel data transmission system,”
IEEE Trans. on Comm. Tech., pp. 805-811, Dec. 1967.
[4] S. B. Weinstein and P. M. Ebert, “Data transmission by frequency-division
multiplexing using the discrete Fourier transform,” IEEE Trans. on Commun.,
COM-19(5), pp. 628-634, Oct. 1971.
[5] L.J. Cimini, Jr., “Analysis and simulation of a digital mobile channel using or-
thogonal frequency division multiplexing,” IEEE Trans. on Communications.,
vol. 33, pp. 665-675, July 1985.
[6] A. Peled and A. Ruiz, “Frequency domain data transmission using reduced
computational complexity algorithms,” In Proc. IEEE ICASSP, pp. 964-967, Denver,
CO, 1980.
[7] A. Vahlin and N. Holte, “Optimal finite duration pulses for OFDM,” IEEE Trans.
Commun., 44(1), pp. 10-14, Jan. 1996.
[8] B. Le Floch, M. Alard and C. Berrou, “Coded orthogonal frequency-division
multiplexing,” Proc. IEEE, 83(6), pp. 982-996, Jun. 1995.
[9] T. Pollet, M. Van Bladel and M. Moeneclaey, “BER sensitivity of OFDM systems
to carrier frequency offset and Wiener phase noise,” IEEE Trans. on Comm., Vol.
43, No. 2/3/4, pp. 191-193, Feb.-Apr., 1995.
[10] P. H. Moose, “A technique for orthogonal frequency division multiplexing fre-
quency offset correction,” IEEE Trans. on Comm., Vol. 42, No. 10, pp. 2908-
2914, Oct., 1994.
[11] T. M. Schmidl and D. C. Cox, “Robust frequency and timing synchronization
for OFDM,” IEEE Trans. on Comm., Vol. 45, No. 12, pp. 1613-1621, Dec., 1997.


[12] R. D. J. van Nee, “OFDM codes for peak-to-average power reduction and error
correction,” IEEE Global Telecommunications Conference, London, pp. 740-744,
Nov., 1996.

[13] J. A. Davis and J. Jedwab, “Peak-to-average power control in OFDM, Golay
complementary sequences and Reed-Muller codes,” HP Laboratories Technical
Report, HPL-97-158, Dec., 1997.

[14] A. Tarighat and A. H. Sayed, “MIMO OFDM receivers for systems with IQ
imbalances,” IEEE Transactions on Signal Processing, vol. 53, no. 9, pp. 3583-
3596, Sep. 2005.

[15] A. Tarighat, R. Bagheri, and A. H. Sayed, “Compensation schemes and perfor-
mance analysis of IQ imbalances in OFDM receivers,” IEEE Transactions on
Signal Processing, vol. 53, no. 8, pp. 3257-3268, Aug. 2005.

[16] S. Alamouti, “A simple transmit diversity technique for wireless communica-
tions,” IEEE J. Select. Areas Communication, vol. 16, pp. 1451-1458, Oct., 1998.

[17] G. J. Foschini, “Layered space-time architecture for wireless communication in
a fading environment when using multi-element antennas,” Bell Labs. Tech. J.,
pp. 41-59, Autumn, 1996.

[18] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from
orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45, pp. 1456-1467, July
1999.

[19] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data
rate wireless communications: Performance criterion and code construction,”
IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, March 1998.

[20] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna
communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol.
45, pp. 139-157, Jan. 1999.

[21] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading
environment when using multiple antennas,” Wireless Pers. Commun., vol. 6,
no. 3, pp. 311-335, Mar. 1998.

[22] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Euro. Trans.
Telecommun., vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.

[23] A. Wittneben, “A new bandwidth efficient transmit antenna modulation diversity
scheme for linear digital modulation,” Proc. ICC, pp. 1630-1634, 1993.


[24] Jan Mietzner and Peter A. Hoeher, “Boosting the performance of wireless
communication systems: theory and practice of multiple-antenna techniques,” IEEE
Communications Magazine, no. 10, pp. 40-47, Oct. 2004.

[25] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna
communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol.
45, no. 1, pp. 139-157, 1999.

[26] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold: A
geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans.
Inform. Theory, vol. 48, no. 2, pp. 359-383, Feb. 2002.

[27] I. Barhumi, G. Leus and M. Moonen, “Optimal training design for MIMO OFDM
systems in mobile wireless channels,” IEEE Trans. Signal Processing, vol. 51, No.
6, pp. 1615-1624, Jun. 2003.

[28] Allert van Zelst and Tim C. W. Schenk, “Implementation of a MIMO OFDM-
based Wireless LAN system,” IEEE Trans. Signal Processing, vol. 52, No. 2, pp.
483-494, Feb. 2004.

[29] X. Li, H. Huang, G. J. Foschini and R. A. Valenzuela, “Effects of iterative
detection and decoding on the performance of BLAST,” IEEE Proc. Global
Telecommun. Conf., vol. 2, No. 2, pp. 1061-1066, 2000.

[30] A. Salvekar, S. Sandhu, Q. Li, M. Vuong and X. Qian, “Multiple-Antenna
Technology in WiMax Systems,” Intel Technology Journal, vol. 8, No. 3, [online]:
http://www.intel.com/technology/itj/2004/volume08issue03, Aug. 2004.

[31] Hongwei Yang, “A road to future broadband wireless access: MIMO-OFDM-
Based air interface,” IEEE Communications Magazine, Vol. 43, No. 1, pp. 53 -
60, Jan. 2005.

[32] H. Bölcskei, M. Borgmann and A. J. Paulraj, “Impact of the propagation
environments on the performance of space-frequency coded MIMO-OFDM,” IEEE
J. Select. Areas Commun., vol. 21, No. 3, pp. 427-439, Apr. 2003.

[33] H. Bölcskei and A. J. Paulraj, “Space-frequency coded broadband OFDM
systems,” Proc. IEEE WCNC, pp. 1-6, Chicago, IL, Sep. 2000.

[34] X. Ma, H. Kobayashi and S. C. Schwartz, “Joint frequency offset and channel
estimation for OFDM,” Proc. of Global Telecommun. Conf., pp. 15-19, Dec.
2003.

[35] P. Stoica and O. Besson, “Training sequence design for frequency offset and
frequency-selective channel estimation,” IEEE Trans. on Commun., vol. 51, No.
11, pp. 1910-1917, Nov. 2003.


[36] Nima Khajehnouri and Ali H. Sayed, “Adaptive angle of arrival estimation for
multiuser wireless location systems,” Fifth IEEE Workshop on Signal Processing
Advances in Wireless Communications, Lisboa, Portugal, July 11-14, 2004.

[37] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer
(PHY) Specifications—Amendment 1: High-speed Physical Layer in the 5 GHz
Band, IEEE Standard 802.11a-1999.

[38] M. Brookes, “Matrix Reference Manual [online],” available:
http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/.

[39] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer
(PHY) Specifications—Amendment 1: High-speed Physical Layer in the 5 GHz
Band, IEEE Standard 802.11a-1999.

[40] Part 16: Air Interface for Fixed Broadband Wireless Access Systems—
Amendment 2: Medium Access Control Modifications and Additional Physical
Layer Specifications for 2-11 GHz, IEEE Standard 802.16a-2003.

[41] Digital broadcasting systems for television, sound and data services. European
Telecommunications Standard, prETS 300 744 (Draft, version 0.0.3), Apr. 1996.

[42] H. Sampath, S. Talwar, J. Tellado, V. Erceg and A. Paulraj, “A fourth-generation
MIMO-OFDM broadband wireless system: design, performance and field trial
results,” IEEE Communications Magazine, No. 9, pp. 143-149, Sep., 2002.

[43] Justin Chuang and Nelson Sollenberger, “Beyond 3G: Wideband wireless data
access based on OFDM and dynamic packet assignment,” IEEE Communications
Magazine, No. 7, pp. 78-87, Jul., 2000.

[44] Z. Liu, G. Giannakis, S. Barbarossa, and A. Scaglione, “Transmit-antennae
space-time block coding for generalized OFDM in the presence of unknown multipath,”
IEEE J. Select. Areas Communication, vol. 19, no. 7, pp. 1352-1364, Jul. 2001.

[45] S. Yatawatta and A. P. Petropulu, “Blind channel estimation in MIMO OFDM
systems,” IEEE Trans. Signal Processing, submitted,
http://www.ece.drexel.edu/CSPL/publications/ssp03sa-rod.pdf

[46] H. Bölcskei, R. W. Heath Jr. and A. Paulraj, “Blind channel identification and
equalization in OFDM-based multiantenna systems,” IEEE Trans. Signal
Processing, vol. 50, No. 1, pp. 96-109, Jan. 2002.

[47] Y. Li, N. Seshadri and S. Ariyavisitakul, “Channel estimation for OFDM systems
with transmitter diversity in mobile wireless channels,” IEEE J. Select. Areas
Communication, vol. 17, pp. 461-471, March 1999.


[48] Y. Li, “Simplified channel estimation for OFDM systems with multiple transmit
antennas,” IEEE Trans. Wireless Communications, vol. 1, No. 1, pp. 67-75, Jan.
2002.

[49] R. Negi and J. Cioffi, “Pilot tone selection for channel estimation in a mobile
OFDM system,” IEEE Trans. Consumer Electronics, vol. 44, No. 3, pp. 1122-1128,
August 1998.

[50] G. L. Stüber, J. R. Barry, S. W. McLaughlin, Y. Li, M. A. Ingram and T. G.
Pratt, “Broadband MIMO-OFDM wireless communications,” Proceedings of the
IEEE, vol. 92, No. 2, pp. 271-294, Feb. 2004.

[51] W. C. Jakes, Microwave Mobile Communications, John Wiley and Sons, New
York, 1974.

[52] R. O. Schmidt, “Multiple emitter location and signal parameter estimation”, in
Proc. RADC, Spectral Estimation Workshop, Rome, NY, pp. 243-258.

[53] A. H. Sayed, A. Tarighat, and N. Khajehnouri, “Network-based wireless loca-
tion,” IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 24-40, July 2005.

[54] K. C. Ho and Wenwei Xu, “An accurate algebraic solution for moving source lo-
cation using TDOA and FDOA measurements,” IEEE Trans. Signal Processing,
vol. 52, no. 9, pp. 2453-2463, Sep. 2004.

[55] “Wireless location technologies and service [online],” available:
http://www.3gamericas.org/English/

[56] PELORUS Group. Report on wireless location-based markets. Technical Report,
2001.

[57] In-Stat/MDR. Location-based services: Finding their place in the market.
Technical Report, Feb. 2003.

[58] A. H. Sayed and N. R. Yousef, Wireless location. Wiley Encyclopedia of
Telecommunications, J. Proakis, editor, John Wiley & Sons, NY, 2003.

[59] FCC Docket No. 94-102. Revision of the Commission’s rules to ensure compatibility
with enhanced 911 emergency calling systems. Technical Report RM-8143, July
1996.

[60] State of New Jersey. Report on the New Jersey wireless enhanced 911 terms:
The first 100 days. Technical Report, Jun. 1997.

[61] M. Yunos, J. Zeyu Gao and S. Shim, Wireless advertising’s challenges and
opportunities. IEEE Computer Magazine, vol. 36, No. 5, pp. 30-37, May, 2003.


[62] Telecommunications Industry Association. The CDMA2000 ITU-R RTT Candi-
date Submission V0.18, Jul. 1998.

[63] J. J. Caffery and G. L. Stuber, “Overview of radiolocation in CDMA cellular
systems,” IEEE Communications Magazine, vol. 36, No. 4, pp. 38-45, Apr. 1998.

[64] H. Krim and M. Viberg, “Two decades of array signal processing research: The
parametric approach,” IEEE Signal Processing Magazine, vol. 13, No. 4, pp.
67-94, Jul. 1996.

[65] T. Ojanpera and R. Prasad, Wideband CDMA for third generation mobile
communications. Artech House, Boston, MA 1998.

[66] R. Prasad, W. Mohr and W. Konhauser, Third generation mobile communications.
Artech House, Boston, MA 2000.

[67] P. Bahl and V. N. Padmanabhan, “RADAR: an in-building RF-based user location
and tracking system,” Proc. IEEE Conference INFOCOM, Vol. 2, pp. 775-784,
Tel Aviv, March 2000.

[68] T. Roos, P. Myllymaki and H. Tirri, “A statistical modeling approach to location
estimation,” IEEE Trans. On Mobile Computing, Vol. 1, No. 1, pp. 59-69, Jan.
2002.

[69] M. Youssef, A. Agrawala and A. U. Shankar, “WLAN location determination via
clustering and probability distributions,” Proc. IEEE Conference PerCom, pp.
143-150, March 2003.

[70] G. H. Golub and C. F. Van Loan, “Matrix Computations”, 2nd Edition, Balti-
more: The Johns Hopkins University Press, 1989.

[71] John G. Proakis, “Digital Communications”, 4th Edition, Prentice Hall, New
Jersey, 2000.

[72] Jerry M. Mendel, “Lessons in estimation theory for signal processing,
communications and control,” 2nd Edition, Prentice Hall PTR, Englewood Cliffs, New
Jersey, March 1995.

[73] Athanasios Papoulis and S. Unnikrishna Pillai, “Probability, Random Variables
and Stochastic Processes,” 4th Edition, McGraw-Hill, Dec. 2001.

[74] P. Stoica, and R. Moses, “Introduction to Spectral Analysis.” Upper Saddle
River, NJ: Prentice Hall, 1997.

Vita

Zhongshan Wu was born in Anhui, China, on December 4, 1974. He received his
bachelor of science degree in electrical engineering from Northeastern University in
July 1996. In spring 2000, he entered the graduate program in the Department of
Electrical and Computer Engineering at Louisiana State University, where he received
his master of science degree in electrical engineering in December 2001. He is
currently a candidate for the degree of doctor of philosophy in electrical
engineering.
