MIMO-OFDM COMMUNICATION SYSTEMS:
 CHANNEL ESTIMATION AND WIRELESS
            LOCATION




                        A Dissertation

           Submitted to the Graduate Faculty of the
               Louisiana State University and
             Agricultural and Mechanical College
                  in partial fulfillment of the
               requirements for the degree of
                     Doctor of Philosophy



                              in



    The Department of Electrical and Computer Engineering




                              by
                        Zhongshan Wu
          B.S., Northeastern University, China, 1996
          M.S., Louisiana State University, US, 2001
                          May 2006
To my parents.




Acknowledgments

   Throughout my six years at LSU, I have many people to thank for helping to

make my experience here both enriching and rewarding.

   First and foremost, I wish to thank my advisor and committee chair, Dr. Guoxiang
Gu. I am grateful to Dr. Gu for offering me such an invaluable chance to study here,
for being a constant source of research ideas, insightful discussions and inspiring
words in times of need, and for his strict attitude toward academic research, which
will shape my career forever.

   My heartfelt appreciation also goes to Dr. Kemin Zhou, whose breadth of knowledge
and perspective have instilled in me a great interest in bridging theoretical research
and practical implementation. I would like to thank Dr. Shuangqing Wei for the fresh
talks in his seminar and for generously sharing research resources with us.

   I am deeply indebted to Dr. John M. Tyler for taking the time to serve as a member
of my graduate committee and for his sincere encouragement. For providing me with
the mathematical knowledge and skills imperative to the work in this dissertation, and
for his precious time, I would like to thank my minor professor, Dr. Peter Wolenski.

   To all my EE friends, Jianqiang He, Bin Fu, Nike Liu, Xiaobo Li, Rachinayani
Kumar Phalguna and Shuguang Hao: I cherish all the wonderful times we have had
together.

   Through it all, I owe the greatest debt to my parents and my sisters. Especially

my father, he will be living in my memory for endless time.


Zhongshan Wu

October, 2005




Contents

Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     ii

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .              iii

List of Figures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

Notation and Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

List of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .            x

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   xi

1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
  1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                        3
       1.1.1 OFDM System Model . . . . . . . . . . . . . . . . . . . . . .                                                4
  1.2 Dissertation Contributions . . . . . . . . . . . . . . . . . . . . . . . . 24
  1.3 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . 27

2 MIMO-OFDM Channel Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                       28
  2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                 28
  2.2 System Description . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                   32
      2.2.1 Signal Model . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                     33
      2.2.2 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . .                                                     40
  2.3 Channel Estimation and Pilot-tone Design . . . . . . . . . . . . . . .                                                       46
      2.3.1 LS Channel Estimation . . . . . . . . . . . . . . . . . . . . . .                                                      46
      2.3.2 Pilot-tone Design . . . . . . . . . . . . . . . . . . . . . . . . .                                                    48
      2.3.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . .                                                     53
  2.4 An Illustrative Example and Concluding
      Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                  54
      2.4.1 Comparison With Known Result . . . . . . . . . . . . . . . .                                                           54
      2.4.2 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . .                                                        59




3 Wireless Location for OFDM-based Systems . . . . . . . . . . . . . . . . . . . . . .                                               62
  3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                   62
      3.1.1 Overview of WiMax . . . . . . . . . . . . . . . . . . . . . . .                                                          62
      3.1.2 Overview to Wireless Location System . . . . . . . . . . . . .                                                           65
      3.1.3 Review of Data Fusion Methods . . . . . . . . . . . . . . . . .                                                          70
  3.2 Least-square Location based on TDOA/AOA Estimates . . . . . . . .                                                              78
      3.2.1 Mathematical Preparations . . . . . . . . . . . . . . . . . . .                                                          78
      3.2.2 Location based on TDOA . . . . . . . . . . . . . . . . . . . .                                                           83
      3.2.3 Location based on AOA . . . . . . . . . . . . . . . . . . . . .                                                          94
      3.2.4 Location based on both TDOA and AOA . . . . . . . . . . . .                                                             100
  3.3 Constrained Least-square Optimization . . . . . . . . . . . . . . . . .                                                       105
  3.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                   110
  3.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .                                                       114

4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127




List of Figures

 1.1   Comparison between conventional FDM and OFDM . . . . . . . . . .                7

 1.2   Graphical interpretation of OFDM concept . . . . . . . . . . . . . . .          9

 1.3   Spectra of (a) an OFDM subchannel (b) an OFDM symbol . . . . . .               10

 1.4   Preliminary concept of DFT . . . . . . . . . . . . . . . . . . . . . . .       11

 1.5   Block diagram of a baseband OFDM transceiver . . . . . . . . . . . .           13

 1.6   (a) Concept of CP; (b) OFDM symbol with cyclic extension . . . . .             16

 2.1   Nt × Nr MIMO-OFDM System model . . . . . . . . . . . . . . . . .               34

 2.2   The concept of pilot-based channel estimation . . . . . . . . . . . . .        43

 2.3   Pilot placement with Nt = Nr = 2 . . . . . . . . . . . . . . . . . . . .       52

 2.4   Symbol error rate versus SNR with Doppler shift=5 Hz . . . . . . . .           56

 2.5   Symbol error rate versus SNR with Doppler shift=40 Hz . . . . . . .            57

 2.6   Symbol error rate versus SNR with Doppler shift=200 Hz . . . . . . .           57

 2.7   Normalized MSE of channel estimation based on optimal pilot-tone

       design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   58

 2.8   Normalized MSE of channel estimation based on preamble design . .              58

 3.1   Network-based wireless location technology (outdoor environments) .            67



3.2   TOA/TDOA data fusion using three BSs . . . . . . . . . . . . . . . .      70

3.3   AOA data fusion with two BSs . . . . . . . . . . . . . . . . . . . . .    74

3.4   Magnitude-based data fusion in WLAN networks . . . . . . . . . . .        77

3.5   Base stations and mobile user locations . . . . . . . . . . . . . . . . . 110

3.6   Location estimation with TDOA-only and AOA+TDOA data . . . . 112

3.7   Location estimation performance . . . . . . . . . . . . . . . . . . . . 113

3.8   Effect of SNR on estimation accuracy . . . . . . . . . . . . . . . . . . 113

3.9   Outage curve for location accuracy . . . . . . . . . . . . . . . . . . . 114




Notation and Symbols

$A_{M \times N}$: $M$-row, $N$-column matrix
$A^{-1}$: Inverse of $A$
$\mathrm{Tr}(A)$: Trace of $A$, $\mathrm{Tr}(A) = \sum_i A_{ii}$
$A^T$: Transpose of $A$
$A^*$: Complex conjugate transpose of $A$
$I_N$: Identity matrix of size $N \times N$




List of Acronyms

MIMO    multiple input and multiple output
OFDM    orthogonal frequency division multiplexing
LS      least square
MS      mobile station
TDOA    time difference of arrival
AOA     angle of arrival
WiMax   worldwide interoperability for microwave access
ML      maximum-likelihood
AWGN    additive white Gaussian noise
WMAN    wireless metropolitan area network
ICI     inter-carrier interference
ISI     inter-symbol interference
FFT     fast Fourier transform
WLAN    wireless local area network
CP      cyclic prefix
BER     bit error rate
MMSE    minimum mean squared error
GPS     global positioning system
WiFi    wireless fidelity




Abstract

In this new information age, high data rates and strong reliability characterize our
wireless communication systems and are becoming the dominant factors for the
successful deployment of commercial networks. MIMO-OFDM (multiple input multiple
output-orthogonal frequency division multiplexing), a new wireless broadband
technology, has gained great popularity for its capability of high-rate transmission
and its robustness against multipath fading and other channel impairments.

   A major challenge for MIMO-OFDM systems is how to obtain the channel state
information accurately and promptly for coherent detection of information symbols and
channel synchronization. In the first part, this dissertation formulates the channel
estimation problem for MIMO-OFDM systems and proposes a pilot-tone based estimation
algorithm. A complex equivalent baseband MIMO-OFDM signal model is presented in
matrix form. By choosing L equally spaced and equally powered pilot tones from the N
subcarriers in one OFDM symbol, a down-sampled version of the original signal model
is obtained. Furthermore, this signal model is transformed into a linear form solvable
by the LS (least-square) estimation algorithm. Based on the resultant model, a simple
pilot-tone design is proposed in the form of a unitary matrix, whose rows stand for
different pilot-tone sets in the frequency domain and whose columns represent distinct
transmit antennas in the spatial domain. The analysis and synthesis of the pilot-tone
design in this dissertation show that, because the pilot-tone matrix is essentially a
unitary matrix, our estimation algorithm reduces the computational complexity inherent
in MIMO systems, and that it is an optimal channel estimator in the sense of achieving
the minimum MSE (mean squared error) of channel estimation for a fixed power of pilot
tones.

   In the second part, this dissertation addresses the wireless location problem in
WiMax (worldwide interoperability for microwave access) networks, which are mainly
based on the MIMO-OFDM technology. From the measurement data of TDOA (time
difference of arrival), AOA (angle of arrival) or a combination of the two, a
quasi-linear form is formulated for an LS-type solution. It is assumed that the
observation data is corrupted by zero-mean AWGN (additive white Gaussian noise) with a
very small variance. Under this assumption, the noise term in the quasi-linear form is
proved to be approximately normally distributed. Hence the ML (maximum-likelihood)
estimation and the LS-type solution are equivalent. However, the ML estimation
technique is not feasible here because of its computational complexity and the
possible nonexistence of the optimal solution. Our proposed method is capable of
estimating the MS (mobile station) location very accurately with far less computation.
A final result of the MS location estimation, however, cannot be obtained directly
from the LS-type solution without bringing in another independent constraint. To solve
this problem, the Lagrange multiplier is explored to find the optimal solution to the
constrained LS-type optimization problem.
Chapter 1

Introduction

Wireless technologies have evolved remarkably since Guglielmo Marconi first
demonstrated radio’s ability to provide continuous contact with ships sailing in the
English Channel in 1897. New theories and applications of wireless technologies have
been developed by hundreds and thousands of scientists and engineers throughout the
world ever since. Wireless communications can be regarded as one of the most important
developments, with an extremely wide range of applications from TV remote controls and
cordless phones to cellular phones and satellite-based TV systems. It has changed
people’s lifestyle in every aspect. Especially during the last decade, the mobile radio
communications industry has grown at an exponentially increasing rate, fueled by
digital and RF (radio frequency) circuit design, fabrication and integration techniques
and by more computing power in chips. This trend will continue at an even greater pace
in the near future.

   Advances in the technical field have partially helped to realize our dream of fast
and reliable communication “anytime, anywhere”. But we expect more from this wireless
world, such as wireless Internet surfing, interactive multimedia messaging and so on.
One natural question is: how can we put high-rate data streams over radio links to
satisfy our needs? New wireless broadband access techniques are anticipated to answer
this question. For example, the coming 3G (third generation) cellular technology can
provide us with up to 2 Mbps (megabits per second) data service. But that still does
not meet the data rate required by multimedia communications like HDTV
(high-definition television) and video conferencing. Recently MIMO-OFDM systems have
gained considerable attention from leading industry companies and the active academic
community [28, 30, 42, 50]. A collection of problems including channel measurement and
modeling, channel estimation, synchronization, IQ (in-phase/quadrature) imbalance and
PAPR (peak-to-average power ratio) have been widely studied by researchers
[48, 11, 14, 15, 13]. Clearly all the performance improvements and capacity increases
depend on accurate channel state information, so channel estimation plays a
significant role in MIMO-OFDM systems. For this reason, the first part of my
dissertation works on channel estimation for MIMO-OFDM systems.

   The maturing of MIMO-OFDM technology will lead it to a much wider variety of

applications. WMAN (wireless metropolitan area network) has adopted this technol-

ogy. Similar to current network-based wireless location technique [53], we consider the

wireless location problem on the WiMax network, which is based on MIMO-OFDM

technology. The work in this area contributes to the second part of my dissertation.


1.1     Overview

OFDM [5] is becoming a very popular multi-carrier modulation technique for
transmission of signals over wireless channels. It converts a frequency-selective
fading channel into a collection of parallel flat fading subchannels, which greatly
simplifies the structure of the receiver. The time-domain waveforms of the subcarriers
are orthogonal (subchannel and subcarrier will be used interchangeably hereinafter),
yet the signal spectra corresponding to different subcarriers overlap in the frequency
domain. Hence, the available bandwidth is utilized very efficiently in OFDM systems
without causing ICI (inter-carrier interference). By combining multiple low-data-rate
subcarriers, OFDM systems can provide a composite high data rate with a long symbol
duration. That helps to eliminate the ISI (inter-symbol interference), which often
occurs with signals of short symbol duration in a multipath channel. Simply speaking,
we can list its pros and cons as follows [31].

   Advantages of OFDM systems are:


   • High spectral efficiency;


   • Simple implementation by FFT (fast Fourier transform);


   • Low receiver complexity;


   • Robustness for high-data-rate transmission over multipath fading channels;


   • High flexibility in terms of link adaptation;


   • Low complexity multiple access schemes such as orthogonal frequency division

     multiple access.


   Disadvantages of OFDM systems are:


   • Sensitive to frequency offsets, timing errors and phase noise;


   • Relatively higher peak-to-average power ratio compared to single carrier system,

     which tends to reduce the power efficiency of the RF amplifier.
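
The PAPR remark in the list above can be illustrated with a minimal numerical sketch. It is an illustration only, not part of the original analysis: the subcarrier count, the QPSK alphabet and the fixed random seed are assumptions, and the single-carrier reference is taken at symbol-rate samples without pulse shaping.

```python
import cmath
import math
import random


def idft(freq):
    """Inverse DFT: s[n] = (1/N) * sum_k S[k] * exp(j*2*pi*k*n/N)."""
    n_points = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of a block of samples, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


QPSK = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]  # assumed constellation

random.seed(0)  # fixed seed so the illustration is repeatable
N = 64          # assumed number of subcarriers
symbols = [random.choice(QPSK) for _ in range(N)]

# Single carrier: the QPSK symbols are sent one after another (constant power).
single_carrier_papr = papr_db(symbols)
# OFDM: the same symbols are placed on N subcarriers and summed by the IDFT.
ofdm_papr = papr_db(idft(symbols))
# Worst case: identical symbols on all subcarriers add in phase at n = 0.
worst_case_papr = papr_db(idft([1 + 1j] * N))
```

In this sketch the single-carrier block has a PAPR of 0 dB because every QPSK point has the same power, while the OFDM block is bounded above by the in-phase worst case of 10 log10(N) dB.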

1.1.1     OFDM System Model

The OFDM technology is widely used in two types of working environments, i.e., a wired
environment and a wireless environment. When used to transmit signals through wires
like twisted wire pairs and coaxial cables, it is usually called DMT (discrete
multi-tone). For instance, DMT is the core technology for all the xDSL (digital
subscriber line) systems, which provide high-speed data service via existing telephone
networks. However, in a wireless environment such as a radio broadcasting system or a
WLAN (wireless local area network), it is referred to as OFDM. Since we aim at
performance enhancement for wireless communication systems, we use the term OFDM
throughout this dissertation. Furthermore, we only use the term MIMO-OFDM when
explicitly addressing OFDM systems combined with multiple antennas at both ends of a
wireless link.

   The history of OFDM dates all the way back to the mid 1960s, when Chang [2]
published a paper on the synthesis of bandlimited orthogonal signals for multichannel
data transmission. He presented a new principle of transmitting signals simultaneously
over a bandlimited channel without ICI or ISI. Right after the publication of Chang’s
paper, Saltzberg [3] demonstrated the performance of efficient parallel data
transmission systems in 1967, where he concluded that “the strategy of designing an
efficient parallel system should concentrate more on reducing crosstalk between
adjacent channels than on perfecting the individual channels themselves”. His
conclusion has proven far-sighted today in the digital baseband signal processing used
to battle the ICI.

   In the development of OFDM technology, two remarkable contributions transformed the
original “analog” multicarrier system into today’s digitally implemented OFDM. The
first milestone was the use of the DFT (discrete Fourier transform) to perform
baseband modulation and demodulation, introduced by Weinstein and Ebert [4] in 1971.
Their method eliminated the banks of subcarrier oscillators and coherent demodulators
required by frequency-division multiplexing and hence reduced the cost of OFDM
systems. Moreover, DFT-based frequency-division multiplexing can be completely
implemented in digital baseband, not by bandpass filtering, for highly efficient
processing. The FFT, a fast algorithm for computing the DFT, further reduces the
number of arithmetic operations from $N^2$ to $N \log N$ (where $N$ is the FFT size).
Recent advances in VLSI (very large scale integration) technology have made
high-speed, large-size FFT chips commercially available. In their paper [4], Weinstein
and Ebert used a guard interval between consecutive symbols and raised-cosine
windowing in the time domain to combat the ISI and the ICI. But their system could not
keep perfect orthogonality between subcarriers over a time-dispersive channel. This
problem was first tackled by Peled and Ruiz [6] in 1980 with the introduction of the CP
(cyclic prefix) or cyclic extension. They creatively filled the empty guard interval
with a cyclic extension of the OFDM symbol. If the CP is longer than the impulse
response of the channel, the ISI can be eliminated completely. Furthermore, the CP
makes the channel effectively perform a cyclic convolution, which implies
orthogonality between subcarriers over a time-dispersive channel. Though this
introduces an energy loss proportional to the length of the CP when the CP part of the
received signal is removed, the zero ICI is generally worth the loss. This is the
second major contribution to OFDM systems.
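
The effect of the cyclic prefix described above can be checked numerically. The following is a minimal sketch under stated assumptions: the 8-sample block and the 3-tap channel are hypothetical values chosen for illustration, and the CP length is set equal to the channel memory (number of taps minus one). After the CP is removed, the linearly filtered block should coincide with the cyclic convolution of the block and the channel.

```python
def linear_convolve(x, h):
    """Ordinary (linear) convolution, as performed by a dispersive channel."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def cyclic_convolve(x, h):
    """Cyclic convolution of block x with the taps h (len(h) <= len(x))."""
    n = len(x)
    return [sum(h[l] * x[(i - l) % n] for l in range(len(h))) for i in range(n)]


def add_cyclic_prefix(block, cp_len):
    """Copy the last cp_len samples of the block in front of it (cp_len > 0)."""
    return block[-cp_len:] + block


# Hypothetical time-domain OFDM block and channel, for illustration only.
block = [1 + 0j, 2 - 1j, -1 + 3j, 0.5j, -2 + 0j, 1 + 1j, 3 - 2j, -1 - 1j]
channel = [1.0 + 0j, 0.5 - 0.2j, 0.25j]
cp_len = len(channel) - 1  # CP covers the channel memory

received = linear_convolve(add_cyclic_prefix(block, cp_len), channel)
received_no_cp = received[cp_len:cp_len + len(block)]  # receiver drops the CP
expected = cyclic_convolve(block, channel)

# With the CP, the useful part of the received block is a cyclic convolution.
error_with_cp = max(abs(a - b) for a, b in zip(received_no_cp, expected))
# Without the CP, the first samples of the block differ from the cyclic result.
plain = linear_convolve(block, channel)[:len(block)]
error_without_cp = max(abs(a - b) for a, b in zip(plain, expected))
```

Here `error_with_cp` is at the level of rounding error, whereas `error_without_cp` is clearly nonzero, which is the sense in which the CP makes the channel look cyclic to the receiver.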

   As OFDM systems find more widespread application, the requirements on their
performance are becoming higher. Hence more research effort is being poured into the
investigation of OFDM systems. Pulse shaping [7, 8] is beneficial from an interference
point of view, since the spectrum of an OFDM signal can be shaped to be better
localized in frequency. Synchronization [9, 10, 11] in the time domain and in the
frequency domain renders OFDM systems robust against timing errors, phase noise,
sampling frequency errors and carrier frequency offsets. For coherent detection,
channel estimation [46, 49, 48] provides accurate channel state information to enhance
the performance of OFDM systems. Various effective techniques, such as clipping and
peak windowing, are exploited to reduce the relatively high PAPR [12, 13].


   The principle of OFDM is to divide a single high-data-rate stream into a number of
lower-rate streams that are transmitted simultaneously over narrower subchannels.
Hence it is not only a modulation (frequency modulation) technique, but also a
multiplexing (frequency-division multiplexing) technique. Before we mathematically
describe the transmitter-channel-receiver structure of OFDM systems, a couple of
graphical intuitions will make it much easier to understand how OFDM works. OFDM
starts with the “O”, i.e., orthogonal. That orthogonality distinguishes OFDM from
conventional FDM (frequency-division multiplexing) and is the source of all the
advantages of OFDM. The difference between OFDM and conventional FDM is illustrated in
Figure 1.1.

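[Figure: (a) conventional FDM, with channels Ch1 to Ch5 spaced apart along the
frequency axis; (b) OFDM, with channels Ch1 to Ch5 overlapping, showing the saving of
bandwidth. Axes: power versus frequency.]

              Figure 1.1: Comparison between conventional FDM and OFDM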


   As can be seen from Figure 1.1, in order to implement conventional parallel data
transmission by FDM, a guard band must be introduced between the different carriers to
eliminate the interchannel interference. This leads to an inefficient use of the scarce
and expensive spectrum resource, which stimulated the search for an FDM scheme with
overlapping multicarrier modulation in the mid 1960s. To realize the overlapping
multicarrier technique, however, we need to get rid of the ICI, which means that we
need perfect orthogonality between the different modulated carriers. The word
“orthogonality” implies that there is a precise mathematical relationship between the
frequencies of the individual subcarriers in the system. In OFDM systems, if the OFDM
symbol period is $T_{sym}$, then the minimum subcarrier spacing is $1/T_{sym}$. Under
this strict mathematical constraint, integrating the product of the received signal and
any one of the subcarriers $f_{sub}$ over one symbol period $T_{sym}$ extracts that
subcarrier $f_{sub}$ only, because the integral of the product of $f_{sub}$ and any
other subcarrier over $T_{sym}$ is zero. That means there is no ICI in the OFDM system,
while almost 50% of the bandwidth is saved. In the sense of multiplexing, we refer to
Figure 1.2 to illustrate the concept of OFDM. Every $T_{sym}$ seconds, a total of $N$
complex-valued numbers $S_k$ from different QAM/PSK (quadrature amplitude
modulation/phase shift keying) constellation points are used to modulate $N$ different
complex carriers centered at frequencies $f_k$, $1 \le k \le N$. The composite signal
is obtained by summing up all the $N$ modulated carriers.
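
The orthogonality condition above lends itself to a short numerical check. The sketch below is an illustration under assumptions: the symbol period, the number of integration points and the three example symbols are hypothetical, and the integral over one symbol period is approximated by a Riemann sum. Carriers spaced by a multiple of $1/T_{sym}$ correlate to zero, a carrier offset by half that spacing does not, and correlating the composite signal with one carrier recovers the symbol placed on it.

```python
import cmath
import math

T_SYM = 1e-3        # assumed OFDM symbol period in seconds
NUM_POINTS = 1000   # points used to approximate the integral over T_SYM
SPACING = 1 / T_SYM  # minimum subcarrier spacing 1/T_sym


def carrier(freq_hz, t):
    """Complex carrier exp(j*2*pi*f*t)."""
    return cmath.exp(2j * math.pi * freq_hz * t)


def correlate(signal, freq_hz):
    """Approximate (1/T_SYM) * integral of signal(t) * conj(carrier(t)) dt."""
    dt = T_SYM / NUM_POINTS
    total = sum(signal(n * dt) * carrier(freq_hz, n * dt).conjugate()
                for n in range(NUM_POINTS))
    return total / NUM_POINTS


f3, f4 = 3 * SPACING, 4 * SPACING
same_carrier = abs(correlate(lambda t: carrier(f3, t), f3))       # close to 1
adjacent_carrier = abs(correlate(lambda t: carrier(f4, t), f3))   # close to 0
half_spacing = abs(correlate(lambda t: carrier(f3 + SPACING / 2, t), f3))

# Composite signal: hypothetical symbols S_k on carriers f_k = k / T_SYM.
symbols = {1: 1 + 1j, 2: -1 + 1j, 3: 1 - 1j}


def composite(t):
    return sum(s_k * carrier(k * SPACING, t) for k, s_k in symbols.items())


recovered = {k: correlate(composite, k * SPACING) for k in symbols}
```

The value of `half_spacing` is close to $2/\pi$, which shows that the zero correlation is tied to the spacing $1/T_{sym}$ and is lost for other offsets.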

[Figure: the symbols $S_1, \dots, S_N$ multiply the complex carriers
$e^{j2\pi f_1 t}, \dots, e^{j2\pi f_N t}$ to give the components
$s_1(t), \dots, s_N(t)$, which are summed to form the OFDM symbol.]

                Figure 1.2: Graphical interpretation of OFDM concept

   It is worth noting that OFDM achieves frequency-division multiplexing by baseband
processing rather than by bandpass filtering. Indeed, as shown in Figure 1.3, the
individual spectra have a sinc shape. Even though they are not bandlimited, each
subcarrier can still be separated from the others, since orthogonality guarantees that
the interfering sincs have nulls at the frequency where the sinc of interest has a
peak.
[Figure: two panels with a vertical axis from -0.4 to 1 and a horizontal axis from -10
to 10: (a) the sinc-shaped spectrum of a single OFDM subchannel; (b) the overlapping
spectra that make up an OFDM symbol.]

        Figure 1.3: Spectra of (a) an OFDM subchannel (b) an OFDM symbol


    The use of the IDFT (inverse discrete Fourier transform), instead of local
oscillators, was an important breakthrough in the history of OFDM, and it is an
essential part of OFDM systems today. It transforms the data from the frequency domain
to the time domain. Figure 1.4 shows the preliminary concept of the DFT used in an OFDM
system. When the DFT of a time-domain signal is computed, the frequency-domain results
are a function of the sampling period $T$ and the number of sample points $N$. The
fundamental frequency of the DFT is equal to $\frac{1}{NT}$ (1/total sample time). Each
frequency represented in the DFT is an integer multiple of the fundamental frequency.
The maximum frequency that can be represented by a time-domain signal sampled at rate
$\frac{1}{T}$ is $f_{max} = \frac{1}{2T}$, as given by the Nyquist sampling theorem.
This frequency is located in the center of the DFT points. The IDFT performs exactly
the opposite operation to the DFT. It takes a signal defined by frequency components
and converts it to a time-domain signal. The time duration of the IDFT time signal is
equal to $NT$. In essence, the IDFT and the DFT are a reversible pair. It is not
necessary that the IDFT be used at the transmitter side; it is perfectly valid to use
the DFT at the transmitter and the IDFT at the receiver.
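[Figure: top panel, the time-domain signal s(t) sampled with period T over a total
duration NT; bottom panel, its spectrum S(f) at the DFT frequency points 0, 1/NT,
2/NT, ..., (N-1)/NT.]

                             Figure 1.4: Preliminary concept of DFT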
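
The relations between the sampling period, the number of points and the DFT frequency grid can be made concrete with a small sketch. The sampling period and the DFT size below are assumed values for illustration, and the direct $O(N^2)$ transforms stand in for an FFT. The sketch places a tone on the third DFT point, confirms that the DFT peaks there, and verifies that the IDFT undoes the DFT.

```python
import cmath
import math


def dft(x):
    """Direct DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def idft(freq):
    """Direct IDFT: x[n] = (1/N) * sum_k X[k] * exp(j*2*pi*k*n/N)."""
    n_points = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * n / n_points)
                for k in range(n_points)) / n_points
            for n in range(n_points)]


T = 1e-4   # assumed sampling period in seconds
N = 16     # assumed number of sample points

fundamental = 1 / (N * T)   # spacing of the DFT points, 1/(total sample time)
f_max = 1 / (2 * T)         # Nyquist limit for sampling rate 1/T
bin_freqs = [k * fundamental for k in range(N)]
duration = N * T            # time span covered by one IDFT output block

# A complex tone placed exactly on DFT point number 3.
tone_bin = 3
signal = [cmath.exp(2j * math.pi * bin_freqs[tone_bin] * n * T) for n in range(N)]
spectrum = dft(signal)
peak_bin = max(range(N), key=lambda k: abs(spectrum[k]))

# The IDFT undoes the DFT (reversible pair).
roundtrip_error = max(abs(a - b) for a, b in zip(idft(spectrum), signal))
```

With these assumed values the fundamental frequency is 625 Hz and the central DFT point, index $N/2$, sits at $f_{max} = 5$ kHz.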


   After the graphical description of the basic principles of OFDM, such as
orthogonality, frequency modulation and multiplexing, and the use of the DFT in
baseband processing, it is time to look in more detail at the signals flowing between
the blocks of an OFDM system and their mathematical relations. At this point, we make
the following assumptions for the OFDM system we consider.


   • a CP is used;

   • the channel impulse response is shorter than the CP, in terms of their respective
     lengths;

   • there is perfect synchronization between the transmitter and the receiver;

   • channel noise is additive, white and complex Gaussian;

   • the fading is slow enough for the channel to be considered constant during
     the transmission of one OFDM symbol.
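
Under the assumptions listed above, the passage of one OFDM symbol through the channel can be sketched end to end. This is a minimal illustration, not the model developed in this dissertation: the subcarrier count, CP length, channel taps and QPSK symbols are hypothetical, the channel noise is omitted so that the result can be checked exactly, and direct transforms stand in for the FFT. The point of the sketch is that each subcarrier sees a single complex gain, so one division per subcarrier recovers the transmitted symbol.

```python
import cmath
import math


def dft(x):
    """Direct DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def idft(freq):
    """Direct IDFT: x[n] = (1/N) * sum_k X[k] * exp(j*2*pi*k*n/N)."""
    n_points = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * n / n_points)
                for k in range(n_points)) / n_points
            for n in range(n_points)]


def transmit(symbols, cp_len):
    """IDFT of the frequency-domain symbols, then prepend the cyclic prefix."""
    time_block = idft(symbols)
    return time_block[-cp_len:] + time_block


def channel_filter(samples, taps):
    """Time-invariant multipath channel (linear convolution, noise omitted)."""
    return [sum(taps[l] * samples[n - l] for l in range(len(taps)) if n - l >= 0)
            for n in range(len(samples))]


def receive(samples, cp_len, n_sub):
    """Drop the cyclic prefix (perfect timing assumed), then take the DFT."""
    return dft(samples[cp_len:cp_len + n_sub])


N_SUB = 8    # assumed number of subcarriers
CP_LEN = 3   # assumed CP length, longer than the channel memory
taps = [0.9 + 0.1j, 0.4 - 0.3j, 0.2j]   # hypothetical channel impulse response
symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]

received = receive(channel_filter(transmit(symbols, CP_LEN), taps), CP_LEN, N_SUB)

# Per-subcarrier channel gains: DFT of the zero-padded impulse response.
channel_gains = dft(taps + [0j] * (N_SUB - len(taps)))

# One complex division per subcarrier undoes the channel.
equalized = [y / h for y, h in zip(received, channel_gains)]
```

In this noise-free sketch the received value on subcarrier k equals the channel gain on that subcarrier times the transmitted symbol, which is the flat-subchannel behaviour described in the overview.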


For a tractable analysis of OFDM systems, we follow the common practice of using a
simplified mathematical model. Though the first OFDM system was implemented with
analog technology, here we choose to investigate a discrete-time model of OFDM step by
step, since digital baseband synthesis is widely exploited in today’s OFDM systems.
Figure 1.5 shows a block diagram of a baseband OFDM modem, which is based on the PHY
(physical layer) of IEEE standard 802.11a [37].

   Before describing the mathematical model, we define the symbols and notation used
in this dissertation. Capital and lower-case letters denote signals in the frequency
domain and in the time domain, respectively. An arrow bar indicates a vector, and a
boldface letter without an arrow bar represents a matrix. The notation is collected in
the following table.
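[Block diagram of the baseband OFDM transceiver. Transmitter chain: binary input data,
channel coding, interleaving, QAM mapping, pilot insertion, S/P, IFFT (input S(m),
output s(m)), P/S, add CP (output u(m)), DAC, RF TX, channel. Receiver chain: RF RX,
ADC, remove CP (input r(m)), S/P, FFT (input y(m), output Y(m)), P/S, detection,
demapping, deinterleaving, decoding, binary output data. Channel estimation and timing
and synchronization blocks support the receiver.]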




Figure 1.5: Block diagram of a baseband OFDM transceiver
                                                                                                                                                                                                  13


            Symbol          Meaning
            A_{p×q}         p × q matrix
            \vec{a}         column vector
            I_p             p × p identity matrix
            0               zero matrix
            diag(\vec{a})   diagonal matrix with \vec{a}'s elements on the diagonal
            A^T             transpose of A
            A^*             complex conjugate of A
            A^H             Hermitian of A
            tr(A)           trace of A
            rank(A)         rank of A
            det(A)          determinant of A
            A ⊗ B           Kronecker product of A and B

   As shown in Figure 1.5, the serial binary input data are first processed by a data
scrambler, and channel coding is then applied to improve the BER (bit error rate)
performance of the system. The encoded data stream is further interleaved to reduce
the burst symbol error rate. Depending on channel conditions such as fading, different
base modulation modes such as BPSK (binary phase shift keying), QPSK (quadrature
phase shift keying) and QAM are used adaptively to boost the data rate. The modulation
mode can be changed even during the transmission of data frames. The resulting complex
numbers are grouped into column vectors with the same number of elements as the FFT
size, N. For simplicity of presentation and ease of understanding, we use matrices and
vectors to describe the mathematical model. Let S(m) represent the m-th OFDM symbol in


the frequency domain, i.e.,
                                                        
\[
  S(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix}_{N \times 1},
\]

where m is the index of OFDM symbols. We assume that the complex-valued elements
{S(mN), S(mN + 1), . . . , S(mN + N − 1)} of S(m) are zero-mean, uncorrelated
random variables whose sample space is the signal constellation of the base modulation
(BPSK, QPSK or QAM). To achieve the same average power for all mappings, each
element of S(m) is multiplied by a normalization factor K_MOD [37], so that the average
power of every mapping is normalized to unity. To obtain the time-domain samples, the
IFFT (inverse fast Fourier transform) operation, shown as the IFFT block in Figure 1.5,
is represented by a matrix multiplication. Let F_N be the N-point DFT matrix whose
(p, q)-th element is $e^{-j\frac{2\pi}{N}(p-1)(q-1)}$. The resulting time-domain samples
s(m) can be described by
                                                            
\[
  s(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}_{N \times 1}
       = \frac{1}{N} F_N^H S(m).
  \tag{1.1}
\]
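To make (1.1) concrete, the following minimal Python sketch builds F_N explicitly and applies (1/N) F_N^H to a short vector of symbols. The helper names (dft_matrix, hermitian, matvec, ofdm_modulate) and the four QPSK values are assumptions made for illustration only; a practical modem would use an FFT routine rather than an explicit matrix.

```python
import cmath

def dft_matrix(N):
    """N-point DFT matrix F_N with entries exp(-j*2*pi*p*q/N), 0-based p, q."""
    return [[cmath.exp(-2j * cmath.pi * p * q / N) for q in range(N)]
            for p in range(N)]

def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))]
            for c in range(len(A[0]))]

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def ofdm_modulate(S):
    """Time-domain samples s = (1/N) F_N^H S, as in (1.1)."""
    N = len(S)
    FH = hermitian(dft_matrix(N))
    return [v / N for v in matvec(FH, S)]

# Hypothetical QPSK symbols scaled by 1/sqrt(2) for unit average power.
K = 2 ** -0.5
S = [K * (1 + 1j), K * (1 - 1j), K * (-1 + 1j), K * (-1 - 1j)]
s = ofdm_modulate(S)

# Applying F_N to s should return the original frequency-domain symbols.
S_back = matvec(dft_matrix(len(S)), s)
max_err = max(abs(a - b) for a, b in zip(S, S_back))
```

Because F_N F_N^H = N I_N, the round trip recovers S(m) up to floating-point error, which is what max_err measures.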


Compared with the costly and complicated modulation and multiplexing of conventional
FDM systems, OFDM systems implement both easily by using the FFT in baseband
processing. To combat the multipath delay spread in wireless channels, the time-domain
samples s(m) are cyclically extended by copying the last Ng samples and pasting them
to the front, as shown in Figure 1.6(a) [6].




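[Figure 1.6 placeholder. Panel (a): the last Ng of the N samples are copied to the front
as the CP. Panel (b): guard time (CP) followed by the FFT integration time.]

        Figure 1.6: (a) Concept of CP; (b) OFDM symbol with cyclic extension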

   Let u(m) denote the cyclically extended OFDM symbol,
\[
  u(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix}
       = \begin{bmatrix} \mathrm{CP} \\ s(m) \end{bmatrix}_{N_{tot} \times 1},
\]


where Ntot = N + Ng is the length of u(m). In matrix form, the CP insertion can be
expressed as the product of an Ntot × N matrix ACP and s(m). By direct computation,
it holds that


                                      u(m) = ACP s(m),                                                   (1.2)


where
\[
  A_{CP} = \begin{bmatrix} 0 & I_{N_g} \\ I_{N-N_g} & 0 \\ 0 & I_{N_g} \end{bmatrix}_{(N+N_g) \times N}.
\]
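As a quick check of (1.2), the sketch below constructs A_CP and confirms that the matrix product reproduces the "copy the last Ng samples to the front" description. The function names and the sizes N = 8, Ng = 3 are hypothetical and chosen only to keep the example small.

```python
def cp_matrix(N, Ng):
    """Ntot x N matrix A_CP = [[0, I_Ng], [I_N]] with Ntot = N + Ng."""
    A = [[0] * N for _ in range(N + Ng)]
    for i in range(Ng):          # top block copies the last Ng samples
        A[i][N - Ng + i] = 1
    for i in range(N):           # bottom block passes s(m) through unchanged
        A[Ng + i][i] = 1
    return A

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

N, Ng = 8, 3                      # hypothetical sizes for illustration
s = list(range(10, 10 + N))       # stand-in for the time-domain samples s(m)
u = matvec(cp_matrix(N, Ng), s)

# The matrix product matches the "copy the tail to the front" description.
same = (u == s[-Ng:] + s)
```

Here the bottom N rows of A_CP form I_N, which is the block [I_{N-Ng} 0; 0 I_Ng] written out in the definition above.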

One of the challenges from the harsh wireless channels is the multipath delay spread.

If the delay spread is relatively large compared to the symbol duration, then a delayed

copy of a previous symbol will overlap the current one which implies severe ISI. To


eliminate the ISI almost completely, a CP is introduced for each OFDM symbol, and
the CP length Ng must be chosen at least as long as the experienced delay spread L, i.e.,
Ng ≥ L. In addition, the CP maintains the orthogonality among subcarriers, which
implies zero ICI. This is because the OFDM symbol is cyclically extended, which ensures
that the delayed replicas of the OFDM symbol always have an integer number of cycles
within the FFT interval, as long as the delay is smaller than the CP. This is illustrated
in Figure 1.6(b). No matter where the FFT window starts, provided that it is within
the CP, there will always be one or two complete cycles within the FFT integration time
for the upper and lower waveforms, respectively. In the IEEE 802.11a standard [37],
Ng is at least 16. The resulting OFDM symbol u(m), including the CP, must be
converted to the analog domain by a DAC (digital-to-analog converter), as shown in
Figure 1.5, and then up-converted for RF transmission, since it is currently not practical
to generate the OFDM symbol directly at RF rates. To remain in the discrete-time
domain, the OFDM symbol could be up-sampled and modulated onto a discrete carrier
frequency. This carrier could be an IF (intermediate frequency) whose sample rate can
be handled by current technology. The signal could then be converted to analog and
raised to the final transmit frequency using analog frequency conversion methods.
Alternatively, the OFDM signal could be converted to analog immediately and raised
directly to the desired RF transmit frequency. Each approach has its advantages and
disadvantages. Cost, power consumption and complexity must be taken into
consideration when selecting the technique.


    The RF signal is transmitted over the air. In this thesis, the wireless channel is
assumed to be a quasi-static frequency-selective Rayleigh fading channel [71], meaning
that the channel remains constant during the transmission of one OFDM symbol.
Suppose that the multipath channel can be modeled by a discrete-time baseband
equivalent (L−1)th-order FIR (finite impulse response) filter with filter taps
{h_0, h_1, . . . , h_l, . . . , h_{L−1}}. It is further assumed that the channel impulse
response, i.e., the set of equivalent FIR filter taps, consists of independent zero-mean
complex Gaussian random variables with variance $\frac{1}{2}P_l$ per dimension. The
ensemble {P_0, . . . , P_l, . . . , P_{L−1}} is the PDP (power delay profile) of the channel,
and the total power of the PDP is usually normalized to 1, corresponding to unit average
channel attenuation. Denote the CIR (channel impulse response) vector h_m as
                                                         
\[
  h_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix}_{L \times 1},
\]

where the subscript m is kept to imply that the channel may vary from one OFDM

symbol to the next one. Then the complex baseband equivalent received signal can

be represented by a discrete-time convolution as
\[
  r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\, u(mN_{tot}+n-l) + v(mN_{tot}+n),
  \tag{1.3}
\]

where mNtot + n denotes the n-th received sample during the m-th OFDM symbol
and 0 ≤ n ≤ Ntot − 1. The term v(mNtot + n) represents the complex AWGN at
the (mNtot + n)-th time sample, with zero mean and variance $\frac{1}{2}\sigma_v^2$ per
dimension. Hence, the expected SNR (signal-to-noise ratio) per received signal is
$\rho = 1/\sigma_v^2$. To allow parallel processing by the DFT block in Figure 1.5, we
rewrite equation (1.3) in matrix form. First we define


                                                                                                 
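\[
  r(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix};
  \quad
  v(m) = \begin{bmatrix} v(mN_{tot}) \\ \vdots \\ v(mN_{tot}+N_{tot}-1) \end{bmatrix},
  \tag{1.4}
\]
   and
\[
  h_{m,Toep} = \begin{bmatrix}
    h_{0,m}   &        &           &        &         \\
    \vdots    & \ddots &           &        &         \\
    h_{L-1,m} & \cdots & h_{0,m}   &        &         \\
              & \ddots & \vdots    & \ddots &         \\
              &        & h_{L-1,m} & \cdots & h_{0,m}
  \end{bmatrix};
  \quad
  h^{(c)}_{m,Toep} = \begin{bmatrix}
     &  & h_{L-1,m} & \cdots & h_{1,m}   \\
     &  &           & \ddots & \vdots    \\
     &  &           &        & h_{L-1,m} \\
     &  &           &        &           \\
     &  &           &        &
  \end{bmatrix}.
  \tag{1.5}
\]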

Then it is straightforward to obtain the following input-output relationship with regard
to the channel:
\[
  r(m) = h_{m,Toep}\, u(m) + h^{(c)}_{m,Toep}\, u(m-1) + v(m).
  \tag{1.6}
\]
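The equivalence between the convolution (1.3) and the matrix form (1.6) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the noise term v(m) is omitted, the taps and symbol values are arbitrary real numbers, and the helper names toeplitz_pair and matvec are hypothetical.

```python
def toeplitz_pair(h, Ntot):
    """Return (H, Hc): the lower-triangular Toeplitz matrix acting on u(m)
    and the upper-triangular matrix acting on u(m-1), both Ntot x Ntot."""
    L = len(h)
    H = [[0] * Ntot for _ in range(Ntot)]
    Hc = [[0] * Ntot for _ in range(Ntot)]
    for n in range(Ntot):
        for l in range(L):
            if n - l >= 0:
                H[n][n - l] = h[l]            # sample from the current symbol
            else:
                Hc[n][Ntot + n - l] = h[l]    # sample leaking from u(m-1)
    return H, Hc

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Hypothetical real-valued taps and symbols; the noise v(m) is omitted.
h = [1.0, 0.5, 0.25]
Ntot = 6
u_prev = [1, 2, 3, 4, 5, 6]
u_curr = [7, 8, 9, 10, 11, 12]
H, Hc = toeplitz_pair(h, Ntot)
r = [a + b for a, b in zip(matvec(H, u_curr), matvec(Hc, u_prev))]

# Reference: convolve the concatenated stream directly, as in (1.3).
stream = u_prev + u_curr
r_ref = [sum(h[l] * stream[Ntot + n - l] for l in range(len(h)))
         for n in range(Ntot)]

# Rows of Hc with non-zero entries are the samples affected by ISI.
isi_rows = [n for n, row in enumerate(Hc) if any(row)]
```

With L = 3 taps, only the first L − 1 = 2 rows of the second matrix are non-zero, so only the first two received samples depend on the previous symbol.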


It is easy to see from (1.6) that the first L − 1 elements of r(m), i.e.,
{r(mNtot), . . . , r(mNtot + L − 2)}, are affected by the ISI term
$h^{(c)}_{m,Toep} u(m-1)$, since the upper triangular Toeplitz matrix $h^{(c)}_{m,Toep}$
has non-zero entries in its first L − 1 rows. To remove the ISI term, we transform the
Ntot × 1 vector r(m) into an N × 1 vector y(m) by simply cutting off the first Ng
elements, which are the ones possibly affected by ISI. For complete elimination of ISI,
Ng ≥ L must be satisfied. This is the reverse of the cyclic extension performed at the
transmitter. Consistently, this transformation


can also be expressed as a matrix-vector product
\[
  y(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = A_{DeCP}\, r(m),
  \tag{1.7}
\]
where
\[
  A_{DeCP} = \begin{bmatrix} 0 & I_N \end{bmatrix}_{N \times N_{tot}}.
\]
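The effect of the guard length on the ISI term can be illustrated with a small sketch. It forms the product of A_DeCP with the upper triangular matrix of (1.5) for two guard lengths; the channel taps, the sizes and all function names are assumptions made for illustration.

```python
def decp_matrix(N, Ng):
    """N x Ntot matrix A_DeCP = [0, I_N] that drops the first Ng samples."""
    return [[1 if c == Ng + r else 0 for c in range(N + Ng)] for r in range(N)]

def isi_matrix(h, Ntot):
    """Upper-triangular Toeplitz matrix multiplying u(m-1) in (1.6)."""
    Hc = [[0] * Ntot for _ in range(Ntot)]
    for n in range(Ntot):
        for l in range(n + 1, len(h)):
            Hc[n][Ntot + n - l] = h[l]
    return Hc

def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_zero(M):
    """True if every entry of M is zero."""
    return all(x == 0 for row in M for x in row)

h = [1.0, 0.5, 0.25, 0.125]       # hypothetical channel, L = 4 taps
N = 8

# A guard interval of L samples removes every ISI-affected row.
long_cp = is_zero(matmul(decp_matrix(N, 4), isi_matrix(h, N + 4)))

# A guard interval that is too short leaves residual ISI in y(m).
short_cp = is_zero(matmul(decp_matrix(N, 2), isi_matrix(h, N + 2)))
```

In this example the product vanishes for Ng = 4 but not for Ng = 2, because the third row of the ISI matrix survives the CP removal in the second case.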

As shown in Figure 1.5, the ISI-free received signal y(m) is demodulated by FFT

and hence it is converted back to the frequency domain received signal Y (m). It is

described by
\[
  Y(m) = \begin{bmatrix} Y(mN) \\ \vdots \\ Y(mN+N-1) \end{bmatrix} = F_N\, y(m).
  \tag{1.8}
\]
After obtaining the received signal Y(m), symbol detection can be performed if the
channel state information is known or can be estimated by a channel estimation
algorithm. The detected symbols then pass through a series of reverse operations,
corresponding to the encoding, interleaving and mapping at the transmitter, to retrieve
the binary input information. Following the signal flow from the transmitted signal
S(m) to the received signal Y(m), a simple relationship between them can be
expressed as

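\[
  Y(m) = H_{m,diag}\, S(m) + V(m),
  \tag{1.9}
\]
where the diagonal matrix $H_{m,diag}$ is
\[
  H_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix};
  \quad
  H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\, e^{-j\frac{2\pi}{N}kl}, \quad 0 \le k \le N-1,
\]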


and V(m) is the complex AWGN in the frequency domain. This simple
transmitter-and-receiver structure is well known in the literature [42, 46, 48, 49], and
it is an important reason for the wide application of OFDM systems. The transmitted
signal can be recovered simply by dividing the received signal on each subcarrier by the
channel frequency response of that subcarrier. Hence, there is no need for a complicated
equalizer at the receiver. In this thesis, we do not jump directly to this known
conclusion, for two reasons. First, following the baseband block diagram in Figure 1.5,
we use a matrix presentation to describe the input-output relationship of each block.
This gives a clear and thorough understanding of all the signal processing within the
OFDM system. It is a different view from that in the literature, which can be
summarized by the fact that the discrete Fourier transform of a cyclic convolution in
the time domain (of IDFT(S(m)) and h_m) equals the product of the frequency
responses (S(m) and DFT(h_m)) of the two convolved terms. Second, it provides a
basis for our channel estimator design in the following chapter. Next, the simple
relation in (1.9) is shown by going through the signal flow backwards from


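Y(m) to S(m):
\begin{align*}
  Y(m) &= F_N\, y(m) \\
       &= F_N \left( A_{DeCP}\, r(m) \right) \\
       &= F_N \{ A_{DeCP} [ h_{m,Toep}\, u(m) + h^{(c)}_{m,Toep}\, u(m-1) + v(m) ] \} \\
       &= F_N [ A_{DeCP} h_{m,Toep}\, u(m) + A_{DeCP}\, v(m) ] \\
       &= F_N [ A_{DeCP} h_{m,Toep} A_{CP}\, s(m) + A_{DeCP}\, v(m) ] \tag{1.10} \\
       &= F_N [ A_{DeCP} h_{m,Toep} A_{CP} \tfrac{1}{N} F_N^H S(m) + A_{DeCP}\, v(m) ] \\
       &= \tfrac{1}{N} F_N [ A_{DeCP} h_{m,Toep} A_{CP} ] F_N^H S(m) + F_N ( A_{DeCP}\, v(m) ) \\
       &= \tfrac{1}{N} [ F_N h_{m,Cir} F_N^H ] S(m) + V(m),
\end{align*}
where the fourth line uses $A_{DeCP} h^{(c)}_{m,Toep} = 0$, which holds because the
non-zero rows of $h^{(c)}_{m,Toep}$ are among the first Ng rows discarded by
$A_{DeCP}$. Here V(m) = F_N (A_DeCP v(m)) and h_{m,Cir} = A_DeCP h_{m,Toep} A_CP
is an N × N circulant matrix with some special properties. It is parameterized as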
                                                                                               
\[
  h_{m,Cir} = \begin{bmatrix}
    h_{0,m}   & 0         & \cdots & \cdots  & 0         & h_{L-1,m} & h_{L-2,m} & \cdots & h_{1,m}   \\
    h_{1,m}   & h_{0,m}   & 0      & \cdots  & 0         & 0         & h_{L-1,m} & \cdots & h_{2,m}   \\
    \vdots    & \vdots    & \ddots & \ddots  & \vdots    & \vdots    & \ddots    & \ddots & \vdots    \\
    h_{L-2,m} & \cdots    & \cdots & h_{0,m} & 0         & \cdots    & \cdots    & 0      & h_{L-1,m} \\
    h_{L-1,m} & \cdots    & \cdots & \cdots  & h_{0,m}   & 0         & \cdots    & \cdots & 0         \\
    0         & h_{L-1,m} & \cdots & \cdots  & \cdots    & h_{0,m}   & 0         & \cdots & 0         \\
    \vdots    & \ddots    & \ddots & \ddots  & \ddots    & \ddots    & \ddots    & \ddots & \vdots    \\
    0         & \cdots    & \cdots & 0       & h_{L-1,m} & h_{L-2,m} & \cdots    & \cdots & h_{0,m}
  \end{bmatrix}_{N \times N}.
  \tag{1.11}
\]

As stated in [38], an N × N circulant matrix has some important properties:


   • All N × N circulant matrices have the same eigenvectors, which are the
     columns of $F_N^H$, where $F_N$ is the N-point DFT matrix;

   • The corresponding eigenvalues {λ1 , · · · , λN } are the DFT of the first column of
     the circulant matrix.


The first column of the circulant matrix $h_{m,Cir}$ is
$[h_{0,m}, \ldots, h_{L-1,m}, 0, \ldots, 0]^T$. Hence, the eigenvalues of $h_{m,Cir}$ are
\[
  \begin{bmatrix} H_{0,m} \\ H_{1,m} \\ \vdots \\ H_{N-1,m} \end{bmatrix}
  = F_N \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ 0_{(N-L) \times 1} \end{bmatrix}.
\]

Taking the eigenvalue decomposition of $h_{m,Cir}$, we have
\[
  h_{m,Cir} = \frac{1}{N} F_N^H
  \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix} F_N.
  \tag{1.12}
\]
Substituting (1.12) into (1.10) and using $F_N F_N^H = N I_N$ shows that (1.9) holds.
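The chain of steps from S(m) to Y(m) can also be verified numerically. The following sketch simulates a noiseless link (IFFT, CP insertion, FIR channel with ISI from the previous symbol, CP removal, FFT) and compares the result with the per-subcarrier product in (1.9). The channel taps, the symbol values and the function names are hypothetical, and direct O(N^2) DFT sums stand in for an FFT routine.

```python
import cmath

def dft(x):
    """Unnormalised N-point DFT, i.e. multiplication by F_N."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """(1/N) F_N^H X, the IFFT used at the transmitter in (1.1)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def ofdm_link(S, S_prev, h, Ng):
    """Noiseless path S(m) -> Y(m): IFFT, add CP, FIR channel, drop CP, FFT."""
    N = len(S)
    s, s_prev = idft(S), idft(S_prev)
    u = s[-Ng:] + s                       # cyclic prefix, as in (1.2)
    u_prev = s_prev[-Ng:] + s_prev
    stream = u_prev + u                   # previous symbol supplies the ISI
    Ntot = N + Ng
    r = [sum(h[l] * stream[Ntot + n - l] for l in range(len(h)))
         for n in range(Ntot)]            # convolution (1.3) without noise
    return dft(r[Ng:])                    # remove CP, then FFT

# Hypothetical values chosen only for illustration.
h = [0.8, 0.5 - 0.2j, 0.1j]               # L = 3 channel taps
N, Ng = 8, 3
S = [1, -1, 1j, -1j, 1, 1, -1, -1j]
S_prev = [-1, 1, 1, -1j, 1j, -1, 1, 1]

Y = ofdm_link(S, S_prev, h, Ng)
H = dft(h + [0] * (N - len(h)))           # eigenvalues of the circulant matrix
max_err = max(abs(Y[k] - H[k] * S[k]) for k in range(N))
S_hat = [Y[k] / H[k] for k in range(N)]   # one-tap equaliser per subcarrier
```

The previous symbol has no influence on Y(m) because Ng ≥ L here, and the one-tap division recovers S(m) since none of the H[k] is zero for these taps.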

The simple model in (1.9) is widely used in theoretical research. It is, however,
based on all of the assumptions made at the beginning of this section. In practical
OFDM systems, considerable research effort has been devoted to keeping the system as
close to this model as possible. Perfect synchronization in the time domain and the
frequency domain is the most challenging subject. The orthogonality can easily be
destroyed by factors such as the Doppler shift resulting from relative movement between
the transmitter and the receiver, the frequency mismatch between the oscillators at the
two ends, large timing errors and phase noise. Meanwhile, accurate channel state
information is critical for reducing the BER and improving the system performance.
Hence, joint channel estimation and synchronization with low complexity is an active
research area for current OFDM systems. As long as the orthogonality is maintained,
OFDM is a simple and efficient multicarrier data transmission technique.


1.2      Dissertation Contributions

In the first part, this dissertation addresses one of the most fundamental problems in
MIMO-OFDM communication system design, namely fast and reliable channel
estimation. Using pilot symbols, a MIMO-OFDM channel estimator is proposed that is
capable of estimating time-dispersive and frequency-selective fading channels. The
contributions of this dissertation are as follows.


   • Great Simplicity:

      For an Nt × Nr MIMO system (Nt: number of transmit antennas, Nr: number of
      receive antennas), the complexity of any signal processing algorithm at the
      physical layer usually increases by a factor of Nt Nr. Hence, simplicity plays an
      important role in the system design. We propose a pilot tone design for
      MIMO-OFDM channel estimation in which Nt disjoint sets of pilot tones are
      placed on one OFDM block at each transmit antenna. Each pilot tone set has L
      pilot tones (L: channel length), which are equally spaced and equally powered.
      The pilot tones from different transmit antennas form a unitary matrix, so a
      simple least squares (LS) estimate of the MIMO channel is easily obtained by
      taking advantage of the unitarity of the pilot tone matrix. There is no need to
      compute the inverse of a large matrix, which is usually required by the LS
      algorithm. In contrast to some other simplified channel estimation methods,
      which assume that there are only a few dominant paths among the L paths and
      neglect the remaining weaker paths in the channel, our method estimates the
      full channel information with reduced complexity.


• Estimation of Fast Time-varying Channel:

  In a highly mobile environment, such as a mobile user in a vehicle traveling at more
  than 100 km/h, the wireless channel may change within one or a small number of
  symbols, while an information packet may contain hundreds of data symbols or more.
  In the literature [50] there are preamble designs in which the wireless channel is
  estimated only during the preamble of a data packet and is assumed to be constant
  during the transmission of the remaining data part. In contrast to the preamble
  design, our scheme distributes the pilot symbols of the preamble over every OFDM
  block for channel estimation. Since pilot tones are placed on each OFDM block, the
  channel state information can be estimated accurately and quickly, no matter how
  fast the channel condition is varying.


• Link to SFC (Space-frequency code):

  Usually, channel estimation and space-frequency code design of MIMO-OFDM

  systems are treated as two independent subjects, especially for those algorithms

  generalized from their counterparts in the SISO (single-input single-output)

  case. Some researchers [48, 50] propose orthogonal structures for pilot

  tone design and try to reduce the computational complexity. However, each

     individual structure is isolated, and it is not easy to generalize these structures to

     a MIMO system with an arbitrary number of transmit and receive antennas.

     In this dissertation, the orthogonal pilot tone matrix we propose is indeed a

     space-frequency code. The row direction of the matrix stands for different pilot

     tone sets in the frequency domain, and the column direction represents the

     individual transmit antennas in the spatial domain. It can be readily extended

     to an Nt × Nr MIMO system by constructing an Nt × Nt orthogonal matrix.

     With this explicit relation to space-frequency codes, the design of the pilot-tone

     matrix for MIMO-OFDM channel estimation can be conducted from a broader

     perspective, and each of the two subjects can shed light on the other.
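
The computational point made in the first item above (a unitary pilot matrix removes the matrix inversion from the LS estimate) can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: it uses a normalized DFT matrix as a stand-in unitary matrix `P`, a hypothetical linear model `y = P h + v`, and arbitrary sizes. It is not the pilot-tone matrix constructed later in this dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # hypothetical number of unknown channel coefficients

# Stand-in unitary "pilot" matrix: the normalized DFT matrix (assumption for illustration).
P = np.fft.fft(np.eye(n)) / np.sqrt(n)

h = rng.standard_normal(n) + 1j * rng.standard_normal(n)            # unknown channel
v = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))   # small noise
y = P @ h + v                                                       # received pilots

# General LS estimate: solves the normal equations (P^H P) h = P^H y.
h_ls_general = np.linalg.solve(P.conj().T @ P, P.conj().T @ y)

# Unitary shortcut: P^H P = I, so the LS estimate is simply P^H y.
h_ls_unitary = P.conj().T @ y

max_diff = np.max(np.abs(h_ls_general - h_ls_unitary))  # the two estimates agree
est_error = np.max(np.abs(h_ls_unitary - h))            # error is on the order of the noise
```

For a unitary matrix the two estimates coincide up to rounding, so the cost of the estimator is a single matrix-vector product.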


In the second part of this dissertation, we formulate location estimation

as a constrained LS-type optimization problem. As surveyed in

[53], there are different methods for location estimation based on measurements of

TOA, TDOA, AOA and amplitude. Two problems have not received full attention

and may increase the complexity of the algorithm. One problem is that solving

the LS estimation problem at first yields only an intermediate solution,

which is still a function of the unknown target location.

An extra constraint is needed to obtain the final target estimate. Even with such a

constraint, the resulting quadratic equation may have no

real positive root. Another problem is that it is unclear how the measurement noise

variance affects the estimation accuracy. Intuitively, a small variance is always preferred.

In our proposed algorithm, the constrained LS-type optimization problem is

solved by using a Lagrange multiplier. We also point out that the noise variance is

closely related to the equivalent SNR. For example, in the case of TDOA, the equiva-

lent SNR is the ratio of the time for a signal traveling from the target to the k-th base

station over the noise variance. A smaller noise variance then indicates a higher SNR

which leads to more accurate location estimation. The formulation of a constrained

LS-type optimization has its advantages. First, its performance is close

to that of the ML algorithm, provided that the assumption about the measurement noise

variance is satisfied. Second, it inherits the simplicity of the LS algorithm.


1.3      Organization of the Dissertation

This dissertation is organized as follows. In Chapter 1, the principle of OFDM is

illustrated through instructive figures and the signal model of OFDM systems is

described in detail by matrix representation. Also, a review of research on channel

estimation for OFDM systems is covered in Chapter 1. Chapter 2 focuses mainly

on the pilot-tone-based channel estimation of MIMO-OFDM systems. It

ends with extensive computer simulations of different estimation algorithms and of the

effects of some key OFDM parameters on estimator performance. Chapter 3 is devoted

to wireless location on WiMax networks. A constrained LS-type optimization problem

is formulated under a mild assumption and solved by the Lagrange multiplier

method. Finally, this dissertation is summarized in Chapter 5, which also suggests some

open research subjects.
Chapter 2

MIMO-OFDM Channel Estimation

2.1     Introduction

With the ever increasing number of wireless subscribers and their seemingly “greedy”

demands for high-data-rate services, radio spectrum has become an extremely scarce and

invaluable resource for all the countries in the world. Efficient use of radio spectrum

requires that modulated carriers be placed as close as possible without causing any

ICI and be capable of carrying as many bits as possible. Optimally, the bandwidth of

each carrier would be adjacent to its neighbors, so there would be no wasted bands.

In practice, a guard band must be placed between neighboring carriers to provide

a guard space where a shaping filter can attenuate a neighboring carrier’s signal.

These guard bands waste spectrum. In order to transmit high-rate data, short

symbol periods must be used. The symbol period Tsym is the inverse of the baseband

data rate R (R = 1/Tsym ), so as R increases, Tsym must decrease. In a multipath

environment, however, a shorter symbol period leads to an increased degree of ISI,

and thus performance loss. OFDM addresses both problems with its

unique modulation and multiplexing technique. OFDM divides the high-rate stream

into parallel lower rate data and hence prolongs the symbol duration, thus helping

to eliminate ISI. It also allows the bandwidth of subcarriers to overlap without ICI

as long as the modulated carriers are orthogonal. OFDM is therefore considered

a good candidate modulation technique for broadband access in very dispersive

environments [42, 43].

   However, relying solely on OFDM technology to improve the spectral efficiency

gives us only a partial solution. At the end of the 1990s, seminal work by Foschini and

Gans [21] and, independently, by Telatar [22] showed that there is another way

to achieve high data rates over wireless channels: the use of multiple antennas

at both ends of the wireless link, often referred to as MA (multiple antenna) or

MIMO in the literature [21, 22, 17, 16, 25, 26]. The MIMO technique does not require

any bandwidth expansions or any extra transmission power. Therefore, it provides a

promising means to increase the spectral efficiency of a system. In his paper about

the capacity of multi-antenna Gaussian channels [22], Telatar showed that given a

wireless system employing Nt TX (transmit) antennas and Nr RX (receive) anten-

nas, the maximum data rate at which error-free transmission over a fading channel

is theoretically possible is proportional to the minimum of Nt and Nr (provided that

the Nt Nr transmission paths between the TX and RX antennas are statistically in-

dependent). Hence huge throughput gains may be achieved by adopting Nt × Nr

MIMO systems compared to conventional 1 × 1 systems that use a single antenna at

both ends of the link with the same requirement of power and bandwidth. With

multiple antennas, a new domain, namely the spatial domain, is exploited, as opposed

to the existing systems in which only the time and frequency domains are utilized.

   Now let us return to the previous question: what can be done in order to enhance

the data rate of a wireless communication system? The combination of MIMO

systems with OFDM technology provides a promising candidate for next generation

fixed and mobile wireless systems [42]. In practice for coherent detection, however,

accurate channel state information in terms of channel impulse response (CIR) or

channel frequency response (CFR) is critical to guarantee the diversity gains and the

projected increase in data rate.

   The channel state information can be obtained through two types of methods.

One is called blind channel estimation [44, 45, 46], which exploits the statistical

information of the channel and certain properties of the transmitted signals. The other

is called training-based channel estimation, which is based on the training data sent

at the transmitter and known a priori at the receiver. Though the former has its

advantage in that it has no overhead loss, it is only applicable to slowly time-varying

channels due to its need for a long data record. Our work in this thesis focuses on

the training-based channel estimation method, since we aim at mobile wireless ap-

plications where the channels are fast time-varying. The conventional training-based

method [47, 48, 50] estimates the channel by first sending a sequence of

OFDM symbols, the so-called preamble, which is composed of known training symbols.

The channel state information is then estimated from the received signals

corresponding to the known training OFDM symbols prior to any data transmission in

a packet. The channel is hence assumed to be constant until the next sequence of

training OFDM symbols. Drastic performance degradation arises if this method is applied to

fast time-varying channels. In [49], optimal pilot-tone selection and placement were

presented to aid channel estimation of single-input/single-output (SISO) systems. The

idea behind our work is to use a set of pilot tones within each OFDM block, rather than

a sequence of training blocks ahead of a data packet, to estimate the time-varying

channel. However, direct generalization of the channel estimation algorithm in [49] to

MIMO-OFDM systems involves the inversion of a high-dimension matrix [47] due to

the increased number of transmit and receive antennas, and thus entails high complex-

ity and makes it infeasible for wireless communications over highly mobile channels.

This becomes a bottleneck for applications to broadband wireless communications.

The goal of this chapter is to design a low-complexity channel estimator with

comparable accuracy.

   The bottleneck problem of complexity for channel estimation in MIMO-OFDM

systems has been studied by two different approaches. The first one shortens the

sequence of training symbols to the length of the MIMO channel, as described in [50],

leading to an orthogonal structure for preamble design. Its drawback lies in the increase

of the overhead due to the extra training OFDM blocks. The second one is the simplified

channel estimation algorithm, as proposed in [48], that achieves optimum channel

estimation and also avoids the matrix inversion. However, its construction of the

pilot-tones is not explicit in terms of space-time codes (STC). We are motivated by both

approaches in searching for a new pilot-tone design. Our contribution in this chapter

is the unification of the known results of [48, 50] in that the simplified channel esti-

mation algorithm is generalized to explicit orthogonal space-frequency codes (SFC)

that inherit the same computational advantage as in [48, 50], while eliminating their

respective drawbacks. In addition, the drastic performance degradation that occurs in

[48, 50] is avoided by our pilot-tone design, since the channel is estimated at each block.

In fact we have formulated the channel estimation problem in frequency domain, and

the CFR is parameterized by the pilot-tones in a convenient form for design of SFC.

As a result a unitary matrix, composed of pilot-tones from each transmit antenna,

can be readily constructed. It is interesting to observe that the LS algorithm based

on SFC in this chapter is parallel to that for conventional OFDM systems with a single

transmit/receive antenna. The use of multiple transmit/receive antennas offers more

design freedom that provides further improvements on estimation performance.


2.2      System Description

The block diagram of a MIMO-OFDM system [27, 28] is shown in Figure 2.1. Ba-

sically, the MIMO-OFDM transmitter has Nt parallel transmission paths which are

very similar to the single antenna OFDM system, each branch performing serial-to-

parallel conversion, pilot insertion, N -point IFFT and cyclic extension before the

final TX signals are up-converted to RF and transmitted. It is worth noting that

the channel encoder and the digital modulation, in some spatial multiplexing systems

[28, 29], can also be done per branch, not necessarily implemented jointly over all the

Nt branches. The receiver first must estimate and correct the possible symbol timing

error and frequency offsets, e.g., by using some training symbols in the preamble as

standardized in [37]. Subsequently, the CP is removed and N -point FFT is performed

per receiver branch. In this thesis, the channel estimation algorithm we propose is

based on single-carrier processing, which implies that MIMO detection has to be done per

OFDM subcarrier. Therefore, the received signals of subcarrier k are routed to the

k-th MIMO detector to recover all the Nt data signals transmitted on that subcarrier.

Next, the transmitted symbols per TX antenna are combined and output for the

subsequent operations like digital demodulation and decoding. Finally, all the input

binary data are recovered with a certain BER.

   As a MIMO signalling technique, Nt different signals are transmitted simultaneously

over Nt × Nr transmission paths, and each of the Nr received signals is a

combination of all the Nt transmitted signals and the distorting noise. This brings in

the diversity gain for enhanced system capacity that we desire. Meanwhile, compared

to the SISO system, it complicates the system design with regard to channel estimation

and symbol detection due to the greatly increased number of channel coefficients.

[Figure 2.1: Nt × Nr MIMO-OFDM System model. Block diagram. Transmitter blocks: data source, channel encoder, digital modulator, MIMO encoder, and on each of the Nt branches S/P, IFFT, P/S and CP. Receiver blocks: timing and frequency synchronization, on each of the Nr branches De-CP, S/P, FFT and P/S, then MIMO decoder, channel estimation, digital demodulator, channel decoder and data sink.]

2.2.1    Signal Model

To find the signal model of MIMO-OFDM system, we can follow the same approach

as utilized in the SISO case. Because of the increased number of antennas, the signal

dimension is changed. For instance, the transmitted signal on the k-th subcarrier in

a MIMO system is an Nt × 1 vector, instead of a scalar in the SISO case. For brevity

of presentation, the same notations are used for both the SISO and MIMO cases. But

they are explicitly defined in each case. There are Nt transmit antennas and hence

on each of the N subcarriers, Nt modulated signals are transmitted simultaneously.

Denote S(m) and S(mN + k) as the m-th modulated OFDM symbol in frequency

domain and the k-th modulated subcarrier respectively as

\[
S(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix},
\qquad
S(mN+k) = \begin{bmatrix} S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k) \end{bmatrix},
\tag{2.1}
\]

where Sj (mN + k) represents the k-th modulated subcarrier for the m-th OFDM

symbol transmitted by the j-th antenna. It is scaled by a normalization

factor KMOD so that all the mappings have unit average power.

Taking IFFT of S(m) as a baseband modulation, the resulting time-domain samples

can be expressed as

\[
s(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}
     = \frac{1}{N}\left(F_N^H \otimes I_{N_t}\right) S(m),
\qquad
s(mN+n) = \begin{bmatrix} s_1(mN+n) \\ \vdots \\ s_{N_t}(mN+n) \end{bmatrix}.
\tag{2.2}
\]

Here IFFT is a block-wise operation since each modulated subcarrier is a column

vector and the generalized N Nt -point IFFT matrix is a Kronecker product of FN and

INt . This is just a mathematical expression. In real OFDM systems, however,

the generalized IFFT operation is still performed by Nt parallel N -point IFFTs. To

eliminate the ISI and the ICI, a length-Ng (Ng ≥ L) CP is prepended to the time-

domain samples per branch. The resulting OFDM symbol u(m) is denoted as

\[
u(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix},
\qquad
u(mN_{tot}+n) = \begin{bmatrix} u_1(mN_{tot}+n) \\ \vdots \\ u_{N_t}(mN_{tot}+n) \end{bmatrix}.
\tag{2.3}
\]

In a matrix form, there holds


                                  u(m) = ACP s(m),                                   (2.4)


where

\[
A_{CP} = \begin{bmatrix} 0 & I_{N_g} \\ I_{N-N_g} & 0 \\ 0 & I_{N_g} \end{bmatrix} \otimes I_{N_t}.
\]
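
As a numerical check of (2.2) and (2.4), the following sketch builds the Kronecker-product IFFT and the CP-insertion matrix and compares them with Nt parallel N-point IFFTs followed by copying the last Ng samples to the front. It is a minimal illustration under stated assumptions: the sizes `N`, `Nt`, `Ng` are hypothetical, FN is taken to be the unnormalized DFT matrix with entries exp(−j2πkn/N) (consistent with the 1/N factor in (2.2)), and the stacked vectors are stored as arrays with one row per subcarrier or time sample.

```python
import numpy as np

rng = np.random.default_rng(1)
N, Nt, Ng = 8, 2, 3          # hypothetical FFT size, TX antennas, CP length
Ntot = N + Ng

# S(m): row k holds the Nt x 1 vector S(mN + k); flattening row by row
# gives the stacked N*Nt vector used in (2.1).
S = rng.standard_normal((N, Nt)) + 1j * rng.standard_normal((N, Nt))

# Unnormalized DFT matrix F_N with entries exp(-j 2 pi k n / N) (assumed convention).
F = np.fft.fft(np.eye(N))

# (2.2): s(m) = (1/N) (F_N^H kron I_Nt) S(m)
s_kron = (np.kron(F.conj().T, np.eye(Nt)) @ S.reshape(-1)) / N

# Equivalent implementation: one N-point IFFT per transmit antenna.
s_parallel = np.fft.ifft(S, axis=0)

# (2.4): u(m) = A_CP s(m). The lower block I_N is the source's
# [[I_{N-Ng}, 0], [0, I_Ng]] written as a single identity.
A_cp_scalar = np.vstack([
    np.hstack([np.zeros((Ng, N - Ng)), np.eye(Ng)]),
    np.eye(N),
])
A_cp = np.kron(A_cp_scalar, np.eye(Nt))
u = (A_cp @ s_kron).reshape(Ntot, Nt)   # row n holds u(m*Ntot + n)
```

The first Ng rows of `u` repeat the last Ng rows of the IFFT output, which is exactly the cyclic prefix of each branch.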
The time-domain samples denoted by u(m) may be directly converted to RF for

transmission or be up-converted to IF first and then transmitted over the wireless

MIMO channel. For the MIMO channel, we assume in this thesis that the MIMO-

OFDM system is operating in a frequency-selective Rayleigh fading environment and

that the communication channel remains constant during a frame transmission, i.e.,

quasi-static fading. Suppose that the channel impulse response can be recorded with

L time instances, i.e., time samples, then the multipath fading channel between the

j-th TX and i-th RX antenna can be modeled by a discrete-time complex base-

band equivalent (L − 1)-th order FIR filter with filter coefficients hij (l, m), with

l ∈ {0, . . . , L − 1} and integer m > 0. As assumed in the SISO case, these CIR coef-

ficients {hij (0, m), . . . , hij (L − 1, m)} are independent complex zero-mean Gaussian

RVs with variance $\tfrac{1}{2}P_l$ per dimension. The total power of the channel power delay

profile {P0 , . . . , PL−1 } is normalized to be $\sigma_c^2 = 1$. Let hm be the CIR matrix and

denote hl,m as the l-th matrix-valued CIR coefficient:

\[
h_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix};
\qquad
h_{l,m} = \begin{bmatrix}
h_{11}(l,m) & \cdots & h_{1N_t}(l,m) \\
\vdots & \ddots & \vdots \\
h_{N_r 1}(l,m) & \cdots & h_{N_r N_t}(l,m)
\end{bmatrix}.
\tag{2.5}
\]

In addition, we assume that those Nt Nr geographically co-located multipath channels

are independent in an environment full of scattering. From an information-theoretic point

of view [21, 22], this guarantees the capacity gain of MIMO systems. For the practical

MIMO-OFDM systems, it enforces a lower limit on the shortest distance between

multiple antennas at a portable receiver unit. If the correlation between those chan-

nels exists, the diversity gain from MIMO system will be reduced and hence system

performance is degraded.

   At the receive side, an Nr -dimensional complex baseband equivalent receive signal

can be obtained by a matrix-based discrete-time convolution as

\[
r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\, u(mN_{tot}+n-l) + v(mN_{tot}+n),
\tag{2.6}
\]


where

\[
r(mN_{tot}+n) = \begin{bmatrix} r_1(mN_{tot}+n) \\ \vdots \\ r_{N_r}(mN_{tot}+n) \end{bmatrix},
\qquad
v(mN_{tot}+n) = \begin{bmatrix} v_1(mN_{tot}+n) \\ \vdots \\ v_{N_r}(mN_{tot}+n) \end{bmatrix}.
\]

Note that vi (mNtot +n) is assumed to be complex AWGN with zero mean and variance

of $\tfrac{1}{2}\sigma_v^2$ per dimension. Therefore, the expected signal-to-noise ratio (SNR) per receive

antenna is $N_t/\sigma_v^2$. In order to have a fair comparison with SISO systems, the power

per TX antenna should be scaled down by a factor of Nt . By stacking the received

samples at discrete time instances, r(m) can be described by

\[
r(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix}.
\tag{2.7}
\]

To combat the ISI, the first Ng Nr elements of r(m) must be removed completely. The

resulting ISI-free OFDM symbol y(m) is

\[
y(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = A_{DeCP}\, r(m),
\tag{2.8}
\]

where

\[
A_{DeCP} = \begin{bmatrix} 0_{N \times N_g} & I_N \end{bmatrix} \otimes I_{N_r}.
\]

By exploiting the property that u(m) is a cyclic extension of s(m) so that cyclic

discrete-time convolution is valid, the relation between s(m) and y(m) can be ex-

pressed as

                         y(m) = hm,Cir s(m) + ADeCP v(m),                     (2.9)

where hm,Cir is an N Nr × N Nt block circulant matrix. In general, an N Nr × N Nt

block circulant matrix is fully defined by its first block column, of size N Nr × Nt . In our

case, hm,Cir is determined by

\[
\begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ 0_{(N-L)N_r \times N_t} \end{bmatrix}.
\]
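
The step from (2.6) and (2.8) to (2.9) can be checked numerically. The sketch below is a minimal, noise-free illustration with hypothetical sizes and random data: it runs the linear convolution of (2.6) over a CP-extended block (with a random previous block supplying the ISI), removes the CP, and compares the result with the product of the block circulant matrix and s(m). The array layout (one row per time sample) and all variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
N, Nt, Nr, L, Ng = 8, 2, 3, 3, 4   # hypothetical sizes with Ng >= L
Ntot = N + Ng

def crandn(*shape):
    """Complex standard-normal samples of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

h = crandn(L, Nr, Nt)              # h[l] plays the role of h_{l,m}
s = crandn(N, Nt)                  # row n is s(mN + n)
u = np.vstack([s[N - Ng:], s])     # prepend the cyclic prefix, as in (2.4)
u_prev = crandn(Ntot, Nt)          # previous OFDM symbol (source of ISI)

# Noise-free version of (2.6): r(n) = sum_l h_l u(n - l).
# Indices n - l < 0 fall into the previous symbol.
stream = np.vstack([u_prev, u])
r = np.zeros((Ntot, Nr), dtype=complex)
for n in range(Ntot):
    for l in range(L):
        r[n] += h[l] @ stream[Ntot + n - l]

y = r[Ng:]                         # CP removal, as in (2.8)

# Block circulant matrix built from its first block column [h_0; ...; h_{L-1}; 0].
first_col = np.zeros((N, Nr, Nt), dtype=complex)
first_col[:L] = h
h_cir = np.zeros((N * Nr, N * Nt), dtype=complex)
for i in range(N):
    for j in range(N):
        h_cir[i * Nr:(i + 1) * Nr, j * Nt:(j + 1) * Nt] = first_col[(i - j) % N]

y_circ = (h_cir @ s.reshape(-1)).reshape(N, Nr)   # noise-free (2.9)
```

Because the CP is at least as long as the channel, the samples of the previous block only reach the discarded part of r(m), and the remaining samples equal a block-wise circular convolution.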

Finally taking FFT on the y(m) at the receiver, we obtain the frequency domain

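MIMO-OFDM baseband signal model

\[
\begin{aligned}
Y(m) &= (F_N \otimes I_{N_r})\, y(m) \\
     &= (F_N \otimes I_{N_r})\left(h_{m,Cir}\, s(m) + A_{DeCP}\, v(m)\right) \\
     &= \tfrac{1}{N}(F_N \otimes I_{N_r})\, h_{m,Cir}\, (F_N^H \otimes I_{N_t})\, S(m) + (F_N \otimes I_{N_r})\, A_{DeCP}\, v(m) \\
     &= H_{m,diag}\, S(m) + V(m).
\end{aligned}
\tag{2.10}
\]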

In the above expression, V (m) represents the frequency domain noise, which is i.i.d.

(independent and identically distributed) zero-mean complex Gaussian random

variable with variance $\tfrac{1}{2}\sigma_v^2$ per dimension, and Hm,diag is a block diagonal matrix

which is given by

\[
H_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix}.
\]

The k-th block diagonal element is the frequency response of the MIMO channel at

the k-th subcarrier and can be shown to be $H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\, e^{-j\frac{2\pi}{N}kl}$. So for that

subcarrier, we may write it in a simpler form


                    Y (mN + k) = Hk,m S(mN + k) + V (mN + k),                           (2.11)


where

\[
H_{k,m} = \begin{bmatrix}
H_{11}(k,m) & \cdots & H_{1N_t}(k,m) \\
\vdots & \ddots & \vdots \\
H_{N_r 1}(k,m) & \cdots & H_{N_r N_t}(k,m)
\end{bmatrix}.
\]
This leads to a flat-fading signal model per subcarrier and it is similar to the SISO

signal model, except that Hk,m is an Nr × Nt matrix.
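
The per-subcarrier model (2.11) can be verified in a few lines. The sketch below is a noise-free illustration with hypothetical sizes and random data: it computes Hk,m as the N-point FFT of the matrix-valued CIR taps, passes a block through the time-domain chain (IFFT, circular convolution, FFT), and compares the result with the product Hk,m S(mN + k) on every subcarrier. The circular convolution stands in for CP insertion and removal, and the array layout is an assumption of this example.

```python
import numpy as np

rng = np.random.default_rng(3)
N, Nt, Nr, L = 8, 2, 2, 3          # hypothetical sizes

def crandn(*shape):
    """Complex standard-normal samples of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

h = crandn(L, Nr, Nt)              # matrix-valued CIR taps h_{l,m}
S = crandn(N, Nt)                  # row k is S(mN + k)

# H_{k,m} = sum_l h_{l,m} exp(-j 2 pi k l / N): a zero-padded N-point FFT over the tap axis.
H = np.fft.fft(h, n=N, axis=0)     # H[k] is the Nr x Nt matrix H_{k,m}

# Time-domain path, noise free: IFFT per TX antenna, circular convolution, FFT per RX antenna.
s = np.fft.ifft(S, axis=0)
y = np.zeros((N, Nr), dtype=complex)
for n in range(N):
    for l in range(L):
        y[n] += h[l] @ s[(n - l) % N]
Y_time = np.fft.fft(y, axis=0)

# Frequency-domain path, as in (2.11): Y(k) = H_k S(k) on every subcarrier.
Y_freq = np.einsum('kij,kj->ki', H, S)
```

The agreement of `Y_time` and `Y_freq` is the flat-fading property per subcarrier: each subcarrier sees a single Nr × Nt matrix rather than a convolution.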

2.2.2    Preliminary Analysis

Based on those assumptions such as perfect synchronization and block fading, we end

up with a compact and simple signal model for both the single antenna OFDM and

MIMO-OFDM systems. It is an idealized model: in a noise-free

scenario, the received signal on the k-th subcarrier is just the product (matrix

product in the MIMO case) of the transmitted signal on the k-th subcarrier and the

discrete-time channel frequency response at the k-th subcarrier. Noise in the frequency

domain is modeled as an additive term. When it comes to channel estimation

for OFDM systems, this model is still valid since we assume there is no ICI.

   For channel estimation of MIMO-OFDM systems, it is appropriate to estimate

the channel in the time domain rather than in the frequency domain because there are fewer

parameters in the impulse response (Nt Nr L coefficients) than in the frequency re-

sponse (Nt Nr N coefficients). Given the limited number of training data that can be

sent to estimate the fast time-varying channel, limiting the number of parameters to

be estimated would increase the accuracy of the estimation. This is the thrust of the

estimation technique in this thesis. The estimation algorithm we propose is based on

pilot tones, namely known data in the frequency domain. Since the signal model of

OFDM in (2.11) is in the frequency domain too, it is necessary to find the relations

between the CFR and the CIR. The discrete Fourier transform (DFT) describes this relation:
\[
H_m = F_{NN_r} \begin{bmatrix} h_m \\ 0_{(N-L)N_r \times N_t} \end{bmatrix},
\]
where
\[
H_m = \begin{bmatrix} H_{0,m} \\ \vdots \\ H_{N-1,m} \end{bmatrix};
\qquad F_{NN_r} = F_N \otimes I_{N_r}.
\]
Since the channel length L is less than the FFT size N, only the first $LN_r$ columns
of the FFT matrix $F_{NN_r}$ are involved in the calculation. This gives another form
of the relation:

\[
H_m = F_{NN_r}(:, 1:N_rL)\, h_m, \tag{2.12}
\]

where $F_{NN_r}(:, 1:N_rL)$ is the $NN_r \times N_rL$ submatrix of $F_{NN_r}$ consisting of its
first $N_rL$ columns. $F_{NN_r}(:, 1:N_rL)$ is a 'tall' matrix with full column rank, so its
left inverse exists and (2.12) is an overdetermined system. To determine $h_m$, we can
simply multiply both sides of the equation by the left inverse of $F_{NN_r}(:, 1:N_rL)$.
Doing so requires full knowledge of the channel frequency response matrix $H_m$, which
is not actually necessary: if we know L of the N matrices $\{H_{0,m}, \ldots, H_{N-1,m}\}$,
then $h_m$ can be calculated. For example, in the SISO case, if we know the channel
frequency response at any L subcarriers $\{H_{k_1,m}, \ldots, H_{k_L,m}\}$, then the channel
impulse response h(m) can be uniquely determined. This is the basis for pilot-tone based
channel estimation of OFDM systems. Pilot-tones are the selected subcarriers over
which the training data are sent. The question then arises as to which tones should be
used as pilot-tones and how the selection of pilot-tones affects the quality of estimation.


Cioffi's paper [49] first addressed this issue, showing that one should choose sets of
equally-spaced tones as pilot tones to avoid the noise enhancement effect in interpolating
the channel impulse response from the frequency response. Assume that N = ML with the
integer M > 1. This is a realistic assumption since the OFDM block size N is often
chosen to be 128, 256 or even larger, while the length of a MIMO-OFDM channel is
usually not greater than 30. For the typical urban (TU) model [47] of delay
profile with RMS delay τrms = 1.06 µs, the channel length is L = τrms × 20 MHz + 1
≈ 23 in an 802.11a system with a bandwidth of 20 MHz. In systems like DVB-T and
WiMax [40, 41], N is much larger still. Since N = ML, there are M
equally-sized pilot-tone sets. Define
                                                                       
\[
H_m^{(p)} = \begin{bmatrix} H_{p,m} \\ \vdots \\ H_{p+(L-1)M,m} \end{bmatrix},
\qquad
W_N^{(p)} = \begin{bmatrix}
1 & & & \\
 & W_N^{p} & & \\
 & & \ddots & \\
 & & & W_N^{p(L-1)}
\end{bmatrix} \otimes I_{N_r},
\tag{2.13}
\]
where p is any integer such that $0 \le p \le M-1$ and $W_N = e^{-j\frac{2\pi}{N}}$. Clearly $H_m^{(p)}$ is
the p-th down-sampled version of $H_m$, and $W_N^{(p)}$ simply acts as a shift operator of
order p. The CFR matrix $H_m$ can be decomposed into M disjoint down-sampled
submatrices $\{H_m^{(p)}\}_{p=0}^{M-1}$, each composed of L equally-spaced CFR sample matrices. It
can be verified via straightforward calculation that

\[
H_m^{(p)} = F_{LN_r} W_N^{(p)} h_m, \qquad p = 0, 1, \cdots, M-1, \tag{2.14}
\]


where $F_{LN_r}$ is an $LN_r \times LN_r$ DFT matrix. It indicates that the channel state
information represented by $h_m$ can be obtained from a down-sampled version of $H_m$, i.e.,
$H_m^{(p)}$, which only requires us to probe the unknown channel frequency response with
some training data on the selected p-th pilot-tones set. The procedure of pilot-tone
based channel estimation is illustrated in Figure 2.2.
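
For the SISO case ($N_r = N_t = 1$), relation (2.14) can be inverted in closed form: an inverse L-point DFT undoes $F_L$, and a multiplication by $W_N^{-pl}$ undoes the shift. The sketch below illustrates this under assumed values (N = 32, L = 4 and the listed taps are hypothetical, as are the function names).

```python
import cmath

def cfr(h, N, k):
    """H_k = sum_l h[l] * exp(-j*2*pi*k*l/N) for an L-tap channel h."""
    return sum(h[l] * cmath.exp(-2j * cmath.pi * k * l / N) for l in range(len(h)))

def cir_from_pilot_set(Hp, N, p):
    """Invert H^(p) = F_L W^(p) h in the SISO case.

    Hp holds the CFR at subcarriers p, p+M, ..., p+(L-1)M with M = N/L.
    """
    L = len(Hp)
    h = []
    for l in range(L):
        # The inverse L-point DFT undoes F_L ...
        shifted = sum(Hp[i] * cmath.exp(2j * cmath.pi * i * l / L) for i in range(L)) / L
        # ... and multiplying by W_N^(-p*l) undoes the diagonal shift W^(p).
        h.append(shifted * cmath.exp(2j * cmath.pi * p * l / N))
    return h

N, L = 32, 4
M = N // L
h = [0.9, -0.4 + 0.3j, 0.2j, 0.05 - 0.1j]     # illustrative channel taps

# Every one of the M down-sampled CFR sets recovers the same CIR.
recovered = {p: cir_from_pilot_set([cfr(h, N, p + i * M) for i in range(L)], N, p)
             for p in range(M)}
worst = max(abs(recovered[p][l] - h[l]) for p in range(M) for l in range(L))
print(worst < 1e-9)
```

Only L of the N frequency samples are used for each value of p, which is why a single pilot-tone set suffices in the SISO case.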


              Figure 2.2: The concept of pilot-based channel estimation (block diagram relating the pilots S^(P)(m), the received signal Y^(P)(m) = S^(P)(m) hC(m) + V^(P)(m), and the channel hC(m))


   And it is also true that

\[
H_m^{(p)}(:, i) = F_{LN_r} W_N^{(p)} h_m(:, i), \tag{2.15}
\]

where $H_m^{(p)}(:, i)$ and $h_m(:, i)$ are the i-th columns of $H_m^{(p)}$ and $h_m$ respectively and
$1 \le i \le N_t$. After discussing the relation between the CIR $h_m$ and the p-th down-sampled
CFR $H_m^{(p)}$, we return to the input-output relationship of the MIMO-OFDM system


                        Y (mN + k) = Hk,m S(mN + k) + V (mN + k),                                    (2.16)


where
\[
Y(mN+k) = \begin{bmatrix} Y_1(mN+k) \\ \vdots \\ Y_{N_r}(mN+k) \end{bmatrix}; \quad
S(mN+k) = \begin{bmatrix} S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k) \end{bmatrix}; \quad
V(mN+k) = \begin{bmatrix} V_1(mN+k) \\ \vdots \\ V_{N_r}(mN+k) \end{bmatrix}
\]
are the received signal, the transmitted signal and the noise term respectively as

defined in the previous section. They are repeated here for convenience. In order to

get a useful form for channel estimation based on pilot-tones, we have to manipulate

the expression in (2.16) so that the transmitted signal and the CFR terms exchange

their position in the product. (2.16) can be equivalently rewritten as

\[
Y(mN+k) = S_1(mN+k) H_{k,m}(:, 1) + \cdots + S_{N_t}(mN+k) H_{k,m}(:, N_t) + V(mN+k). \tag{2.17}
\]

In other words, we rewrite the product of a matrix and a vector as a sum of
products of a scalar and a vector. The noise term remains unchanged. This
transformation is specific to the k-th subcarrier. If we consider all N subcarriers, we
need to stack the {Y (mN + k)}'s and the {Hk,m (:, i)}'s together and construct a block
diagonal matrix for the {S(mN + k)}'s. It can be shown that

       Y (m) = Sdiag,1 (m)Hm (:, 1) + · · · + Sdiag,Nt (m)Hm (:, Nt ) + V (m),                                         (2.18)

where Y(m) and V(m) are the received signal and the noise term respectively given
by
\[
Y(m) = \begin{bmatrix} Y(mN) \\ \vdots \\ Y(mN+N-1) \end{bmatrix}; \quad
V(m) = \begin{bmatrix} V(mN) \\ \vdots \\ V(mN+N-1) \end{bmatrix}; \quad
H_m(:, i) = \begin{bmatrix} H_{0,m}(:, i) \\ \vdots \\ H_{N-1,m}(:, i) \end{bmatrix},
\]
and
\[
S_{diag,i}(m) = \begin{bmatrix}
S_{diag,i}(mN) & & \\
 & \ddots & \\
 & & S_{diag,i}(mN+N-1)
\end{bmatrix} \otimes I_{N_r}; \qquad 1 \le i \le N_t.
\]

Here the dimensions of the above column vectors and matrices are very large, for

instance, Y (m) is an N Nr × 1 column vector. The computational load, however, is


not changed since Sdiag,i (m) is a block diagonal matrix, compared to the expression

in (2.10).

    As proved in [49], pilot-tones should be equally-powered and equally-spaced to
achieve the MMSE (minimum mean squared error) of channel estimation. Let
$\{S_i(mN+p), S_i(mN+M+p), \cdots, S_i(mN+(L-1)M+p)\}$ represent a set of L pilot-tones
with index p which are transmitted simultaneously with the other N − L data
signals in the m-th block from the i-th antenna. One pilot-tone is thus placed
every M subcarriers in one OFDM block. Hence we can obtain a down-sampled
version of equation (2.18) by selecting one element every M subcarriers. Since
we assume that there is no ICI, we can neglect the data symbols that are transmitted
together with the pilot symbols. We only consider the p-th set of pilot-tones on the p-th,
the (p + M)-th, ..., and the (p + (L − 1)M)-th subcarriers, and likewise for the received
signals. This gives

\[
Y^{(p)}(m) = S_{diag,1}^{(p)}(m) H_m^{(p)}(:, 1) + \cdots + S_{diag,N_t}^{(p)}(m) H_m^{(p)}(:, N_t) + V^{(p)}(m), \tag{2.19}
\]


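where
\[
Y^{(p)}(m) = \begin{bmatrix} Y(mN+p) \\ \vdots \\ Y(mN+(L-1)M+p) \end{bmatrix}; \quad
V^{(p)}(m) = \begin{bmatrix} V(mN+p) \\ \vdots \\ V(mN+(L-1)M+p) \end{bmatrix}; \quad
H_m^{(p)}(:, i) = \begin{bmatrix} H_{p,m}(:, i) \\ \vdots \\ H_{(L-1)M+p,m}(:, i) \end{bmatrix},
\]
and
\[
S_{diag,i}^{(p)}(m) = \begin{bmatrix}
S_{diag,i}(mN+p) & & \\
 & \ddots & \\
 & & S_{diag,i}(mN+(L-1)M+p)
\end{bmatrix} \otimes I_{N_r}; \qquad 1 \le i \le N_t
\]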



are all the p-th down-sampled versions. In equation (2.19), we obtain the relation
between $Y^{(p)}(m)$ and $H_m^{(p)}(:, i)$. To estimate the channel in the time domain, we need
to explicitly relate $Y^{(p)}(m)$ to $h_m$. Plugging (2.15) into (2.19) yields
\[
Y^{(p)}(m) = S_{diag,1}^{(p)}(m) F_{LN_r} W_N^{(p)} h_m(:, 1) + \cdots + S_{diag,N_t}^{(p)}(m) F_{LN_r} W_N^{(p)} h_m(:, N_t) + V^{(p)}(m). \tag{2.20}
\]

To estimate the unknowns $\{h_m(:, 1), \cdots, h_m(:, N_t)\}$, one set of pilot-tones is not
adequate. This differs from the SISO case, in which any one of the M
pilot-tone sets can be used to estimate the channel. For MIMO-OFDM channel
estimation, we need at least $N_t$ disjoint sets of pilot-tones, indexed by $\{p_1, p_2, \ldots, p_{N_t}\}$.
Since N = ML is assumed, there are M = N/L different sets in total, so $N_t \le M$. This
imposes a constraint on the selection of the FFT size N for MIMO systems, i.e.,
$N \ge N_t L$. This observation tallies with the result in [48]. In practice, the selection
of N determines the number of subcarriers used in the system. For systems like
WLAN and WiMax [39, 40], N is not very large because a larger N means narrower
subcarrier spacing, which may cause severe ICI. Furthermore, those systems often
operate in low-SNR environments.
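
The bookkeeping behind the constraint $N \ge N_t L$ can be sketched in a few lines. The helper names `pilot_sets` and `max_transmit_antennas` are hypothetical, and subcarriers are numbered from 0 within one OFDM block; the values N = 32 and L = 4 are chosen only for illustration.

```python
def pilot_sets(N, L):
    """Return the M = N/L disjoint sets of equally-spaced subcarrier indices.

    Set p contains the subcarriers p, p+M, ..., p+(L-1)M.
    """
    if N % L != 0:
        raise ValueError("this sketch assumes N = M*L with integer M")
    M = N // L
    return [[p + i * M for i in range(L)] for p in range(M)]

def max_transmit_antennas(N, L):
    """Largest Nt for which Nt disjoint pilot-tone sets exist (Nt <= N/L)."""
    return N // L

sets = pilot_sets(32, 4)
print(len(sets))                      # number of available pilot-tone sets
print(sets[0], sets[1])               # two disjoint sets, as needed for Nt = 2
print(max_transmit_antennas(32, 4))   # upper bound on Nt for this N and L
```

Each subcarrier belongs to exactly one set, so choosing $N_t$ distinct values of p automatically gives disjoint pilot-tone sets.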


2.3             Channel Estimation and Pilot-tone Design
2.3.1            LS Channel Estimation

Assume that we have Nt disjoint sets of pilot-tones. Then we have the following

observation equations.
\[
\begin{aligned}
Y^{(p_1)}(m) &= S_{diag,1}^{(p_1)}(m) F_{LN_r} W_N^{(p_1)} h_m(:, 1) + \cdots + S_{diag,N_t}^{(p_1)}(m) F_{LN_r} W_N^{(p_1)} h_m(:, N_t) + V^{(p_1)}(m) \\
&\;\;\vdots \\
Y^{(p_{N_t})}(m) &= S_{diag,1}^{(p_{N_t})}(m) F_{LN_r} W_N^{(p_{N_t})} h_m(:, 1) + \cdots + S_{diag,N_t}^{(p_{N_t})}(m) F_{LN_r} W_N^{(p_{N_t})} h_m(:, N_t) + V^{(p_{N_t})}(m)
\end{aligned}
\tag{2.21}
\]



To use the LS (least squares) method for channel estimation, we put those
observation equations into matrix form. LS is a well-known and widely used
estimation method. We choose LS rather than other methods such as MMSE channel
estimation for its simplicity of implementation. In matrix form, the observation
equations are


                                Y (P ) (m) = S(P ) (m)hC (m) + V (P ) (m),                               (2.22)


where
\[
Y^{(P)}(m) = \begin{bmatrix} Y^{(p_1)}(m) \\ \vdots \\ Y^{(p_{N_t})}(m) \end{bmatrix}; \quad
h_C(m) = \begin{bmatrix} h_m(:, 1) \\ \vdots \\ h_m(:, N_t) \end{bmatrix}; \quad
V^{(P)}(m) = \begin{bmatrix} V^{(p_1)}(m) \\ \vdots \\ V^{(p_{N_t})}(m) \end{bmatrix},
\]
and
\[
S^{(P)}(m) = \begin{bmatrix}
S_{diag,1}^{(p_1)}(m) F_{LN_r} W_N^{(p_1)} & \cdots & S_{diag,N_t}^{(p_1)}(m) F_{LN_r} W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
S_{diag,1}^{(p_{N_t})}(m) F_{LN_r} W_N^{(p_{N_t})} & \cdots & S_{diag,N_t}^{(p_{N_t})}(m) F_{LN_r} W_N^{(p_{N_t})}
\end{bmatrix}.
\]



In the above expression, $S^{(P)}(m)$ is an $N_tN_rL \times N_tN_rL$ square matrix, composed of $N_t^2$
pilot-tone block matrices $\{S_{diag,j}^{(p_i)}(m)\}_{i,j=1}^{N_t}$. At each transmit antenna, $N_t$ sets of
pilot-tones are transmitted with the same indices $\{p_1, p_2, \cdots, p_{N_t}\}$. Assume that
$N_t \le M = \frac{N}{L}$. It can also be seen that the total number of unknown CIR parameters $N_tN_rL$
cannot be greater than the total number of received signals $NN_r$, i.e.,
$N_tN_rL \le NN_r \Leftrightarrow N_tL \le N \Leftrightarrow N_t \le \frac{N}{L}$.

      The standard solution to the LS channel estimates [50] is known as

\[
\hat{h}_{C,LS}(m) = [(S^{(P)}(m))^H S^{(P)}(m)]^{-1} (S^{(P)}(m))^H Y^{(P)}(m). \tag{2.23}
\]


The matrix $S^{(P)}(m)$ is very large: it has $N_t^2 N_r^2 L^2$ elements. Computing the
inverse of such a large matrix is undesirable. Therefore, an intuitive solution is to
design the square matrix $S^{(P)}(m)$ such that
$(S^{(P)}(m))^H S^{(P)}(m) = S^{(P)}(m)(S^{(P)}(m))^H = aI_{N_tN_rL}$, $a \in \mathbb{R}^+$, or equivalently
$\frac{1}{\sqrt{a}} S^{(P)}(m)$ is a unitary matrix. Substituting (2.22) into (2.23), the LS channel
estimate is then easily obtained as

\[
\hat{h}_{C,LS}(m) = \frac{1}{a}(S^{(P)}(m))^H Y^{(P)}(m) = h_C(m) + \frac{1}{a}(S^{(P)}(m))^H V^{(P)}(m). \tag{2.24}
\]
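
When $(S^{(P)}(m))^H S^{(P)}(m) = aI$, the LS estimate needs no matrix inversion. The sketch below shows this on a deliberately small 2 × 2 example; the matrix entries, channel values and noise values are illustrative assumptions, and the helper names are hypothetical.

```python
def conj_transpose(A):
    """Hermitian transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def ls_estimate(S, Y, a):
    """LS estimate (1/a) * S^H * Y; valid only when S^H S = a * I."""
    return [v / a for v in matvec(conj_transpose(S), Y)]

# A 2 x 2 matrix with orthogonal columns of squared norm a = 4 (illustrative).
x, y = 1 + 1j, 1 - 1j
a = 4.0
S = [[x, y], [-y.conjugate(), x.conjugate()]]

h = [0.8 - 0.3j, -0.2 + 0.5j]        # unknown channel vector (illustrative)
V = [0.01 + 0j, -0.02j]              # noise samples (illustrative)
Y = [s + v for s, v in zip(matvec(S, h), V)]

h_hat = ls_estimate(S, Y, a)
noise_term = ls_estimate(S, V, a)    # (1/a) * S^H * V

# The estimation error is exactly the noise term, as in (2.24).
err = max(abs(h_hat[i] - h[i] - noise_term[i]) for i in range(2))
noise_free = ls_estimate(S, matvec(S, h), a)
print(err < 1e-12)
```

Without noise the estimate coincides with the true channel vector, and with noise the error is the projected noise scaled by 1/a.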

2.3.2     Pilot-tone Design

In order to have a simple and efficient LS algorithm for channel estimation, we have

to design the square matrix S(P ) (m) deliberately. In this section, the design will be

illustrated by a theorem and an example.

   The preamble design discussed in [50] adopted Tarokh’s approach [18] to space-

time block code construction. It could be related to orthogonal design to which

our pilot-tone design also has a connection. In each of the first Nt training blocks

in a frame, a group of at least L pilot-tones are equally-placed and all the other

tones are set to zeros. LS channel estimation can then be obtained based on the

known pilot-tones. The channel is assumed to be unchanged for the rest of the whole

frame. In a mobile environment, however, we cannot guarantee that the channel state

information estimated at the m-th block still holds true at the (m + Nt )-th block.

Hence the preamble design in [50] is not suitable to be applied to the fast time-varying

channels. In addition to this common disadvantage, the training sequences designed

in [48] have to satisfy a condition called local orthogonality. It requires that, for the

Nt different training sequences with length N , they are orthogonal over the minimum

set of elements for any starting position. The pilot design proposed in this paper aims
to remove the disadvantage and the constraint mentioned above. It has its
roots in Table I of [16], but it is implemented in the space and frequency domains
rather than in the space and time domains. We explicitly connect
pilot-tone design with space-frequency coding so that we gain more insight into its
design. Denote $E_P$ as the fixed total power for all the pilot-tones at each transmit
antenna. Then the power allocated to each pilot-tone is $\frac{E_P}{N_t L}$, since the pilot-tones are
all equally spaced and equally powered. In some systems, the power of the pilot-tones
could be larger than the power of the data symbols for a better estimation of the wireless
channel. We assume in our work that the pilot-tones and other data are all equally
normalized such that the average power for all different mappings is the same. Our
pilot-tone design is illustrated in the following theorem.

Theorem 2.1 Let $S_{diag,j}^{(p_i)}(m) = \alpha_{p_i,j} I_{LN_r}$, $|\alpha_{p_i,j}| = \sqrt{\frac{E_P}{N_t L}}$, $i, j = 1, 2, \cdots, N_t$. Then
$\frac{1}{\sqrt{E_P}} S^{(P)}(m)$ is a unitary matrix if
\[
S_{SFC}^{(P)}(m) = \sqrt{\frac{L}{E_P}} \begin{bmatrix}
S_{diag,1}^{(p_1)}(m) & \cdots & S_{diag,N_t}^{(p_1)}(m) \\
\vdots & \ddots & \vdots \\
S_{diag,1}^{(p_{N_t})}(m) & \cdots & S_{diag,N_t}^{(p_{N_t})}(m)
\end{bmatrix}
\]
is a unitary matrix.


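Proof.
\[
\begin{aligned}
S^{(P)}(m)
&= \begin{bmatrix}
F_{LN_r} S_{diag,1}^{(p_1)}(m) W_N^{(p_1)} & \cdots & F_{LN_r} S_{diag,N_t}^{(p_1)}(m) W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
F_{LN_r} S_{diag,1}^{(p_{N_t})}(m) W_N^{(p_{N_t})} & \cdots & F_{LN_r} S_{diag,N_t}^{(p_{N_t})}(m) W_N^{(p_{N_t})}
\end{bmatrix} \\
&= \begin{bmatrix}
F_{LN_r} W_N^{(p_1)} S_{diag,1}^{(p_1)}(m) & \cdots & F_{LN_r} W_N^{(p_1)} S_{diag,N_t}^{(p_1)}(m) \\
\vdots & \ddots & \vdots \\
F_{LN_r} W_N^{(p_{N_t})} S_{diag,1}^{(p_{N_t})}(m) & \cdots & F_{LN_r} W_N^{(p_{N_t})} S_{diag,N_t}^{(p_{N_t})}(m)
\end{bmatrix} \\
&= \mathcal{F}_{LN_r} W_N^{(P)} \left(\sqrt{\frac{E_P}{L}}\right) S_{SFC}^{(P)}(m),
\end{aligned}
\]
where
\[
\mathcal{F}_{LN_r} = \begin{bmatrix} F_{LN_r} & & \\ & \ddots & \\ & & F_{LN_r} \end{bmatrix}, \qquad
W_N^{(P)} = \begin{bmatrix} W_N^{(p_1)} & & \\ & \ddots & \\ & & W_N^{(p_{N_t})} \end{bmatrix}.
\]
It is easy to see that $\mathcal{F}_{LN_r}^H \mathcal{F}_{LN_r} = \mathcal{F}_{LN_r} \mathcal{F}_{LN_r}^H = LI_{N_tN_rL}$ and $W_N^{(P)}$ is a unitary matrix.
Hence $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_P I_{N_tN_rL}$. This completes the
proof. □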


   Each of the $N_t$ pilot-tone sets consists of L identical pilot symbols, so every
block $S_{diag,j}^{(p_i)}(m)$ is a scalar multiple of the identity matrix. This is what allows the
blocks to be moved past $F_{LN_r}$ and $W_N^{(p_i)}$ in the proof: in general the product AB of two
square matrices is not equal to BA, but the two are equal when B is a scalar multiple of
the identity matrix. This assumption greatly simplifies the pilot-tone design
for a MIMO-OFDM system with a large number of transmit antennas, since it reduces to
the design of a square orthogonal matrix. Hence we are more interested in the design
of $S_{SFC}^{(P)}(m)$. First we consider a simple example with 2 transmit antennas and 2
receive antennas, i.e., $N_t = N_r = 2$ in the previous equations. Assume the channel
length L = 4. By Theorem 2.1, we use Alamouti's structure [16]
                                              
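\[
\begin{bmatrix} x & y \\ -y^* & x^* \end{bmatrix}, \qquad |x|^2 + |y|^2 = \frac{E_P}{4}, \qquad x, y \in \mathbb{C}.
\]
The above leads to the design
\[
S_{SFC}^{(P)}(m) = \sqrt{\frac{4}{E_P}} \begin{bmatrix}
S_{diag,1}^{(p_1)}(m) & S_{diag,2}^{(p_1)}(m) \\
S_{diag,1}^{(p_2)}(m) & S_{diag,2}^{(p_2)}(m)
\end{bmatrix}, \tag{2.25}
\]
where
\[
S_{diag,1}^{(p_1)}(m) = xI_8, \quad S_{diag,2}^{(p_1)}(m) = yI_8, \quad
S_{diag,1}^{(p_2)}(m) = -y^* I_8, \quad S_{diag,2}^{(p_2)}(m) = x^* I_8.
\]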


The placement of pilot-tones in the example is shown in Figure 2.3. In the figure,
the red and purple square boxes denote the first and the second pilot-tone sets for
TX antenna 1 respectively, and the green and light blue boxes denote those for TX
antenna 2. They are all equally-spaced, and the same color within each set implies that
the pilot symbols are the same. In this example there are 16 pilot-tones in total,
and they are easily allocated to the two TX antennas by our proposed method. The
square matrix $S_{SFC}^{(P)}(m)$ is actually a space-frequency code. The column direction
corresponds to the TX antennas, namely the spatial domain; the row direction
corresponds to the different pilot-tone sets, namely the frequency domain. Hence our
design makes explicit the connection between conventional pilot-tone design and
the space-frequency code design [32, 33] aiming at performance enhancement.
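
The two-antenna example can be checked end to end with a short simulation. This is only a sketch under stated assumptions: a single receive antenna ($N_r = 1$) is used to keep the matrices small (so each block is 4 × 4 rather than 8 × 8), N = 32, $E_P = 8$, the pilot-set indices are taken as $p_1 = 0$ and $p_2 = 1$, and the channel taps are illustrative. The received pilots are generated from the physical channel frequency response, not from $S^{(P)}(m)$ itself, so the check is independent of how the matrix is assembled.

```python
import cmath
import math

N, L, Nt = 32, 4, 2           # Nr = 1 is assumed here to keep the matrices small
M = N // L                    # 8 disjoint pilot-tone sets are available
EP = 8.0                      # total pilot power per transmit antenna (illustrative)
p = [0, 1]                    # indices p_1, p_2 of the two pilot-tone sets (assumed)

amp = math.sqrt(EP / (Nt * L))
x, y = complex(amp), complex(amp)
alpha = [[x, y], [-y.conjugate(), x.conjugate()]]    # Alamouti structure

def s_block(i, j):
    """Block (i, j) of S^(P): alpha[i][j] * F_L * W^(p_i), entry (r, l)."""
    return [[alpha[i][j] * cmath.exp(-2j * cmath.pi * (r * l / L + p[i] * l / N))
             for l in range(L)] for r in range(L)]

# Assemble the (Nt*L) x (Nt*L) matrix S^(P) row by row.
S = []
for i in range(Nt):
    blocks = [s_block(i, j) for j in range(Nt)]
    for r in range(L):
        S.append([v for b in blocks for v in b[r]])

def cfr(taps, k):
    """Channel frequency response at subcarrier k."""
    return sum(taps[l] * cmath.exp(-2j * cmath.pi * k * l / N) for l in range(L))

h = [[0.7, 0.2 - 0.1j, -0.3j, 0.1],          # taps from Tx antenna 1 (illustrative)
     [-0.5j, 0.4, 0.1 + 0.2j, -0.05]]        # taps from Tx antenna 2 (illustrative)

# Noise-free received pilots: on subcarrier p_i + r*M, antenna j sends alpha[i][j].
Y = [sum(alpha[i][j] * cfr(h[j], p[i] + r * M) for j in range(Nt))
     for i in range(Nt) for r in range(L)]

n = Nt * L
# S^H S should equal EP * I, so the LS estimate is (1/EP) * S^H * Y.
gram_err = max(abs(sum(S[r][c1].conjugate() * S[r][c2] for r in range(n))
                   - (EP if c1 == c2 else 0.0))
               for c1 in range(n) for c2 in range(n))
h_hat = [sum(S[r][c].conjugate() * Y[r] for r in range(n)) / EP for c in range(n)]
est_err = max(abs(u - v) for u, v in zip(h_hat, h[0] + h[1]))
print(gram_err < 1e-9, est_err < 1e-9)
```

With $N_r = 2$ each block would be replaced by its Kronecker product with $I_2$, which changes the sizes but not the structure of the check.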

[Figure 2.3: Pilot placement with Nt = Nr = 2. Tones (axis ticks 1, 8, 16, 24, 32) of the m-th, (m+1)-th and (m+2)-th OFDM symbols at Tx_1 and Tx_2; legend: 1st pilot set @ Tx_1, 2nd pilot set @ Tx_1, 1st pilot set @ Tx_2, 2nd pilot set @ Tx_2, data.]

   When we have more than two transmit antennas, i.e., $N_t \ge 3$, it is also straightforward to design an $N_t N_r L \times N_t N_r L$ unitary matrix $S_{SFC}^{(P)}(m)$. Based on the assumption in Theorem 2.1 that all the pilot-tones within one set are the same, the design of $S_{SFC}^{(P)}(m)$ can be simplified to the design of an $N_t \times N_t$ unitary matrix $S$ and the

complexity is reduced from $N_t N_r L$ to $N_t$:

$$S = \sqrt{\frac{L}{E_P}} \begin{bmatrix} \alpha_{p_1,1} & \cdots & \alpha_{p_1,N_t} \\ \vdots & \ddots & \vdots \\ \alpha_{p_{N_t},1} & \cdots & \alpha_{p_{N_t},N_t} \end{bmatrix}_{N_t \times N_t}.$$

Choose $\alpha_{p_i,j} = \sqrt{\frac{E_P}{L N_t}}\, e^{-\tilde{j} \frac{2\pi}{N_t} i j}$, $\forall i, j \in \{1, 2, \ldots, N_t\}$, with $\tilde{j} = \sqrt{-1}$. Then $S$ can be shown to be a unitary matrix; it closely resembles an $N_t$-point FFT matrix. After obtaining the $\{\alpha_{p_i,j}\}_{i,j=1}^{N_t}$, $S_{SFC}^{(P)}(m)$ can be easily constructed from Theorem 2.1 by mapping each scalar to a diagonal matrix whose diagonal elements are all equal to that scalar.
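
The unitarity claim can be checked numerically. The following is a minimal sketch, assuming the choice of $\alpha_{p_i,j}$ given above; the function names (`pilot_alphas`, `build_s`, `is_unitary`) and the example values of $E_P$ and $L$ are illustrative and not part of the original design.

```python
import cmath
import math


def pilot_alphas(n_t, e_p, l):
    """alpha[i][j] = sqrt(E_P / (L * N_t)) * exp(-j * 2*pi * i * j / N_t), i, j = 1..N_t."""
    scale = math.sqrt(e_p / (l * n_t))
    return [[scale * cmath.exp(-2j * math.pi * i * j / n_t)
             for j in range(1, n_t + 1)]
            for i in range(1, n_t + 1)]


def build_s(n_t, e_p, l):
    """S = sqrt(L / E_P) * [alpha_{p_i, j}], an N_t x N_t matrix."""
    gain = math.sqrt(l / e_p)
    return [[gain * a for a in row] for row in pilot_alphas(n_t, e_p, l)]


def gram(s):
    """Return S^H S as a list of rows."""
    n = len(s)
    return [[sum(s[k][r].conjugate() * s[k][c] for k in range(n))
             for c in range(n)]
            for r in range(n)]


def is_unitary(s, tol=1e-9):
    """True if S^H S equals the identity within the tolerance."""
    g = gram(s)
    n = len(s)
    return all(abs(g[r][c] - (1.0 if r == c else 0.0)) < tol
               for r in range(n) for c in range(n))


if __name__ == "__main__":
    # Example values only: E_P = 1 and L = 16 are assumptions for the demo.
    for n_t in (2, 3, 4):
        print(n_t, is_unitary(build_s(n_t, e_p=1.0, l=16)))
```

The scaling factors $\sqrt{L/E_P}$ and $\sqrt{E_P/(L N_t)}$ cancel to $1/\sqrt{N_t}$, so the check does not depend on the particular $E_P$ and $L$ used.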

2.3.3      Performance Analysis

With the total power $E_P$ fixed, the pilot-tones designed in the previous section can be shown to be optimal in the sense that they achieve the minimum mean squared error (MSE) of the channel estimation, as shown in the following. From (2.24), the MSE of the channel estimate $\hat{h}_{C,LS}(m)$ is given by
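$$
\begin{aligned}
\mathrm{MSE}_m &= \frac{1}{N_t N_r L}\, E\{\|\hat{h}_{C,LS}(m) - h_{C,LS}(m)\|^2\} \\
&= \frac{1}{E_P^2 N_t N_r L}\, E\{\|(S^{(P)}(m))^H V^{(P)}(m)\|^2\} \\
&= \frac{1}{E_P^2 N_t N_r L}\, \mathrm{tr}\{(S^{(P)}(m))^H E[V^{(P)}(m) V^{(P)}(m)^H] S^{(P)}(m)\} \\
&= \frac{\sigma_n^2}{E_P^2 N_t N_r L}\, \mathrm{tr}\{(S^{(P)}(m))^H I_{N_t N_r L} S^{(P)}(m)\}.
\end{aligned}
\tag{2.26}
$$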
Since $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_P I_{N_t N_r L}$, the MSE achieves its minimum, $\mathrm{MSE}_{\min} = \sigma_n^2 / E_P$. Thus the unitary matrix design not only reduces the complexity of the channel estimator, but also ensures that it has the least estimation error when the pilot-tones have fixed transmit power.
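
The step from the last line of (2.26) to the minimum value can be written out explicitly. The following short derivation uses only the unitarity relation $(S^{(P)}(m))^H S^{(P)}(m) = E_P I_{N_t N_r L}$ stated in the text:

```latex
\mathrm{tr}\{(S^{(P)}(m))^H I_{N_t N_r L}\, S^{(P)}(m)\}
  = \mathrm{tr}\{E_P I_{N_t N_r L}\}
  = E_P N_t N_r L
\quad\Longrightarrow\quad
\mathrm{MSE}_m = \frac{\sigma_n^2 \, E_P N_t N_r L}{E_P^2 N_t N_r L}
  = \frac{\sigma_n^2}{E_P}.
```

Under this design the MSE therefore depends only on the ratio of noise variance to pilot power; for instance, doubling $E_P$ halves the MSE.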


2.4     An Illustrative Example and Concluding Remarks
2.4.1    Comparison With Known Result

In this section, we demonstrate the performance of the proposed channel estimation based on our optimal pilot-tone design through computer simulations. To show the performance improvement clearly, another channel estimation technique [50] is also simulated. We consider a typical MIMO-OFDM system with 2 transmit antennas and 2 receive antennas. The OFDM block size is chosen as $N = 128$ and a CP of length 16 is prepended to each OFDM symbol. The four sub-channel paths, denoted by $\{h_{11}, h_{12}, h_{21}, h_{22}\}$, are assumed to be independent of each other, and each has a CIR of length $L = 16$. The CIR coefficients in each sub-channel are simulated by Jakes’ model [51]. Our simulation is conducted in two ways:

   • Method I: Place two sets of L = 16 pilot-tones into each OFDM block; the pilot-tones are equally spaced and equally powered, as shown in Figure 2.3.

   • Method II: Set the first two OFDM blocks of each data frame, which consists of ten OFDM blocks, as the preamble. Put L = 16 equally spaced and equally powered pilot-tones into each of the two preamble blocks and set all the other tones to zero (see [50] for details).
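
The pilot overhead implied by the two methods can be worked out from the stated parameters. The sketch below is illustrative only: it assumes that in Method I both transmit antennas occupy the same 2L tone positions in every block, that the two preamble blocks of Method II carry no data, and that the two pilot sets start at tone offsets 0 and 1 (hypothetical offsets; the actual offsets are not given in the text).

```python
def pilot_spacing(n_tones, pilots_per_set):
    """Tone spacing that spreads one pilot set evenly over the OFDM block."""
    if n_tones % pilots_per_set:
        raise ValueError("n_tones must be a multiple of pilots_per_set")
    return n_tones // pilots_per_set


def pilot_indices(n_tones, pilots_per_set, offset):
    """Equally spaced tone indices of one pilot set (offset is hypothetical)."""
    step = pilot_spacing(n_tones, pilots_per_set)
    if not 0 <= offset < step:
        raise ValueError("offset must lie in [0, spacing)")
    return [offset + step * k for k in range(pilots_per_set)]


def overhead_method_1(n_tones, pilots_per_set, n_sets):
    """Fraction of tones used for pilots when every block carries pilots."""
    return n_sets * pilots_per_set / n_tones


def overhead_method_2(preamble_blocks, blocks_per_frame):
    """Fraction of blocks spent on the preamble in a frame-based scheme."""
    return preamble_blocks / blocks_per_frame


if __name__ == "__main__":
    n, l = 128, 16                                  # values from the simulation setup
    print("spacing:", pilot_spacing(n, l))          # tones between pilots of one set
    print("set 1:", pilot_indices(n, l, 0)[:4], "...")
    print("set 2:", pilot_indices(n, l, 1)[:4], "...")
    print("Method I overhead:", overhead_method_1(n, l, 2))
    print("Method II overhead:", overhead_method_2(2, 10))
```

With N = 128 and L = 16 this gives a pilot spacing of 8 tones, an overhead of 32/128 for Method I, and 2/10 for Method II under the stated assumptions.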


   To represent mobile environments, different Doppler shifts are simulated: $f_d$ = 5, 20, 40, 100 and 200 Hz. The performance of the system is measured in terms of the MSE of the two channel estimation schemes described above and the symbol error rate (SER) versus SNR. For a reliable simulation, a total of 10,000 frames are transmitted for each test, and the average values of MSE and SER are taken as the measurements. In Figure 2.4, the Doppler shift is 5 Hz and the two curves marked “known channel” serve as the performance bound, since the channel state information is known exactly. This is unrealistic and serves only for comparison. The two curves corresponding to RX antenna 1 and RX antenna 2 nearly coincide, which matches our expectation since the two receive antennas are statistically identical. The two curves generated by channel estimation based on our optimal pilot-tone design are close to the performance bound, with only a narrow gap due to the unavoidable channel estimation error. In contrast, the two curves generated by channel estimation based on the technique in [50] are far from the performance bound, even at large SNR. This supports our point that a method based on a preamble at the beginning of a frame is not applicable to a fast-varying wireless channel. From Figure 2.4 through Figure 2.6, the performance of the system based on the proposed pilot-tone design does not change much, since it keeps tracking the channel using the pilot-tones in each OFDM block. The difference between the two estimation schemes is illustrated in the MSE plots. In Figure 2.7, for a fixed SNR value, the curves for different Doppler shifts change little, which implies that the proposed method is able to track a fast time-varying channel. In Figure 2.8, by contrast, the curves for a fixed SNR value do change with the Doppler shift: the estimation error at $f_d$ = 200 Hz is much larger than that at $f_d$ = 5 Hz. This indicates that the method based on preambles works poorly when the Doppler spread is small and does not work when the channel is changing quickly.

[Figure 2.4: Symbol error rate versus SNR with Doppler shift = 5 Hz. Panel title: fd = 5 Hz; x-axis: SNR (in dB), 5 to 30; y-axis: Symbol Error Rate, 10^-3 to 10^0; curves: Rx1 and Rx2 for known channel, pilot-tone-based and preamble-based estimation.]

[Figure 2.5: Symbol error rate versus SNR with Doppler shift = 40 Hz. Panel title: fd = 40 Hz; x-axis: SNR (in dB), 5 to 30; y-axis: Symbol Error Rate, 10^-3 to 10^0; curves: Rx1 and Rx2 for known channel, pilot-tone-based and preamble-based estimation.]

[Figure 2.6: Symbol error rate versus SNR with Doppler shift = 200 Hz. Panel title: fd = 200 Hz; x-axis: SNR (in dB), 5 to 30; y-axis: Symbol Error Rate, 10^-3 to 10^0; curves: Rx1 and Rx2 for known channel, pilot-tone-based and preamble-based estimation.]

[Figure 2.7: Normalized MSE of channel estimation based on optimal pilot-tone design. Plot title: Normalized MSE of Pilot-tone Based Channel Estimator; axes: Doppler Shift (in Hz), 0 to 200; SNR (in dB), 5 to 30; Normalized MSE, 0 to 4 × 10^-3.]

[Figure 2.8: Normalized MSE of channel estimation based on preamble design. Plot title: Normalized MSE of Preamble Based Channel Estimator; axes: Doppler Shift (in Hz), 0 to 200; SNR (in dB), 5 to 30; Normalized MSE, 1 × 10^-3 to 8 × 10^-3.]


2.4.2    Chapter Summary

We presented a new optimal pilot-tone design for MIMO-OFDM channel estimation. $N_t$ sets of $L$ pilot-tones coded in $S_{SFC}^{(P)}(m)$ are transmitted at each antenna simultaneously, and the channel can be estimated optimally. The main advantages are its ability to handle fast time-varying systems, since the channel can be estimated at each OFDM block, and its simplicity, since the orthogonal design allows the MIMO system to be processed easily in parallel.

   For an $N_t \times N_r$ MIMO system, the complexity of any signal processing algorithm at the physical layer is usually increased by a factor of $N_t N_r$. To name a few, channel estimation, carrier frequency offset estimation and correction, and IQ imbalance compensation all become very challenging in the MIMO case. In this chapter, we provide solutions to the following “how” questions. How many pilot tones are needed? How are they placed in one OFDM block? Most importantly, how fast can channel estimation be accomplished? We propose a pilot tone design for MIMO-OFDM channel estimation in which $N_t$ disjoint sets of pilot tones are placed on one OFDM block at each transmit antenna. Each pilot tone set has $L$ pilot tones, which are equally spaced and equally powered. The pilot tones from different transmit antennas form a unitary matrix, so a simple least squares (LS) estimate of the MIMO channel is easily implemented by taking advantage of the unitarity of the pilot tone matrix. There is no need to compute the inverse of a large matrix, which is usually required by the LS algorithm.


   In a highly mobile environment, such as a mobile user in a vehicle traveling at more than 100 km/h, the wireless channel may change within one or a small number of symbols. For example, as in [30], in the IEEE 802.16-2004 standard with $N = 256$, $G = 44$ ($N$: FFT size; $G$: guard interval) and 3.5 MHz full bandwidth, the symbol duration is about 73 microseconds. For a user in a vehicle traveling at 100 km/h, the channel coherence time is about 1100 microseconds. That means the wireless channel varies after around 15 symbols. In a real-time communication scenario, an information packet could contain hundreds of data symbols or even more. The scheme proposed in this chapter distributes the pilot symbols that would otherwise sit in the preamble across every OFDM block for channel estimation. Since the pilot tones are placed on each OFDM block, the channel state information can be estimated accurately and quickly, no matter how fast the channel condition is varying. It is fair to point out that this may incur a higher overhead rate than the methods in the literature. For a slowly time-varying channel, however, our pilot design can also be applied by placing pilot tones only on every few OFDM blocks, which reduces the throughput loss.
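
The arithmetic behind the "around 15 symbols" figure, and the effect of placing pilots only on every few blocks, can be sketched as follows. The symbol duration and coherence time are the values quoted above; the per-block overhead of 0.25 and the pilot periods are hypothetical example values, not results from the text.

```python
def symbols_per_coherence_time(coherence_time_us, symbol_duration_us):
    """Number of OFDM symbols that fit into one channel coherence time."""
    if symbol_duration_us <= 0:
        raise ValueError("symbol duration must be positive")
    return coherence_time_us / symbol_duration_us


def average_pilot_overhead(per_block_overhead, pilot_period_blocks):
    """Average overhead when one block in every `pilot_period_blocks` carries pilots."""
    if pilot_period_blocks < 1:
        raise ValueError("pilot period must be at least one block")
    return per_block_overhead / pilot_period_blocks


if __name__ == "__main__":
    # 1100 us coherence time and 73 us symbol duration, as quoted in the text.
    ratio = symbols_per_coherence_time(1100.0, 73.0)
    print(f"{ratio:.1f} symbols per coherence time")

    # Hypothetical per-block overhead of 0.25, pilots every 1, 5 or 15 blocks.
    for period in (1, 5, 15):
        print(period, average_pilot_overhead(0.25, period))
```

For a slowly varying channel the pilot period could be chosen well below the number of symbols per coherence time, which trades tracking speed against throughput.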

   The orthogonal pilot tone matrix is indeed a space-frequency code. The row direction of the matrix stands for the different pilot tone sets in the frequency domain, and the column direction represents the individual transmit antennas in the spatial domain. It can be readily extended to an $N_t \times N_r$ MIMO system by constructing an $N_t \times N_t$ orthogonal matrix. With this explicit relation to space-frequency codes, the design of the pilot tone matrix for MIMO-OFDM channel estimation can be conducted from a broader perspective.

Chapter 3

Wireless Location for
OFDM-based Systems

3.1     Introduction

Wireless networks are primarily designed and deployed for voice and data communications. The widespread availability of wireless nodes, however, makes it feasible to utilize these networks for wireless location purposes as an alternative to the GPS (global positioning system) location service. It is expected that location-based applications will play an important role in future wireless markets. Commercially available location technology is implemented on cellular networks and WLANs, for example E911 (Enhanced 911) and indoor positioning with WiFi (wireless fidelity). In this dissertation, we investigate wireless location technology aimed at a different network, namely the WiMax system.

3.1.1    Overview of WiMax

WiMax is an acronym for Worldwide Interoperability for Microwave Access. It is not only a technical term for a new wireless broadband technology, but also refers to a series of new products working on this network. Real WiMax-based wireless gear has not come to the market yet. People are, however, already very familiar with WiFi-based products such as notebook wireless cards and wireless routers from Linksys, D-Link and Belkin, which they use to check email or surf the Internet wirelessly on campus or at airports, hotels, bookstores and coffee shops. WiFi stands for Wireless Fidelity, and it is the first available technology for WLAN and wireless home networking. However, it is constrained by its limited coverage of about 50-100 meters and its relatively low data rate. In contrast to WiFi, WiMax is a new broadband wireless access technology that provides very high data throughput over long distances in point-to-multipoint, line-of-sight (LOS) or non-line-of-sight (NLOS) environments. In terms of coverage, WiMax can provide seamless wireless services up to 20 or 30 miles away from the base station. Its IEEE designation is 802.16-2004; it is this IEEE standard that defines the specifics of the WiMax air interface.

WiMax Standards


Microwave access is not a new technology for broadband systems. Proprietary point-to-multipoint broadband access products from companies like Alcatel and Siemens have existed for decades. They did not become popular because they are highly proprietary. Today’s WiMax attempts to standardize the technology in order to reduce the cost and to increase the range of applications. The current standard for WiMax is IEEE Std 802.16-2004, which can be downloaded from the IEEE website. With its approval in June 2004, it renders the previous standard IEEE Std 802.16-2001 and its two amendments 802.16a and 802.16c obsolete. At present, IEEE 802.16-2004 addresses only fixed broadband systems. IEEE 802.16 Task Group e is working on an amendment to add a mobility component to the standard. The new standard may be named IEEE 802.16e.

WiMax Applications


We have seen a lot of marketing efforts on WiMax applications at conferences, exhibitions and in other media. People are wondering whether it is a must-have technology in the near future. Consider what kinds of broadband services are available today: we usually resort to a landline connection with T1, DSL or cable modems. WiMax, or 802.16, is proposed to address the first-mile/last-mile wireless connection in a metropolitan area network. It can change the last-mile connection as much as 802.11 changed the last-hundred-feet connection. This applies not only to rural areas, but also to any place where the cost of laying or upgrading landlines to broadband capability is prohibitively expensive. WiMax’s primary use will most likely come in the form of metropolitan area networks. In terms of services and applications, it differs from the traditional WiFi standards, which include 802.11a, 802.11b and 802.11g. WiFi technology, with a maximum range of 800 feet outdoors, is mainly intended for local area networks that serve residential homes, public hot spots such as airports, hotels and coffee shops, and small business buildings. With its much longer range (in theory, WiMax can reach a maximum of 31 miles), WiMax can provide broadband services to thousands of homes in a metropolitan area. Imagine that a broadband service provider can serve thousands of residential homes and small and large business buildings without the cost of physically laying lines and dispatching technicians to install and maintain them. The savings will push providers to choose WiMax and to reduce the fees charged to their customers. Another driving force for WiMax is its speed. It can transfer data at a rate of up to 70 Mbps, which is equivalent to roughly 45 T1 lines (1.544 Mbps each). This combination of long range and high speed is why the potential applications of WiMax are so broad. Although all of this sounds promising, real WiMax products are not commercially available yet; only some pre-WiMax products based on the standard are emerging, but full products are expected soon. For example, Intel’s PRO/Wireless 5116 is a highly integrated IEEE Std 802.16-2004 compliant system-on-chip for both licensed and license-exempt radio frequencies.

3.1.2    Overview of Wireless Location Systems

Wireless location refers to the determination of the geographic coordinates, or more generally also the velocity and heading, of a mobile user/device in a cellular, WLAN or GPS environment. Wireless location technologies usually fall into two main categories: handset-based and network-based. In handset-based location systems [55], the mobile station, equipped with extra electronics, determines its location from signals received from the base stations or from GPS satellites. In GPS-based estimation, the MS (mobile station) receives and measures the signal parameters from at least four satellites of the existing constellation of 24 GPS satellites. The parameter that the MS measures is the time taken by each satellite signal to reach the MS. GPS systems have a relatively high degree of accuracy and also provide global location information. However, embedding a GPS receiver into mobile devices leads to increased cost, size and battery consumption. It also requires the replacement of millions of mobile handsets that are already on the market with new GPS-equipped handsets. In addition, the accuracy of GPS measurements degrades in urban and indoor environments. For these reasons, some wireless carriers may be unwilling to embrace GPS fully as the only location technology.

   On the other hand, network-based location technology relies on the existing network infrastructure to determine the position of a mobile user by measuring its signal parameters as received at the network BSs (base stations). This may require some hardware upgrades or installation at the BSs, but the cost can be shared by a huge number of mobile subscribers and it does not affect how users use their mobile devices. In this technology, the BSs measure the signal transmitted from an MS and relay the measurements to a central site for further processing and data fusion to provide an estimate of the MS location. Network-based technologies have the significant advantage that the MS is not involved in the location-finding process; thus the technology does not require modifications to existing handsets. However, unlike GPS location systems, many aspects of network-based location are not yet fully studied. Network-based wireless location technology is illustrated in Figure 3.1.

[Figure 3.1: Network-based wireless location technology (outdoor environments). Three base stations BS1 (x1, y1), BS2 (x2, y2) and BS3 (x3, y3), each with a T(D)OA/AOA estimator, lie at ranges r1, r2 and r3 from the MS and report to a data fusion center.]

   Network-based wireless location technology is gaining recognition with the increasing number of wireless subscribers and the demand for location-oriented services such as E911. It is estimated [56, 57] that location-based services will generate annual revenues on the order of $15B worldwide. In the U.S. alone, about 170 million mobile subscribers are expected to be covered by the FCC-mandated location accuracy for emergency services. The following is a partial list of applications that will be enhanced by using wireless location information [58].


   • E911. Nowadays a high percentage of E911 calls are generated from mobile phones; the percentage is estimated [59, 60] to be one third of all 911 calls (170,000 per day). These wireless 911 calls do not receive the same quality of emergency assistance as fixed-network 911 calls, because the position of the wireless 911 caller is unknown. To fix this problem, the FCC issued an order on July 12, 1996 [59], requiring all wireless service providers to report accurate MS location to the E911 operator at the PSAP (public safety answering point). The FCC order mandated that within five years from its effective date, October 1, 1996, wireless service providers must convey to the PSAP the location of the MS within 100 meters of its actual position for at least 67 percent of all wireless E911 calls. This FCC order has motivated considerable research effort towards developing accurate wireless location algorithms for cellular networks and has led to significant enhancement of wireless location technology.


• Mobile advertising. Location-specific advertising and marketing will benefit

  once the location information is available. For example, stores would be able to

  track customer locations and to attract them in by flashing customized coupons

  on their wireless devices [61]. In addition, a cellular phone or a PDA (personal

  digital assistant) could act as a smart handy mobile yellow pages on demand.


   • Asset tracking (indoor/outdoor). Wireless location technology can also assist in advanced public safety applications such as locating and retrieving lost children, patients, or even pets. In addition, it can be used to track personnel and assets in a hospital or at a manufacturing site to provide more efficient management of assets and personnel. One could also consider applications such as smart and interactive tour guides, smart shopping guides that lead shoppers based on their location in a store, and smart traffic control in parking structures that guides cars to free parking slots. Department stores, enterprises, hospitals, manufacturing sites, malls, museums, and campuses are some of the potential end-users to benefit from the technology.


• Fleet management. Many fleet operators, such as police force, emergency

  vehicles, and other services including shuttle and taxi companies, can make

  use of the wireless location technology to track and operate their vehicles in

  an efficient way in order to minimize the response time. In addition, a large

  number of drivers on roads and highways carry cellular phones while driving.

  The wireless location technology can help track these phones, thus transforming

  them into sources of real-time traffic information that can be used to enhance

  transportation safety.


   • Location-based wireless security. New location-based wireless security schemes can be developed to add a level of security to wireless networks against being intercepted or hacked into. By using location information, only people in certain specific areas could access certain files or databases through a WLAN.


• Location sensitive billing. Using the location information of wireless users,

  wireless service providers can offer variable-rate call plans or services that are

  based on the caller location.


3.1.3    Review of Data Fusion Methods

We assume for simplicity that the location is specified by $(x, y)$. As shown in Figure 3.1, the data fusion center determines the mobile user location by exploiting all the signal parameters estimated at the BSs. The most common signal parameters are the time, angle and amplitude of arrival of the MS signal, and different data fusion algorithms have been proposed accordingly. The material in this section is mainly based on the survey paper [53].


   • Time. By combining the estimates of the TOA (time of arrival) of the MS signal received at the BSs, the MS location can be determined in a wireless network with three or more BSs, as illustrated in Figure 3.2.

[Figure 3.2: TOA/TDOA data fusion using three BSs. BS1 at (0, 0), BS2 at (x2, y2) and BS3 at (x3, y3) lie at distances r1, r2 and r3 from the MS at (xT, yT).]

Without loss of generality, the coordinates of BS1 are assumed to be $(0, 0)$, i.e., $x_1 = y_1 = 0$. The locations of the other BSs are denoted by $(x_k, y_k)$, $k = 2, 3$. Since the radio signal travels at the speed of light ($c = 3 \times 10^8$ m/s), the distance between the MS and BS$_k$ is given by

$$R_{k,T} = (t_k - t_o)c, \tag{3.1}$$


where $t_o$ is the time instant at which the MS starts transmitting and $t_k$ is the time of arrival of the MS signal at BS$_k$. The distances $\{R_{k,T}\}_{k=1}^{3}$ can be used to estimate the MS location $(x_T, y_T)$ by solving the following set of equations:

$$
\begin{aligned}
R_{1,T}^2 &= x_T^2 + y_T^2 \\
R_{2,T}^2 &= (x_2 - x_T)^2 + (y_2 - y_T)^2 \\
R_{3,T}^2 &= (x_3 - x_T)^2 + (y_3 - y_T)^2.
\end{aligned}
\tag{3.2}
$$
To solve the above overdetermined nonlinear system of equations, we can reformulate (3.2) into an LS-type representation by subtracting the first equation from the second and the third equations, respectively. Hence the following equations are obtained:

$$
\begin{aligned}
R_{2,T}^2 - R_{1,T}^2 &= x_2^2 + y_2^2 - 2(x_2 x_T + y_2 y_T) \\
R_{3,T}^2 - R_{1,T}^2 &= x_3^2 + y_3^2 - 2(x_3 x_T + y_3 y_T).
\end{aligned}
\tag{3.3}
$$


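In matrix form, this can be rewritten as

$$
\begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\ R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \end{bmatrix},
\tag{3.4}
$$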


where Rk =    x2 + yk is the distance of the base station BSk to the origin point
               k
                    2


                                2
in the coordinate, and clearly R1 = 0. If we have more than three BSs, a

compact form can be obtained in a similar way as


                                       b = Aθ,                               (3.5)


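where

\[
\mathbf{b} = \frac{1}{2}
\begin{bmatrix}
R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\
R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \\
R_4^2 - (R_{4,T}^2 - R_{1,T}^2) \\
\vdots
\end{bmatrix};
\quad
\mathbf{A} =
\begin{bmatrix}
x_2 & y_2 \\
x_3 & y_3 \\
x_4 & y_4 \\
\vdots & \vdots
\end{bmatrix};
\quad
\boldsymbol{\theta} = \begin{bmatrix} x_T \\ y_T \end{bmatrix}.
\]
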

A standard LS estimate of θ is given by

\[
\hat{\boldsymbol{\theta}} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}. \tag{3.6}
\]

Note that $R_{1,T}^2$ is a function of xT and yT as defined in (3.2). Hence (3.6) only
provides an intermediate solution and the estimates $\hat{x}_T$ and $\hat{y}_T$ can be obtained
by solving the resultant quadratic equation. Clearly, the TOA data fusion
method requires perfect timing between the MS and the BSs, since an offset
of a few microseconds between the MS clock and the BS clock translates into
hundreds of meters of error in the location estimate (1 µs corresponds to 300 m).
However, the current wireless network standards only mandate tight timing
synchronization among BSs [62], so the accuracy of the TOA method depends
heavily on the timing between the BS and the MS. An alternative is to use the
TDOA (time difference of arrival), which helps avoid the MS clock synchronization
problem. Define the
TDOA associated with the base station BSk as ∆tk,1 = tk − t1 , i.e., the difference
between the TOAs of the MS signal at BSk and BS1 . Then the difference
between Rk,T and R1,T can be related to ∆tk,1 as

                        ∆Rk,1 = Rk,T − R1,T
                                = (tk − to )c − (t1 − to )c                 (3.7)
                                = ∆tk,1 c.


  Clearly, the possible timing error on the MS clock to is canceled
  out. This insensitivity to to gives the TDOA method an advantage over TOA. By
  substituting $R_{k,T}^2 = (R_{1,T} + \Delta R_{k,1})^2$ into (3.2) and rearranging terms, we
  obtain the following LS expression for any number of base stations:

  \[
  R_{1,T}\,\mathbf{c} + \mathbf{d} = \mathbf{A}\boldsymbol{\theta}, \tag{3.8}
  \]

  where (the vector $\mathbf{c}$ is not to be confused with the speed of light c)

  \[
  \mathbf{c} =
  \begin{bmatrix}
  -\Delta R_{2,1} \\ -\Delta R_{3,1} \\ -\Delta R_{4,1} \\ \vdots
  \end{bmatrix};
  \quad
  \mathbf{d} = \frac{1}{2}
  \begin{bmatrix}
  R_2^2 - \Delta R_{2,1}^2 \\
  R_3^2 - \Delta R_{3,1}^2 \\
  R_4^2 - \Delta R_{4,1}^2 \\
  \vdots
  \end{bmatrix}.
  \]

  Notice that $R_{1,T} = \sqrt{x_T^2 + y_T^2}$ is not known and hence only an intermediate solution
  can be obtained from the above LS formulation

  \[
  \hat{\boldsymbol{\theta}} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T (R_{1,T}\,\mathbf{c} + \mathbf{d}). \tag{3.9}
  \]

  Since

  \[
  \|\hat{\boldsymbol{\theta}}\|^2 = R_{1,T}^2, \tag{3.10}
  \]

  we can substitute (3.9) into (3.10) and solve for R1,T from the resulting quadratic
  equation. A final solution for $\hat{\boldsymbol{\theta}}$ can subsequently be obtained by substituting the
  positive root of the quadratic equation into (3.9).
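
  The substitution of (3.9) into (3.10) can be sketched in Python as follows. This is a minimal illustration, assuming NumPy, BS1 at the origin, and noise-free or low-noise range differences. The function name `tdoa_ls_locate`, the variable names, and the rule used to choose between two positive roots are illustrative assumptions and are not taken from the original text.

```python
import numpy as np

def tdoa_ls_locate(bs_xy, delta_r):
    """Estimate (xT, yT) from range differences, following (3.8)-(3.10).

    bs_xy   : (Nb, 2) BS coordinates, with BS1 at the origin in row 0.
    delta_r : (Nb-1,) range differences c * dt_{k,1}, for k = 2..Nb.
    """
    bs_xy = np.asarray(bs_xy, dtype=float)
    delta_r = np.asarray(delta_r, dtype=float)
    A = bs_xy[1:]                                       # rows [x_k, y_k]
    c_vec = -delta_r                                    # vector c in (3.8)
    d_vec = 0.5 * (np.sum(A**2, axis=1) - delta_r**2)   # vector d in (3.8)

    # (3.9): theta_hat = R1 * p + q, which is linear in the unknown R1 = R_{1,T}.
    A_pinv = np.linalg.pinv(A)
    p, q = A_pinv @ c_vec, A_pinv @ d_vec

    # (3.10): ||R1 * p + q||^2 = R1^2 is a quadratic equation in R1.
    roots = np.roots([p @ p - 1.0, 2.0 * (p @ q), q @ q])
    candidates = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]
    if not candidates:
        raise ValueError("no positive real root; data may be too noisy")

    def tdoa_residual(r1):
        theta = r1 * p + q
        predicted = np.linalg.norm(A - theta, axis=1) - np.linalg.norm(theta)
        return np.linalg.norm(predicted - delta_r)

    # Assumption: if both roots are positive, keep the one that best
    # reproduces the measured range differences.
    r1 = min(candidates, key=tdoa_residual)
    return r1 * p + q
```

  With four BSs at the corners of a 2 km square and exact range differences, the function recovers the true position; with noisy data the quadratic may have no positive real root, in which case the sketch raises an error.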


• Angle. The AOA (angle of arrival) can be obtained at a BS by using an an-

  tenna array. The direction of arrival of the MS signal can be calculated by

  measuring the phase difference between the antenna array elements or by mea-

  suring the power spectral density across the antenna array in what is known


as beamforming [64]. Intuitively, the MS location can be estimated by com-

bining the AOA estimates from two BSs as shown in Figure 3.3. Compared




                        Figure 3.3: AOA data fusion with two BSs
          [BS1 at (0, 0), BS2 at (x2 , y2 ), MS at (xT , yT ); ranges R2 , R1,T and R2,T]


to TOA/TDOA methods, the number of BSs needed for location is relatively

smaller and there is no need for timing synchronization between base stations

and MS clocks. However, one disadvantage is that an antenna array is required at the
BS, which is not available in 2G systems; it is planned for 3G cellular systems
such as UMTS and CDMA2000 [65, 66]. As indicated in Figure 3.3, we have

\[
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \begin{bmatrix} R_{1,T}\cos(\beta_1) \\ R_{1,T}\sin(\beta_1) \end{bmatrix};
\quad
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
= \begin{bmatrix} x_2 \\ y_2 \end{bmatrix}
+ \begin{bmatrix} R_{2,T}\cos(\beta_2) \\ R_{2,T}\sin(\beta_2) \end{bmatrix},
\tag{3.11}
\]

where

\[
R_{2,T} = \sqrt{R_{1,T}^2 + R_2^2 - 2 R_{1,T} R_2 \cos(\alpha_1 - \beta_1)} = f(\alpha_1, \beta_1, R_{1,T}, R_2).
\]


  Since α1 , β1 and R2 are known, we simply denote R2,T as a function of R1,T as
  R2,T = f2 (R1,T ). If there are more than two BSs, an LS formulation can be
  obtained by collecting the relations in (3.11) into a single equation as

  \[
  \mathbf{b} = \mathbf{A}\boldsymbol{\theta}, \tag{3.12}
  \]

  where

  \[
  \mathbf{b} =
  \begin{bmatrix}
  R_{1,T}\cos(\beta_1) \\
  R_{1,T}\sin(\beta_1) \\
  x_2 + f_2(R_{1,T})\cos(\beta_2) \\
  y_2 + f_2(R_{1,T})\sin(\beta_2) \\
  \vdots
  \end{bmatrix};
  \quad
  \mathbf{A} =
  \begin{bmatrix}
  1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots
  \end{bmatrix}.
  \]

  The LS solution for θ is then

  \[
  \hat{\boldsymbol{\theta}} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}. \tag{3.13}
  \]

  Since this intermediate solution involves the unknown R1,T , we have to utilize
  the relation in (3.10) to get the positive root of the quadratic equation and then
  substitute R1,T back into (3.13) for a final solution of $\hat{\boldsymbol{\theta}}$.
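
  For the two-BS case, equating the two expressions for (xT , yT ) in (3.11) gives a 2 × 2 linear system in R1,T and R2,T . The sketch below solves that system directly. It is a simplified illustration and not the LS formulation (3.12)-(3.13); it assumes NumPy, BS1 at the origin, angles measured counterclockwise from the positive x-axis, and noise-free angles. The function name `aoa_locate_two_bs` is hypothetical.

```python
import numpy as np

def aoa_locate_two_bs(bs2_xy, beta1, beta2):
    """Locate the MS from two AOAs (in radians), with BS1 at the origin.

    From (3.11): R1*[cos(b1), sin(b1)] - R2*[cos(b2), sin(b2)] = [x2, y2].
    Returns the position estimate and the two ranges (R1, R2).
    """
    rhs = np.asarray(bs2_xy, dtype=float)
    M = np.array([[np.cos(beta1), -np.cos(beta2)],
                  [np.sin(beta1), -np.sin(beta2)]])
    # M is singular when the two bearing lines are parallel (beta1 = beta2
    # modulo pi); np.linalg.solve raises LinAlgError in that case.
    r1, r2 = np.linalg.solve(M, rhs)
    position = np.array([r1 * np.cos(beta1), r1 * np.sin(beta1)])
    return position, r1, r2

# Usage with a made-up geometry (all numbers are illustrative):
target = np.array([1500.0, 1200.0])
bs2 = (1000.0, 500.0)
beta1 = np.arctan2(target[1], target[0])
beta2 = np.arctan2(target[1] - bs2[1], target[0] - bs2[0])
position, r1, r2 = aoa_locate_two_bs(bs2, beta1, beta2)
```

  A negative range in the result indicates that the measured bearings do not intersect in front of both antennas, which can happen with noisy angles.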


• Amplitude. Amplitude-based wireless location technology is mainly used in

  indoor environments where WLAN standards such as 802.11a and 802.11g have

  been widely adopted. The WLAN connectivity has also become a standard

  feature for laptop computers and PDAs. As such, there is an increasing interest

  in utilizing these networks for location purposes to help provide good coverage
  for indoor scenarios. In the 802.11b and 802.11g MAC layers, information about


the signal strength and the signal-to-noise ratio is provided. Hence, a software-

level location technique could be developed for WLAN networks based on the

amplitude of arrival of the MS signal at different access points [67, 68, 69].

Specifically, when an IEEE 802.11 network operates in the infrastructure mode,
there are several APs (access points) and many end users within the network.

RF-based systems that use the signal strength for location purposes can monitor

the received signal strength from different APs and use the obtained statistics

to build a conditional probability distribution network in order to estimate the

location of the mobile client. These schemes usually work in two stages. The

first stage is the offline training and data gathering phase and the second stage is

the location determination phase using the online signal strength measurements.

In the training phase, signal strength measurements are used to build an a priori

probability distribution of the received signal strength at the mobile user from

all APs. Assume there are Na APs in the system and the radio map is created

based on measurements from Nu user locations. It is illustrated in Figure 3.4.

The radio map model is described by [67]. Define p(Ai | xj , yj ) as the probability

density function of the received signal strength from the i-th AP at the j-th

measurement point (xj , yj ). After constructing a Bayesian network, the online

determination phase uses maximum likelihood estimation to locate the mobile

user. Thus assume that the mobile user measures the received signal strength

from all APs, say Ai , i = 1, 2, . . . , Na . Then by Bayes’ rule, the probability of



             Figure 3.4: Magnitude-based data fusion in WLAN networks
     [Access points AP1 to AP7 and ten measurement points, each labeled p(Ai | xj , yj ), j = 1, . . . , 10]

     having the mobile user at location (xj , yj ) given the received signal strengths
     from all APs, $\mathbf{A} = [A_1, \ldots, A_{N_a}]^T$, is given by

     \[
     \begin{aligned}
     p(x_j, y_j \mid \mathbf{A}) &= \frac{p(\mathbf{A} \mid x_j, y_j)\, p(x_j, y_j)}{p(\mathbf{A})} \\
     &= \frac{p(x_j, y_j) \prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)}{p(\mathbf{A})},
     \end{aligned}
     \tag{3.14}
     \]

     where $\prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)$, 1 ≤ j ≤ Nu , is the approximation for the conditional
     probability density function of the received signal strength when the location of
     the mobile is given. Thus the location of the mobile user can be estimated as

     \[
     (\hat{x}_T, \hat{y}_T) = \arg\max_{x_j, y_j} p(x_j, y_j \mid \mathbf{A}), \quad 1 \le j \le N_u. \tag{3.15}
     \]
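
     The online phase described by (3.14) and (3.15) can be sketched as follows. This is an illustration only: it assumes NumPy, a Gaussian model for each p(Ai | xj , yj ) with a per-location mean and one common standard deviation, and a uniform prior p(xj , yj ) unless one is supplied. None of these modeling choices, nor the name `locate_from_radio_map`, comes from the original text.

```python
import numpy as np

def locate_from_radio_map(points, mean_rss, sigma, observed, prior=None):
    """Pick the radio-map point maximizing the posterior in (3.14)-(3.15).

    points   : (Nu, 2) training locations (x_j, y_j).
    mean_rss : (Nu, Na) mean signal strength of each AP at each location
               (assumed Gaussian radio-map model).
    sigma    : common standard deviation of the signal strength (assumed).
    observed : (Na,) signal strengths A_i measured online.
    """
    points = np.asarray(points, dtype=float)
    mean_rss = np.asarray(mean_rss, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if prior is None:
        prior = np.full(len(points), 1.0 / len(points))  # uniform p(x_j, y_j)

    # log of prod_i p(A_i | x_j, y_j); constants common to all j are dropped
    # because they cancel when the posterior is normalized.
    log_lik = -0.5 * np.sum(((observed - mean_rss) / sigma) ** 2, axis=1)
    log_post = np.log(prior) + log_lik
    log_post -= np.max(log_post)          # guard against underflow
    posterior = np.exp(log_post)
    posterior /= posterior.sum()          # normalization plays the role of p(A)
    return points[int(np.argmax(posterior))], posterior
```

     Working with log-probabilities avoids numerical underflow when the number of access points Na is large.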



   We note that the location problem has been tackled by the LS approach as above.
See also [53] for more details. However, several problems exist. The first is that
the physical meaning of these LS solutions is unclear, because of the lack of
statistical information on the measurements of the TOAs, TDOAs, AOAs and
amplitudes, and because of the impact of transforming the nonlinear estimation for
wireless location into quasi-linear estimation. This problem will be investigated in this thesis
for location based on TDOAs and AOAs. The second is the nuisance variable
Rk,T , the distance from the k-th BS to the MS, which is unknown. Although
a root-solving method can be used, it works only if no noise is involved in the measurement
data; with noisy data, often no positive real root exists. We will convert it into a constrained LS
problem and provide a solution algorithm in this thesis. The final problem is location
using more than one type of measurement. Because of the timing difficulty and lack
of training, we will consider only measurements of TDOAs and AOAs for wireless
location.


3.2         Least-square Location based on TDOA/AOA
            Estimates
3.2.1       Mathematical Preparations

An estimation problem, simply speaking, is to guess what you do not know based on
what is given to you. In mathematical terms, it is to estimate the
unknown parameters from some observation data by using a criterion that
leads to an optimal estimator. The observation data are usually a function of the
unknown parameters, either linear or nonlinear. For simplicity, let’s
begin with a generic linear model as follows:

                                      Z = Hθ + V .                                   (3.16)

In this model, Z, of size N × 1, is called the measurement vector ; θ, of size n × 1, is

called the parameter vector ; H, of size N × n, is called the observation matrix and V ,

the same size as Z, is called the measurement noise vector. Because V is random, Z is
random too. Both H and θ can be either deterministic or random, depending on
the specific application. Because of their simplicity, linear models are widely used
in practice. Even in the case of nonlinear models, quasi-linear models that are close
to the nonlinear models are often pursued, as in this thesis.

   Here a question follows from the linear model above: “How can we obtain the best
estimate of θ if we only know Z?” This can be viewed as having performed N
independent experiments in order to estimate θ, which is composed of n unknown
elements {θ1 , θ2 , . . . , θn }, where n < N . Inevitably, the experimental data are corrupted
by noise, which is usually assumed to be additive Gaussian. To answer the
question, there are generally three types of criteria for seeking the best estimate of
θ in the field of statistical signal processing. They are weighted least-square estimation
(WLSE), minimum mean-square error estimation (MMSE) and maximum-likelihood
estimation (MLE).

   1. WLSE: It is the simplest method with the oldest history. The best estimate $\hat{\boldsymbol{\theta}}$
      can be obtained by minimizing the cost function

      \[
      J[\hat{\boldsymbol{\theta}}] = [\mathbf{Z} - \mathbf{H}\hat{\boldsymbol{\theta}}]^T \mathbf{W} [\mathbf{Z} - \mathbf{H}\hat{\boldsymbol{\theta}}], \tag{3.17}
      \]

      where $\mathbf{W} = \mathbf{W}^T > 0$ is the weighting matrix.


  2. MMSE: The optimal estimate minimizes the error variance. Given the
     measurements $\{Z(i)\}_{i=1}^{N}$, we shall determine an estimate of θ

     \[
     \hat{\boldsymbol{\theta}} = f[z(1), z(2), \ldots, z(N)] \tag{3.18}
     \]

     such that the mean squared error

     \[
     J[\hat{\boldsymbol{\theta}}] = E\{[\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}]^T [\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}]\} \tag{3.19}
     \]

     is minimized.

  3. MLE: It aims to maximize the likelihood function. Suppose that the
     measurement data $\{Z(i)\}_{i=1}^{N}$ are jointly distributed with a density function $p(\mathbf{Z}; \hat{\boldsymbol{\theta}})$. The
     optimal estimate is given by

     \[
     \hat{\boldsymbol{\theta}}_{opt} = \arg\max_{\hat{\boldsymbol{\theta}}} p(\mathbf{Z}; \hat{\boldsymbol{\theta}}). \tag{3.20}
     \]

     It is usually a nonlinear estimator since the likelihood function $p(\mathbf{Z}; \hat{\boldsymbol{\theta}})$ is
     nonlinear with respect to θ(k). Hence the computational load could be high.
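
For the first criterion, setting the gradient of the cost function (3.17) to zero gives the normal equations H^T W H θ̂ = H^T W Z. A minimal sketch of this closed form is given below, assuming NumPy and a full-column-rank H; the function name `wls_estimate` and the default W = I are illustrative choices rather than part of the original text.

```python
import numpy as np

def wls_estimate(H, Z, W=None):
    """Minimize [Z - H theta]^T W [Z - H theta] for the linear model (3.16).

    H : (N, n) observation matrix with full column rank.
    Z : (N,) measurement vector.
    W : (N, N) symmetric positive definite weighting matrix; the identity
        is used when W is omitted, which gives ordinary least squares.
    """
    H = np.asarray(H, dtype=float)
    Z = np.asarray(Z, dtype=float)
    if W is None:
        W = np.eye(len(Z))
    # Solve the normal equations (H^T W H) theta = H^T W Z without
    # forming an explicit matrix inverse.
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ Z)

# Usage: fit a straight line to four samples (made-up numbers), trusting
# the later samples more than the earlier ones.
H = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Z = np.array([2.1, 1.4, 1.1, 0.5])
theta_hat = wls_estimate(H, Z, W=np.diag([1.0, 2.0, 3.0, 4.0]))
```

Unlike the MLE in (3.20), this estimator is linear in the data, which is why it is computationally cheap.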


Then, how do we know whether or not the result obtained from one particular method
is good? Or why is it better than other methods? To answer this
question, we must make use of the fact that all estimators represent transformations
of random data; hence the estimate itself is random and its properties must be
studied from a statistical viewpoint. In this section, we introduce some fundamental
concepts such as unbiased estimator and efficient estimator, Cramer-Rao bound and
Fisher information matrix [72].


Definition 3.1 (Unbiasedness [72] ) Suppose that the parameter vector θ is
deterministic. An estimator $\hat{\boldsymbol{\theta}}$ is unbiased if $E\{\hat{\boldsymbol{\theta}}\} = \boldsymbol{\theta}$.


An unbiased estimator has a mean value equal to the true parameter
vector. Hence the estimate is correct on average over repeated experiments,
although any single estimate may still deviate from the true parameter.
Unbiasedness by itself is therefore not adequate. We
must also study the dispersion about the mean, i.e., the variance of the estimator. Ideally,
we would like our estimator to be unbiased and to have the smallest possible error
variance.

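Definition 3.2 (Efficiency [72] ) An unbiased estimate $\hat{\boldsymbol{\theta}}$ of vector θ is said to be
more efficient than any other unbiased estimator $\tilde{\boldsymbol{\theta}}$ of θ, if

\[
E\{[\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}][\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}]^T\} \le E\{[\boldsymbol{\theta} - \tilde{\boldsymbol{\theta}}][\boldsymbol{\theta} - \tilde{\boldsymbol{\theta}}]^T\}. \tag{3.21}
\]

A more efficient estimator has the smallest error covariance among all the unbiased
estimators of θ, “smallest” in the sense that
$E\{[\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}][\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}]^T\} - E\{[\boldsymbol{\theta} - \tilde{\boldsymbol{\theta}}][\boldsymbol{\theta} - \tilde{\boldsymbol{\theta}}]^T\}$
is negative semidefinite. Normally it does not make much sense to compare each
pair of unbiased estimators. A lower bound on the error covariance achievable
over all unbiased estimators, called the CRB (Cramer-Rao Bound), exists and the
efficiency of an unbiased estimator can be measured by how close its error covariance is to the
CRB. The following theorem presents the CRB.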


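Theorem 3.1 (Cramer-Rao Bound [72] ) Let Z denote a set of N observation data,
i.e., $\mathbf{Z} = [z(1), z(2), \ldots, z(N)]^T$, which is characterized by the probability density
function p(Z; θ) = p(Z). If $\hat{\boldsymbol{\theta}}$ is an unbiased estimate of the deterministic θ, then the error
covariance matrix, $E\{[\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(k)][\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(k)]^T\}$, is bounded from below by

\[
E\{[\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(k)][\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(k)]^T\} \ge \mathbf{J}^{-1}, \tag{3.22}
\]

where J is the Fisher information matrix, defined by

\[
\mathbf{J} = E\left\{ \left[\frac{\partial}{\partial \boldsymbol{\theta}} \ln p(\mathbf{Z}(k))\right] \left[\frac{\partial}{\partial \boldsymbol{\theta}} \ln p(\mathbf{Z}(k))\right]^T \right\}, \tag{3.23}
\]

which can also be expressed equivalently as

\[
\mathbf{J} = -E\left\{ \frac{\partial^2}{\partial \boldsymbol{\theta}^2} \ln p(\mathbf{Z}(k)) \right\}. \tag{3.24}
\]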

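Note that, for the theorem to be applicable, the vector derivatives in (3.23) must
exist and the norm of ∂p(Z)/∂θ must be absolutely integrable. Clearly, to compute
the Cramer-Rao lower bound, we need to know the probability density function p(Z).
Often the exact information on p(Z) is not available, in which case we cannot evaluate
this bound. However, in the case of a normal distribution, i.e.,

\[
p(\mathbf{Z}; \boldsymbol{\theta}) = \frac{1}{(2\pi)^{N/2} |\mathbf{C}|^{1/2}} \exp\left( -\frac{[\mathbf{Z} - \boldsymbol{\mu}]^T \mathbf{C}^{-1} [\mathbf{Z} - \boldsymbol{\mu}]}{2} \right), \tag{3.25}
\]

where µ and C are, respectively, the mean and the covariance matrix of Z, we
can compute the Fisher information matrix, and hence the Cramer-Rao bound,
corresponding to the Gaussian data distribution by the Slepian-Bangs formula [74]

\[
[\mathbf{J}]_{ij} = \frac{1}{2} \mathrm{tr}\left( \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_i} \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_j} \right) + \left( \frac{\partial \boldsymbol{\mu}}{\partial \theta_i} \right)^T \mathbf{C}^{-1} \left( \frac{\partial \boldsymbol{\mu}}{\partial \theta_j} \right). \tag{3.26}
\]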


Because of the central limit theorem, Gaussian distribution holds approximately in

applications such as location estimation.
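
As a brief worked illustration of the Slepian-Bangs formula (an added special case, not part of the original derivation), consider the linear model (3.16) with deterministic θ and white Gaussian noise of variance σ². Then µ = Hθ, and C = σ²I does not depend on θ, so the trace term vanishes. Writing h_i for the i-th column of H,

\[
[\mathbf{J}]_{ij} = \frac{1}{\sigma^2} \mathbf{h}_i^T \mathbf{h}_j,
\qquad
\mathbf{J} = \frac{1}{\sigma^2} \mathbf{H}^T \mathbf{H},
\qquad
\mathbf{J}^{-1} = \sigma^2 (\mathbf{H}^T \mathbf{H})^{-1}.
\]

In this special case the unweighted LS estimate (H^T H)^{-1} H^T Z is unbiased and its error covariance equals σ²(H^T H)^{-1}, so it attains the bound.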

3.2.2     Location based on TDOA

In this section, we investigate location estimation algorithms based on the measure-

ments of TDOA and AOA. For simplicity, we assume that the mobile users travel

at a low speed and can be taken as stationary targets approximately. Hence we do

not consider the estimation of the velocity of mobile users. Basically we exploit all the
available measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$ (TDOA data) and $\{\beta_k\}_{k=1}^{N_b}$ (AOA data), where Nb
is the total number of base stations, to determine the location of the mobile user or
the target, i.e., (xT , yT ). We consider only two-dimensional localization,
which is adequate if the terrain elevation is known a priori or can be neglected
compared to the heights of the antenna towers.

   We start with stationary target estimation based on the measurements of TDOA.

As defined in section 3.1.3,

\[
\begin{aligned}
R_{k,T} &= \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2}, \\
\Delta t_{k,1} &= (R_{k,T} - R_{1,T})/c \\
&= \left( \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} - \sqrt{x_T^2 + y_T^2} \right)/c.
\end{aligned}
\tag{3.27}
\]

Besides the measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$, the locations of all the base stations $\{(x_k, y_k)\}_{k=1}^{N_b}$
are also assumed to be known. Clearly ∆tk,1 is a nonlinear function of the unknown
(xT , yT ), i.e., ∆tk,1 (xT , yT ). Here, for brevity of notation, (xT , yT ) is omitted
in TDOAs unless it is needed for clarification.
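
The noise-free model (3.27) can be evaluated with a few lines of Python. This is a sketch assuming NumPy and c = 3 × 10^8 m/s; the function name `tdoa_model` and the constant name `C_LIGHT` are illustrative. With BS1 placed at the origin in the first row, the first range reduces to the square root of xT² + yT² as in (3.27).

```python
import numpy as np

C_LIGHT = 3.0e8  # speed of light in m/s, as used in the text

def tdoa_model(bs_xy, target_xy):
    """Return the noise-free TDOAs dt_{k,1}, k = 2..Nb, defined in (3.27).

    bs_xy     : (Nb, 2) BS coordinates, with BS1 in the first row.
    target_xy : (2,) target position (xT, yT).
    """
    bs_xy = np.asarray(bs_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    ranges = np.linalg.norm(bs_xy - target_xy, axis=1)   # R_{k,T}
    return (ranges[1:] - ranges[0]) / C_LIGHT

# Usage with a made-up geometry: the ranges are 4000 m to BS1 and 5000 m
# to BS2, so the single TDOA is 1000 m divided by c.
dt = tdoa_model([(0.0, 0.0), (3000.0, 0.0)], (0.0, 4000.0))
```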

   For all the TDOA measurements $\{\Delta\hat{t}_{2,1}, \Delta\hat{t}_{3,1}, \ldots, \Delta\hat{t}_{N_b,1}\}$, it is unavoidable that
there are measurement noises embedded within the data. Therefore, the measurement
data are described by

\[
\Delta\hat{t}_{k,1} = \Delta t_{k,1} + \delta t_k, \tag{3.28}
\]

where $\{\delta t_k\}_{k=2}^{N_b}$ are assumed to be i.i.d. (independent and identically distributed)
Gaussian random variables with zero mean and variance $\sigma_t^2$. It is an important but fair
assumption given the fact that all the BSs are well synchronized and it is much less
likely that a large deviation from the mean occurs. Since δtk is a Gaussian random
variable, so is $\Delta\hat{t}_{k,1}$. Based on the above assumption, we can define the (Nb − 1) × 1
multivariate Gaussian random vector $\Delta\hat{\mathbf{t}}$ and the associated mean $\mathbf{m}_{\Delta\hat{t}}$ and
covariance matrix $\mathbf{M}_{\Delta\hat{t}}$ respectively as

\[
\Delta\hat{\mathbf{t}} =
\begin{bmatrix} \Delta\hat{t}_{2,1} \\ \vdots \\ \Delta\hat{t}_{N_b,1} \end{bmatrix};
\quad
\mathbf{m}_{\Delta\hat{t}} =
\begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix};
\quad
\mathbf{M}_{\Delta\hat{t}} = \sigma_t^2 \mathbf{I}_{(N_b-1)}.
\tag{3.29}
\]

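As shown in [71], the joint PDF for $\Delta\hat{\mathbf{t}}$ is given by

\[
\begin{aligned}
p(\Delta\hat{\mathbf{t}}) &= \frac{1}{(\sqrt{2\pi})^{N_b-1} \sqrt{\det \mathbf{M}_{\Delta\hat{t}}}}
\exp\left[ -\frac{1}{2} (\Delta\hat{\mathbf{t}} - \mathbf{m}_{\Delta\hat{t}})^T \mathbf{M}_{\Delta\hat{t}}^{-1} (\Delta\hat{\mathbf{t}} - \mathbf{m}_{\Delta\hat{t}}) \right] \\
&= \frac{1}{(\sqrt{2\pi})^{N_b-1} \sigma_t^{N_b-1}}
\exp\left[ -\sum_{k=2}^{N_b} \frac{(\Delta\hat{t}_{k,1} - \Delta t_{k,1}(x_T, y_T))^2}{2\sigma_t^2} \right].
\end{aligned}
\tag{3.30}
\]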

This joint Gaussian PDF completely describes the statistical characteristics of the
measurement data and depends on the two unknowns xT and yT . With a
fixed data set of measurements, we seek the pair (xT , yT ) for which the
set of data is the most likely to occur. In light of estimation theory, the maximum-
likelihood (ML) method can be used to estimate the target location (xT , yT ).
Before providing the ML estimator, as shown in Theorem 3.1, we would like to
compute the Fisher information matrix and the Cramer-Rao bound so that we
know how accurate the estimation can be. The Cramer-Rao bound is a benchmark
for evaluating different types of unbiased estimators. Let P and JFIM denote the
estimation error covariance matrix and the Fisher information matrix, respectively. It holds for
any type of unbiased estimator [72] that

\[
\mathbf{P} \ge \mathbf{J}_{FIM}^{-1}. \tag{3.31}
\]


According to the Slepian-Bangs formula, the Fisher information matrix based on

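(3.30) can be calculated by

\[
\mathbf{J}_{tdoa} = \left[ \frac{1}{2} \mathrm{tr}\left\{ \mathbf{M}_{\Delta\hat{t}}^{-1} \frac{\partial \mathbf{M}_{\Delta\hat{t}}}{\partial \chi_i} \mathbf{M}_{\Delta\hat{t}}^{-1} \frac{\partial \mathbf{M}_{\Delta\hat{t}}}{\partial \chi_j} \right\}
+ \left( \frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial \chi_i} \right)^T \mathbf{M}_{\Delta\hat{t}}^{-1} \left( \frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial \chi_j} \right) \right]_{i,j=1,1}^{2,2},
\tag{3.32}
\]

where χ1 = xT and χ2 = yT . Since $\mathbf{M}_{\Delta\hat{t}} = \sigma_t^2 \mathbf{I}_{N_b-1}$ is only related to $\sigma_t^2$, the first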


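term in (3.32) is zero. By direct calculations, we have

\[
\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial \chi_1} =
\begin{bmatrix}
\frac{1}{c}\left( \frac{x_T - x_2}{R_{2,T}} - \frac{x_T}{R_{1,T}} \right) \\
\frac{1}{c}\left( \frac{x_T - x_3}{R_{3,T}} - \frac{x_T}{R_{1,T}} \right) \\
\vdots \\
\frac{1}{c}\left( \frac{x_T - x_{N_b}}{R_{N_b,T}} - \frac{x_T}{R_{1,T}} \right)
\end{bmatrix};
\quad
\frac{\partial \mathbf{m}_{\Delta\hat{t}}}{\partial \chi_2} =
\begin{bmatrix}
\frac{1}{c}\left( \frac{y_T - y_2}{R_{2,T}} - \frac{y_T}{R_{1,T}} \right) \\
\frac{1}{c}\left( \frac{y_T - y_3}{R_{3,T}} - \frac{y_T}{R_{1,T}} \right) \\
\vdots \\
\frac{1}{c}\left( \frac{y_T - y_{N_b}}{R_{N_b,T}} - \frac{y_T}{R_{1,T}} \right)
\end{bmatrix};
\quad
\mathbf{M}_{\Delta\hat{t}}^{-1} = \frac{1}{\sigma_t^2} \mathbf{I}_{N_b-1}.
\tag{3.33}
\]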

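Then $\mathbf{J}_{tdoa}$ is easily obtained by substituting (3.33) into (3.32):

$$
\begin{aligned}
\mathbf{J}_{tdoa}
&= \left[ \left(\frac{\partial \mathbf{m}_{\Delta\hat t}}{\partial \chi_i}\right)^T \mathbf{M}_{\Delta\hat t}^{-1} \left(\frac{\partial \mathbf{m}_{\Delta\hat t}}{\partial \chi_j}\right) \right]_{i,j=1,1}^{2,2} \\
&= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2}
\begin{bmatrix}
\frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}} \\
\frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}}
\end{bmatrix}
\begin{bmatrix}
\frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}}, & \frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}}
\end{bmatrix} \\
&= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2}
\begin{bmatrix}
\cos(\beta_k) - \cos(\beta_1) \\
\sin(\beta_k) - \sin(\beta_1)
\end{bmatrix}
\begin{bmatrix}
\cos(\beta_k) - \cos(\beta_1), & \sin(\beta_k) - \sin(\beta_1)
\end{bmatrix},
\end{aligned}
\tag{3.34}
$$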


where $\{\beta_k\}_{k=1}^{N_b}$ are shown in Figure 3.3 with $\tan(\beta_k) = (y_T - y_k)/(x_T - x_k)$. The inverse of the Fisher information matrix $\mathbf{J}_{tdoa}$ is a lower bound on the estimation error covariance of all unbiased estimators.
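As a numerical illustration of (3.34), the following minimal sketch evaluates $\mathbf{J}_{tdoa}$ and the resulting lower bound on the position RMSE for one geometry. The BS layout, the target position, the value of `sigma_t` and the names `tdoa_fim` and `invert_2x2` are hypothetical choices made for this example only.

```python
import math

C = 3.0e8  # propagation speed in m/s (assumed value)

def tdoa_fim(target, stations, sigma_t, c=C):
    """Fisher information matrix of (3.34); stations[0] is the reference BS 1."""
    xt, yt = target
    x1, y1 = stations[0]
    r1 = math.hypot(xt - x1, yt - y1)
    cos1, sin1 = (xt - x1) / r1, (yt - y1) / r1
    j11 = j12 = j22 = 0.0
    for xk, yk in stations[1:]:
        rk = math.hypot(xt - xk, yt - yk)
        u = (xt - xk) / rk - cos1  # cos(beta_k) - cos(beta_1)
        v = (yt - yk) / rk - sin1  # sin(beta_k) - sin(beta_1)
        j11 += u * u
        j12 += u * v
        j22 += v * v
    scale = 1.0 / (c * sigma_t) ** 2
    return [[scale * j11, scale * j12], [scale * j12, scale * j22]]

def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c_, d) = m
    det = a * d - b * c_
    return [[d / det, -b / det], [-c_ / det, a / det]]

# Hypothetical geometry: four BSs on a 2 km square, BS 1 at the origin.
stations = [(0.0, 0.0), (2000.0, 0.0), (0.0, 2000.0), (2000.0, 2000.0)]
target = (700.0, 1100.0)
sigma_t = 1.0e-8  # 10 ns TDOA noise standard deviation (assumed)

J = tdoa_fim(target, stations, sigma_t)
crlb = invert_2x2(J)
rmse_bound = math.sqrt(crlb[0][0] + crlb[1][1])  # metres
print("lower bound on position RMSE [m]:", rmse_bound)
```

Because $\mathbf{J}_{tdoa}$ scales with $1/\sigma_t^2$, doubling `sigma_t` doubles the RMSE bound, which gives a quick sanity check on the implementation.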

   In terms of the large-sample property, the ML estimate approaches the Cramér-Rao bound asymptotically, i.e., with an infinite number of data measurements. From (3.30), the ML location estimator seeks $(x_T, y_T)$ to minimize the negative log-likelihood function, which up to a constant scaling has the form

$$
L_{\Delta t}(x_T, y_T) = \sum_{k=2}^{N_b} \left[ c\,\Delta\hat t_{k,1} - \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} + \sqrt{x_T^2 + y_T^2} \right]^2.
\tag{3.35}
$$


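This is obtained by using the fact that $e^{-x}$ is a monotonically decreasing function and that scaling with the constant $c^2\sigma_t^2$ does not affect the likelihood function. There are two unknowns in $L_{\Delta t}(x_T, y_T)$, namely $x_T$ and $y_T$. Differentiating $L_{\Delta t}(x_T, y_T)$ with respect to each and equating the resulting partial derivatives to zero gives the following necessary condition for the optimal solution $(x_T^*, y_T^*)$:

$$
\sum_{k=2}^{N_b}
\begin{bmatrix}
\frac{x_k - x_T^*}{R_{k,T}^*} + \frac{x_T^*}{R_{1,T}^*} \\
\frac{y_k - y_T^*}{R_{k,T}^*} + \frac{y_T^*}{R_{1,T}^*}
\end{bmatrix}
\left( c\,\Delta\hat t_{k,1} - R_{k,T}^* + R_{1,T}^* \right)
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\tag{3.36}
$$

Here $R_{k,T}^*$ denotes $R_{k,T}$ evaluated at $(x_T^*, y_T^*)$.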
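For illustration only, the cost (3.35) can be minimized numerically with a generic Gauss-Newton iteration. This is a minimal sketch and not a method taken from this dissertation. The function names, the geometry and the starting point are hypothetical, and the measurements are passed as range differences `c_dt[k] = c * dt_hat_{k,1}` in metres so that the propagation speed does not appear explicitly.

```python
import math

def residuals_and_jacobian(p, stations, c_dt):
    """Residuals c*dt_hat_k1 - R_kT + R_1T of (3.35) and their gradients w.r.t. (x, y)."""
    x, y = p
    x1, y1 = stations[0]
    r1 = math.hypot(x - x1, y - y1)
    res, jac = [], []
    for (xk, yk), d in zip(stations[1:], c_dt):
        rk = math.hypot(x - xk, y - yk)
        res.append(d - rk + r1)
        jac.append(((x - x1) / r1 - (x - xk) / rk,
                    (y - y1) / r1 - (y - yk) / rk))
    return res, jac

def ml_cost(p, stations, c_dt):
    """Value of the cost function (3.35) at p."""
    res, _ = residuals_and_jacobian(p, stations, c_dt)
    return sum(r * r for r in res)

def gauss_newton(p0, stations, c_dt, iters=20, tol=1e-9):
    """Generic Gauss-Newton iteration; convergence depends on the starting point."""
    x, y = p0
    for _ in range(iters):
        res, jac = residuals_and_jacobian((x, y), stations, c_dt)
        a11 = sum(jx * jx for jx, _ in jac)
        a12 = sum(jx * jy for jx, jy in jac)
        a22 = sum(jy * jy for _, jy in jac)
        g1 = sum(jx * r for (jx, _), r in zip(jac, res))
        g2 = sum(jy * r for (_, jy), r in zip(jac, res))
        det = a11 * a22 - a12 * a12
        dx = -(a22 * g1 - a12 * g2) / det   # solve (J^T J) d = -J^T r
        dy = -(a11 * g2 - a12 * g1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y

# Hypothetical geometry with noise-free range differences.
stations = [(0.0, 0.0), (2000.0, 0.0), (0.0, 2000.0), (2000.0, 2000.0)]
target = (700.0, 1100.0)
r1_true = math.hypot(*target)
c_dt = [math.hypot(target[0] - xk, target[1] - yk) - r1_true
        for xk, yk in stations[1:]]

estimate = gauss_newton((1000.0, 1000.0), stations, c_dt)
print("estimate:", estimate, "cost:", ml_cost(estimate, stations, c_dt))
```

With noisy measurements the same routine applies unchanged, but the iteration may stop at a local minimum when the starting point is poor, which is one motivation for a good initial solution.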

The ML estimator is well studied and widely used in practice, especially in applications that require high estimation accuracy and can afford the computational complexity with commercially available hardware and software. It has several statistical properties that are desirable in applications and that hold asymptotically, i.e., as the number of measurements grows:

   • It is unbiased: the expectation of the estimate is equal to the real value;

   • It is the most efficient: it achieves the minimum error variance;

   • It is consistent: it converges to the real value in probability.

Hence it is plausible to apply ML to our estimation problem for the highest possible accuracy of localization. However, solving for the optimal solution $(x_T^*, y_T^*)$ from (3.35) and (3.36) is not easy and involves nonlinear procedures such as Newton-type algorithms, which are not discussed in this dissertation. The likelihood function can be maximized by hand only for some PDFs, and even commercial software is not guaranteed to reach the ML solution because of the possible existence of local minima. In this thesis we take a quasi-linear approach as in [54] that converts the nonlinear optimization problem into a linear one, leading to an LS-type problem with a simpler solution algorithm. Alternatively, the LS-type solution can serve as the initial candidate in a Newton-type iterative algorithm to speed up convergence to the true ML solution $(x_T^*, y_T^*)$. To bypass the difficulty and complexity of the original ML estimator, we notice that the second equation in (3.27) leads to

$$
(x_T - x_k)^2 + (y_T - y_k)^2 = \left( \sqrt{x_T^2 + y_T^2} + c\,\Delta t_{k,1} \right)^2.
\tag{3.37}
$$


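By expanding and rearranging the terms, the above can be written as

$$
\frac{1}{c^2} R_k^2 = \frac{2}{c^2}
\begin{bmatrix} x_k & y_k \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \Delta t_{k,1}^2 + \frac{2}{c} R_{1,T}\, \Delta t_{k,1}.
\tag{3.38}
$$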
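The intermediate algebra can be spelled out as follows. This sketch assumes, consistently with $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, that BS 1 sits at the origin and that $R_k^2 = x_k^2 + y_k^2$ is the squared distance of the $k$-th BS from it.

```latex
\begin{align*}
x_T^2 - 2 x_k x_T + x_k^2 + y_T^2 - 2 y_k y_T + y_k^2
  &= R_{1,T}^2 + 2 c\,\Delta t_{k,1} R_{1,T} + c^2 \Delta t_{k,1}^2 \\
R_{1,T}^2 + R_k^2 - 2 (x_k x_T + y_k y_T)
  &= R_{1,T}^2 + 2 c\,\Delta t_{k,1} R_{1,T} + c^2 \Delta t_{k,1}^2 \\
R_k^2
  &= 2 (x_k x_T + y_k y_T) + c^2 \Delta t_{k,1}^2 + 2 c\, R_{1,T}\, \Delta t_{k,1}
\end{align*}
```

Dividing the last line by $c^2$ gives (3.38). The quadratic terms in $x_T$ and $y_T$ cancel, which is what makes the relation linear in $(x_T, y_T)$ once $R_{1,T}$ is treated as a separate unknown.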

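Packing all the equations in (3.38) for $k = 2, 3, \ldots, N_b$ yields

$$
\frac{1}{c^2}
\begin{bmatrix} R_2^2 \\ \vdots \\ R_{N_b}^2 \end{bmatrix}
= \frac{2}{c^2}
\begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+
\begin{bmatrix} \Delta t_{2,1}^2 \\ \vdots \\ \Delta t_{N_b,1}^2 \end{bmatrix}
+ \frac{2}{c}
\begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix}
R_{1,T}.
\tag{3.39}
$$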
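A quick numerical check of (3.39) with perfect TDOA values may help to confirm the signs and scalings. This is a sketch under the assumption that BS 1 is at the origin. The station coordinates, the target and the helper name `row_of_3_39` are hypothetical.

```python
import math

C = 3.0e8  # propagation speed in m/s (assumed value)

def row_of_3_39(station, target, c=C):
    """Return (lhs, rhs) of the k-th row of (3.39) for a noise-free TDOA."""
    xk, yk = station
    xt, yt = target
    r1 = math.hypot(xt, yt)                       # R_1T, with BS 1 at the origin
    dt = (math.hypot(xt - xk, yt - yk) - r1) / c  # perfect TDOA, Delta t_{k,1}
    lhs = (xk ** 2 + yk ** 2) / c ** 2            # R_k^2 / c^2
    rhs = 2.0 * (xk * xt + yk * yt) / c ** 2 + dt ** 2 + 2.0 * r1 * dt / c
    return lhs, rhs

stations = [(2000.0, 0.0), (0.0, 2000.0), (2000.0, 2000.0)]  # BS 2..4 (hypothetical)
target = (700.0, 1100.0)

rows = [row_of_3_39(s, target) for s in stations]
max_rel_gap = max(abs(lhs - rhs) / lhs for lhs, rhs in rows)
print("largest relative gap between the two sides:", max_rel_gap)
```

The gap is at the level of floating-point rounding, as expected for an exact identity.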


If we have the perfect TDOA information, the target $(x_T, y_T)$ is uniquely located with any 2 out of the $N_b - 1$ sets of data since it is an over-determined problem. To estimate the target location $(x_T, y_T)$ in (3.39), however, we have to replace the perfect time difference $\Delta t_{k,1}$ with the available TDOA measurements $\Delta\hat t_{k,1}$. Since $\Delta\hat t_{k,1} = \Delta t_{k,1} + \delta t_k$, this introduces a noise vector as follows:

$$
\begin{bmatrix} \eta_2 \\ \vdots \\ \eta_{N_b} \end{bmatrix}
= -\frac{2}{c}
\begin{bmatrix} \delta t_2 \\ \vdots \\ \delta t_{N_b} \end{bmatrix}
R_{1,T}
- 2
\begin{bmatrix} \Delta\hat t_{2,1}\, \delta t_2 \\ \vdots \\ \Delta\hat t_{N_b,1}\, \delta t_{N_b} \end{bmatrix}
+
\begin{bmatrix} \delta t_2^2 \\ \vdots \\ \delta t_{N_b}^2 \end{bmatrix}.
\tag{3.40}
$$


Each element of the noise vector is composed of the TDOA measurement noise $\delta t_k$ and the corresponding squared term. Taking the expectation of both sides of (3.40), we find that each element of the noise vector has mean $\sigma_t^2$. In an effort to obtain an LS-type formulation, we define

$$
a_k = R_k^2/c^2 - \Delta\hat t_{k,1}^2 - \sigma_t^2, \qquad b_k = 2\Delta\hat t_{k,1}/c.
\tag{3.41}
$$


We can regard $\{a_k\}$ and $\{b_k\}$ as pseudo-measurements that lead to a constrained linear model:

$$
\begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix}
-
\begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix}
R_{1,T}
= \frac{2}{c^2}
\begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+
\begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix}
\tag{3.42}
$$

where the constraint is $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. It is worth noting that the composite noise terms $\{\eta_k\}_{k=2}^{N_b}$ are not Gaussian random variables. But if $\{\eta_k\}_{k=2}^{N_b}$ were Gaussian, then the ML algorithm for location estimation would be equivalent to a weighted LS problem involving a constraint. As stated in Corollary 11-1 of [72], ML, LS and BLUE (Best Linear Unbiased Estimator) algorithms are all equivalent for a generic linear model with an additive Gaussian noise term. By defining

$$
\mathbf{a} = \begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix}; \quad
\mathbf{b} = \begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix}; \quad
\mathbf{H}_1 = \frac{2}{c^2} \begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}; \quad
\boldsymbol{\eta}_1 = \begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix},
\tag{3.43}
$$

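we can rewrite (3.42) into a more compact quasi-linear form:

$$
\mathbf{a} - \mathbf{b} R_{1,T} = \mathbf{H}_1 \boldsymbol{\theta} + \boldsymbol{\eta}_1,
\tag{3.44}
$$

with $\boldsymbol{\theta} = [x_T, y_T]^T$ as in (3.42). The above expression is very similar to the generic linear model of the standard LS algorithm, except that the pseudo-measurement vector $\mathbf{a} - \mathbf{b} R_{1,T}$ involves the unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. Fortunately, this extra condition helps to solve for $R_{1,T}$. $\mathbf{H}_1$ is deterministic and $\boldsymbol{\eta}_1$ is a non-Gaussian vector whose elements all have zero mean. Let $\mathbf{W}_1(R_{1,T})$ be a diagonal matrix with elements $E\{|\eta_k - \sigma_t^2|^2\}$. Set

$$
J_1 = \frac{1}{2} \left( \mathbf{a} - \mathbf{b} R_{1,T} - \mathbf{H}_1 \boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b} R_{1,T} - \mathbf{H}_1 \boldsymbol{\theta} \right)
\tag{3.45}
$$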

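as the objective function to be minimized. It is well known that the minimizer is the ML solution provided that the noise vector is Gaussian with $\mathbf{W}_1(R_{1,T})$ as the covariance matrix. The weighted LS solution is easily obtained as

$$
\hat{\boldsymbol{\theta}} = \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1 \right)^{-1} \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b} R_{1,T} \right) = \boldsymbol{\Phi}_1(R_{1,T}) \left( \mathbf{a} - \mathbf{b} R_{1,T} \right),
\tag{3.46}
$$

where $\boldsymbol{\Phi}_1(R_{1,T}) = \left( \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T}) \mathbf{H}_1 \right)^{-1} \mathbf{H}_1^T \mathbf{W}_1^{-1}(R_{1,T})$. Here $\hat{\boldsymbol{\theta}}$ is an intermediate solution since $R_{1,T}$ is unknown. Taking the squared norm of both sides and using the constraint $\|\boldsymbol{\theta}\|^2 = x_T^2 + y_T^2 = R_{1,T}^2$ yields

$$
R_{1,T}^2 = \left\| \boldsymbol{\Phi}_1(R_{1,T}) \left( \mathbf{a} - \mathbf{b} R_{1,T} \right) \right\|^2.
\tag{3.47}
$$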


Among the roots of this nonlinear equation that are real and positive, the one yielding the smallest $J_1$ is the optimal solution to the constrained LS problem. Note that we converted the ML estimation problem into an LS-type estimation by replacing the perfect TDOA information with measurement data, and that the equivalence between the LS-type solution and the ML estimator holds only under the assumption that the composite noise vector is Gaussian. If the noise vector in (3.40) is not exactly Gaussian, the constrained LS solution is not the ML solution either. It may seem that we have overemphasized the simplicity of the LS-type algorithm at the expense of estimation accuracy. However, the LS-type solution is not far from the true ML solution under some mild conditions, as shown below.
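As a brief aside before that analysis, the root selection for (3.47) can be sketched in code. The sketch makes one simplifying assumption that is not part of the text: the weight matrix $\mathbf{W}_1$ is frozen at a fixed diagonal (first the identity, then a diagonal built from a preliminary $R_{1,T}$), so that (3.47) reduces to a quadratic in $R_{1,T}$. All function names, the geometry and the noise-free measurements are hypothetical.

```python
import math

C = 3.0e8  # propagation speed in m/s (assumed value)

def pseudo_measurements(stations, dt_hat, sigma_t, c=C):
    """a_k, b_k of (3.41) and the rows of H1 in (3.43); BS 1 sits at the origin."""
    a, b, H = [], [], []
    for (xk, yk), dt in zip(stations[1:], dt_hat):
        a.append((xk ** 2 + yk ** 2) / c ** 2 - dt ** 2 - sigma_t ** 2)
        b.append(2.0 * dt / c)
        H.append((2.0 * xk / c ** 2, 2.0 * yk / c ** 2))
    return a, b, H

def weighted_ls(H, w, v):
    """Return (H^T W^-1 H)^-1 H^T W^-1 v for the diagonal matrix W = diag(w)."""
    s11 = sum(hx * hx / wk for (hx, _), wk in zip(H, w))
    s12 = sum(hx * hy / wk for (hx, hy), wk in zip(H, w))
    s22 = sum(hy * hy / wk for (_, hy), wk in zip(H, w))
    g1 = sum(hx * vk / wk for (hx, _), wk, vk in zip(H, w, v))
    g2 = sum(hy * vk / wk for (_, hy), wk, vk in zip(H, w, v))
    det = s11 * s22 - s12 * s12
    return ((s22 * g1 - s12 * g2) / det, (s11 * g2 - s12 * g1) / det)

def solve_constrained_ls(a, b, H, w):
    """Solve (3.47) with W1 frozen at diag(w); keep the positive root with smallest J1."""
    pa = weighted_ls(H, w, a)  # Phi1 a
    pb = weighted_ls(H, w, b)  # Phi1 b
    # ||pa - pb R||^2 = R^2  ->  qa R^2 + qb R + qc = 0 (assumes qa != 0)
    qa = pb[0] ** 2 + pb[1] ** 2 - 1.0
    qb = -2.0 * (pa[0] * pb[0] + pa[1] * pb[1])
    qc = pa[0] ** 2 + pa[1] ** 2
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return None  # no real root
    best = None
    for sign in (1.0, -1.0):
        r = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        if r <= 0.0:
            continue
        theta = (pa[0] - pb[0] * r, pa[1] - pb[1] * r)  # intermediate solution (3.46)
        err = [ak - bk * r - hx * theta[0] - hy * theta[1]
               for ak, bk, (hx, hy) in zip(a, b, H)]
        j1 = 0.5 * sum(e * e / wk for e, wk in zip(err, w))  # objective (3.45)
        if best is None or j1 < best[0]:
            best = (j1, r, theta)
    return best

# Hypothetical geometry with noise-free TDOA values standing in for dt_hat.
stations = [(0.0, 0.0), (2000.0, 0.0), (0.0, 2000.0), (2000.0, 2000.0), (-1500.0, 500.0)]
target = (700.0, 1100.0)
sigma_t = 1.0e-9
r1_true = math.hypot(*target)
dt_hat = [(math.hypot(target[0] - xk, target[1] - yk) - r1_true) / C
          for xk, yk in stations[1:]]

a, b, H = pseudo_measurements(stations, dt_hat, sigma_t)
_, r1_0, _ = solve_constrained_ls(a, b, H, [1.0] * len(a))       # pass 1: W1 = I
alpha = [2.0 * (r1_0 + C * dt) / C for dt in dt_hat]
w = [sigma_t ** 2 * (al ** 2 + 2.0 * sigma_t ** 2) for al in alpha]
j1, r1_hat, theta_hat = solve_constrained_ls(a, b, H, w)         # pass 2: weighted
print("R_1T:", r1_hat, "theta:", theta_hat)
```

Freezing the weights is a design choice for this sketch. Since $\mathbf{W}_1$ depends on $R_{1,T}$, a full treatment would solve (3.47) as a genuinely nonlinear equation or repeat the second pass until $R_{1,T}$ settles.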

   Let $X$ be a Gaussian random variable with zero mean and variance $\sigma^2$. Then the high-order moments of $X$ are given by [73]

$$
E\{X^{2n}\} = 1 \times 3 \times 5 \times \cdots \times (2n - 1)\,\sigma^{2n}; \qquad E\{X^{2n-1}\} = 0,
$$

where $n > 0$ is an integer. Let $Y = \alpha X + (X^2 - \sigma^2)$. Then $E\{Y\} = 0$ and

$$
\sigma_Y^2 = E\{Y^2\} = \alpha^2\sigma^2 - \sigma^4 + E\{X^4\} = \alpha^2\sigma^2 + 2\sigma^4 = \sigma^2(\alpha^2 + 2\sigma^2).
\tag{3.48}
$$
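The moments in (3.48) are easy to check by simulation. The sketch below draws samples of $X$, forms $Y = \alpha X + (X^2 - \sigma^2)$ and compares the sample moments with the closed form. The values of `alpha` and `sigma`, the sample size and the seed are arbitrary choices for illustration.

```python
import random

def simulate_moments(alpha, sigma, n=200_000, seed=0):
    """Sample estimates of E{Y} and E{Y^2} for Y = alpha*X + (X^2 - sigma^2)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        y = alpha * x + (x * x - sigma * sigma)
        total += y
        total_sq += y * y
    return total / n, total_sq / n

alpha, sigma = 3.0, 1.0
mean, second_moment = simulate_moments(alpha, sigma)
predicted = sigma ** 2 * (alpha ** 2 + 2.0 * sigma ** 2)  # right-hand side of (3.48)
print("sample mean:", mean)
print("sample E{Y^2}:", second_moment, "predicted:", predicted)
```

The sample mean should be close to zero and the sample second moment close to the predicted value, up to Monte Carlo error.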


Gaussian random variables (GRVs) are closed under linear operations: the sum of two jointly Gaussian random variables, for instance two independent ones, is again a GRV [73]. This closure does not extend to nonlinear operations, so we cannot conclude that $Y$ is a GRV since it includes the $X^2$ term. We are interested in the conditions under which $Y$ is close to a GRV. By noting that $Y = (X + \alpha/2)^2 - (\sigma^2 + \alpha^2/4)$, we have

$$
X = -\frac{\alpha}{2} \pm \sqrt{Y + (\sigma^2 + \alpha^2/4)}, \qquad Y \ge -(\sigma^2 + \alpha^2/4).
\tag{3.49}
$$


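Since $Y$ is a function of the GRV $X$, its PDF is given by

$$
p_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \left[
\frac{ e^{ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 } }{ 2\sqrt{y + (\sigma^2 + \alpha^2/4)} }
+
\frac{ e^{ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} + \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 } }{ 2\sqrt{y + (\sigma^2 + \alpha^2/4)} }
\right], \qquad y \ge -(\sigma^2 + \alpha^2/4).
\tag{3.50}
$$

From the normalization property of a PDF, $\int_{-(\sigma^2 + \alpha^2/4)}^{\infty} p_Y(y)\,dy = 1$. Interestingly, the integral of the first term in $p_Y(y)$ is

$$
\begin{aligned}
I_Y &= \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2 + \alpha^2/4)}^{\infty}
\frac{ e^{ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 } }{ 2\sqrt{y + (\sigma^2 + \alpha^2/4)} }\, dy \\
&= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2 + \alpha^2/4)}^{\infty}
e^{ -\frac{1}{2\sigma^2} \left( \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)^2 }\,
d\!\left[ \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right] \\
&= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{\alpha/2}^{-\infty} e^{ -\frac{z^2}{2\sigma^2} }\, dz
\qquad \text{let: } z = \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\frac{\alpha}{2\sigma}}^{\infty} e^{ -\frac{\tilde z^2}{2} }\, d\tilde z
\qquad \text{let: } \tilde z = -\frac{z}{\sigma} \\
&= 1 - Q\!\left( \frac{\alpha}{2\sigma} \right),
\end{aligned}
$$

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ is the Gaussian tail function (Q-function). Hence it is concluded that if $\alpha/\sigma$ is sufficiently large, then $I_Y \approx 1$ and thus $p_Y(y)$ is dominated by the first term. Intuitively, the second term $(X^2 - \sigma^2)$ in $Y$ fades out since its mean is zero and its variance $E\{(X^2 - \sigma^2)^2\} = 2\sigma^4$ is small relative to the variance $\alpha^2\sigma^2$ of the first term when $\alpha/\sigma$ is sufficiently large. It is also easy to see that $\sigma_Y^2$ is dominated by $\alpha^2\sigma^2$ under the same assumption. Therefore, the random variable $Y = \alpha X + (X^2 - \sigma^2)$ is approximately normally distributed, provided that $\alpha/\sigma$ is sufficiently large. Translating this result to the random variables in (3.40) with $\delta\tilde t_k = -\delta t_k$ leads to

$$
Y_k = \alpha_k\, \delta\tilde t_k + (\delta\tilde t_k^2 - \sigma_t^2), \qquad \alpha_k = 2(R_{1,T} + c\,\Delta\hat t_{k,1})/c.
\tag{3.51}
$$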


Then $\boldsymbol{\eta}_1 = [Y_2, Y_3, \ldots, Y_{N_b}]^T$, where $\delta t_k$ is Gaussian with mean zero and variance $\sigma_t^2$. Thus $Y_k$ is close to Gaussian, and $\boldsymbol{\eta}_1$ is approximately a normally distributed vector, provided that $\alpha_k/\sigma_t = 2(R_{1,T} + c\,\Delta\hat t_{k,1})/(c\sigma_t)$ is sufficiently large for all $k \ge 2$. It is worth noting that

$$
\left( \frac{\alpha_k}{2\sigma_t} \right)^2 = (R_{1,T}/c + \Delta\hat t_{k,1})^2 / \sigma_t^2.
\tag{3.52}
$$

The right-hand side of the above equation approximates the SNR, since its numerator is the square of the recorded traveling time of the signal from the target to the $k$-th BS and its denominator, $\sigma_t^2$, is the noise variance. By (3.48), the variance of $Y_k$ is

$$
\sigma_{Y_k}^2 = E\{Y_k^2\} = \alpha_k^2\sigma_t^2 + 2\sigma_t^4 = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2),
\tag{3.53}
$$

which is dominated by $\alpha_k^2\sigma_t^2$ if $\alpha_k/\sigma_t$ is sufficiently large. It is emphasized that $\alpha_k = \alpha_k(R_{1,T})$ is a function of $R_{1,T}$.

Recall the question raised earlier: how far is the LS-type solution obtained from (3.46) and (3.47) from the true ML solution? A clear answer is that the LS-type algorithm approximates the ML solution well as long as $\alpha_k/\sigma_t$ is very large for $2 \le k \le N_b$. Therefore the properties of the ML algorithm hold approximately.
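To get a feel for how large $\alpha_k/\sigma_t$ is in practice, the following sketch evaluates the first-term mass $1 - Q(\alpha/(2\sigma))$ and the share of $\sigma_Y^2$ carried by the linear term $\alpha X$. The range of 1500 m and the 10 ns timing noise are assumed example values, and the function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x), expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def first_term_mass(alpha, sigma):
    """Integral I_Y of the first term of p_Y(y): 1 - Q(alpha / (2 sigma))."""
    return 1.0 - q_function(alpha / (2.0 * sigma))

def linear_variance_share(alpha, sigma):
    """Fraction of sigma_Y^2 in (3.48) that comes from the linear term alpha*X."""
    return alpha ** 2 / (alpha ** 2 + 2.0 * sigma ** 2)

C = 3.0e8          # propagation speed in m/s (assumed value)
sigma_t = 1.0e-8   # 10 ns timing noise (assumed value)
range_k = 1500.0   # R_1T + c*dt_hat_k1, roughly the distance to the k-th BS (assumed)
alpha_k = 2.0 * range_k / C
ratio = alpha_k / sigma_t

for a_over_s in (0.0, 1.0, 2.0, 4.0, 8.0):
    print(a_over_s, first_term_mass(a_over_s, 1.0), linear_variance_share(a_over_s, 1.0))
print("alpha_k / sigma_t for the example link:", ratio)
```

For the assumed example link the ratio is on the order of a thousand, far into the regime where the first term dominates.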

   Before ending this subsection, we would like to compute the Cramér-Rao bound associated with the weighted LS solution by assuming that $\{Y_k\}$ are normally distributed, which holds approximately under the condition discussed earlier. Recall that $\mathbf{W}_1(R_{1,T})$ in the weighted LS problem is the associated covariance matrix. Thus $E\{Y_k^2\}$ is its $k$-th diagonal element, and the joint probability density function (PDF) for the pseudo-measurement data $\{a_k\}$ and $\{b_k\}$ in (3.42) is

$$
\mathrm{PDF} = \frac{1}{\sqrt{(2\pi)^{n-1} \det[\mathbf{W}_1(R_{1,T})]}}
\exp\left\{ -\frac{1}{2} \left( \mathbf{a} - \mathbf{b} R_{1,T} - \mathbf{H}_1 \boldsymbol{\theta} \right)^T \mathbf{W}_1^{-1}(R_{1,T}) \left( \mathbf{a} - \mathbf{b} R_{1,T} - \mathbf{H}_1 \boldsymbol{\theta} \right) \right\}.
\tag{3.54}
$$



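Note that the exponent is exactly $-J_1$. The Fisher information matrix for the PDF in (3.54) can be computed by using the Slepian-Bangs formula in (3.32). Here we take the pseudo-measurement vector $\mathbf{a}$ as the data vector, whose mean vector and covariance matrix are $\mathbf{m}_a = \mathbf{b} R_{1,T} + \mathbf{H}_1 \boldsymbol{\theta}$ and $\mathbf{M}_a = E\{\boldsymbol{\eta}_1 \boldsymbol{\eta}_1^T\}$, respectively. Hence both the mean and the covariance are functions of $(x_T, y_T)$. By direct calculations, we have

$$
\begin{aligned}
\frac{\partial \mathbf{m}_a}{\partial x_T}
&= \begin{bmatrix}
\frac{\partial}{\partial x_T} \left( b_2 \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\
\vdots \\
\frac{\partial}{\partial x_T} \left( b_{N_b} \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right)
\end{bmatrix}
= \begin{bmatrix}
\frac{2}{c^2} x_2 + b_2 \cos(\beta_1) \\
\vdots \\
\frac{2}{c^2} x_{N_b} + b_{N_b} \cos(\beta_1)
\end{bmatrix}; \\
\frac{\partial \mathbf{m}_a}{\partial y_T}
&= \begin{bmatrix}
\frac{\partial}{\partial y_T} \left( b_2 \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\
\vdots \\
\frac{\partial}{\partial y_T} \left( b_{N_b} \sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right)
\end{bmatrix}
= \begin{bmatrix}
\frac{2}{c^2} y_2 + b_2 \sin(\beta_1) \\
\vdots \\
\frac{2}{c^2} y_{N_b} + b_{N_b} \sin(\beta_1)
\end{bmatrix}.
\end{aligned}
\tag{3.55}
$$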


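Since $\mathbf{M}_a$ is a diagonal matrix whose $k$-th diagonal element is $E\{Y_k^2\} = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2)$ with $\alpha_k = \frac{2}{c}\sqrt{x_T^2 + y_T^2} + 2\Delta\hat t_{k,1}$, taking the partial derivatives of $E\{Y_k^2\}$ with respect to $x_T$ and $y_T$ gives

$$
\frac{\partial}{\partial x_T} E\{Y_k^2\} = \frac{4\sigma_t^2 \alpha_k}{c} \cos(\beta_1); \qquad
\frac{\partial}{\partial y_T} E\{Y_k^2\} = \frac{4\sigma_t^2 \alpha_k}{c} \sin(\beta_1).
$$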


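It is then straightforward to show that

$$
\begin{aligned}
\frac{\partial}{\partial x_T} \mathbf{M}_a &= \mathrm{diag}\left\{ \frac{4\sigma_t^2 \cos(\beta_1)}{c} \left[ \alpha_2, \alpha_3, \ldots, \alpha_{N_b} \right] \right\} \\
\frac{\partial}{\partial y_T} \mathbf{M}_a &= \mathrm{diag}\left\{ \frac{4\sigma_t^2 \sin(\beta_1)}{c} \left[ \alpha_2, \alpha_3, \ldots, \alpha_{N_b} \right] \right\} \\
\mathbf{M}_a^{-1} &= \mathrm{diag}\left\{ \frac{1}{\sigma_t^2} \left[ \frac{1}{\alpha_2^2 + 2\sigma_t^2}, \frac{1}{\alpha_3^2 + 2\sigma_t^2}, \ldots, \frac{1}{\alpha_{N_b}^2 + 2\sigma_t^2} \right] \right\}.
\end{aligned}
\tag{3.56}
$$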


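Now we can calculate the Fisher information matrix via the Slepian-Bangs formula in (3.32) as

$$
\mathbf{J}_{tdoa,LS} = \left[ \frac{1}{2} \mathrm{tr}\left\{ \mathbf{M}_a^{-1} \frac{\partial \mathbf{M}_a}{\partial \chi_i} \mathbf{M}_a^{-1} \frac{\partial \mathbf{M}_a}{\partial \chi_j} \right\} + \left( \frac{\partial \mathbf{m}_a}{\partial \chi_i} \right)^T \mathbf{M}_a^{-1} \left( \frac{\partial \mathbf{m}_a}{\partial \chi_j} \right) \right]_{i,j=1,1}^{2,2},
\tag{3.57}
$$

where $\chi_1 = x_T$ and $\chi_2 = y_T$. By substituting (3.55) and (3.56) into (3.57), the Fisher information matrix is given by

$$
\begin{aligned}
\mathbf{J}_{tdoa,LS} ={}& \sum_{k=2}^{n} \frac{1}{\sigma_t^2(\alpha_k^2 + 2\sigma_t^2)}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}^T \\
&+ \sum_{k=2}^{n} \frac{8\alpha_k^2}{c^2(\alpha_k^2 + 2\sigma_t^2)^2}
\begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix}.
\end{aligned}
\tag{3.58}
$$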

The above expression differs from Jtdoa in (3.34) no matter how large αk /σt is and how small σt is. The discrepancy is caused by omitting the second term of pY (y) when computing the Fisher information matrix. The omitted term may contribute negligibly to the probability itself, yet its derivative can be significant. Moreover, however small σt is, it cannot be exactly zero, which also contributes to the discrepancy.

3.2.3       Location based on AOA

The angle of arrival (AOA) of MS signals at a BS can be obtained by antenna arrays. Unlike TOA/TDOA based location methods, an AOA based location algorithm does not need to consider timing synchronization problems. What it has in common with TOA/TDOA is that the measurements, whether TOA/TDOA or AOA, have to be fused into the triangular relations between the BSs and the mobile user, i.e., the target. Suppose that the AOA measurement data are of the form $\hat{\beta}_k = \beta_k + \delta\beta_k$. Recall that tan(βk ) = (yT − yk )/(xT − xk ). That is, βk = βk (xT , yT ). We again assume that {δβk } are uncorrelated and Gaussian distributed with mean zero and variance $\sigma_\beta^2$. Their joint PDF is given by
                                                                             
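\[
p_{\Delta\beta}(\delta\beta) = \frac{1}{\sqrt{(2\pi)^{N_b}}\,\sigma_\beta^{N_b}} \exp\left\{ -\sum_{k=1}^{N_b} \frac{1}{2\sigma_\beta^2} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2 \right\}.
\tag{3.59}
\]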


Since the AOA measurements are corrupted by additive Gaussian noise, it is easy to compute the Fisher information matrix, whose inverse is the Cramer-Rao bound for the covariance matrix of the estimation error. Simply speaking, the larger the Fisher information matrix, the smaller the estimation error variance, which translates into a better estimator in terms of accuracy, provided that it is unbiased. The Fisher information matrix captures the relative rate (derivative) at which the probability density function changes with respect to the unknown parameters. Note that the greater the expected change is at a given value, say $(\hat{x}_T, \hat{y}_T)$, the easier it is to distinguish $(\hat{x}_T, \hat{y}_T)$ from neighboring values (locations), and hence the more precisely (xT , yT ) can be estimated at $(x_T, y_T) = (\hat{x}_T, \hat{y}_T)$. To calculate the Fisher information matrix, we again use the Slepian-Bangs formula in (3.32). First, some preliminary computations are carried out as
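\[
\begin{aligned}
\frac{\partial\beta_k(x_T, y_T)}{\partial x_T} &= \frac{\partial}{\partial x_T}\tan^{-1}\left(\frac{y_T - y_k}{x_T - x_k}\right) = -\frac{y_T - y_k}{R_{k,T}^2}; \\
\frac{\partial\beta_k(x_T, y_T)}{\partial y_T} &= \frac{\partial}{\partial y_T}\tan^{-1}\left(\frac{y_T - y_k}{x_T - x_k}\right) = \frac{x_T - x_k}{R_{k,T}^2}.
\end{aligned}
\tag{3.60}
\]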
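The partial derivatives in (3.60) can be checked numerically against central finite differences. The following is a minimal sketch, not part of the original derivation: the function names, the BS position and the target position are hypothetical values chosen for illustration, and `math.atan2` is used in place of the plain arctangent so that the quadrant of βk is kept.

```python
import math

def bearing(xT, yT, xk, yk):
    """AOA beta_k of the target (xT, yT) seen from BS k at (xk, yk)."""
    return math.atan2(yT - yk, xT - xk)

def bearing_gradient(xT, yT, xk, yk):
    """Closed-form partial derivatives of beta_k from (3.60)."""
    r2 = (xT - xk) ** 2 + (yT - yk) ** 2  # R_{k,T}^2
    return (-(yT - yk) / r2, (xT - xk) / r2)

def numeric_gradient(xT, yT, xk, yk, h=1e-3):
    """Central finite differences of beta_k with step h (in meters)."""
    dx = (bearing(xT + h, yT, xk, yk) - bearing(xT - h, yT, xk, yk)) / (2 * h)
    dy = (bearing(xT, yT + h, xk, yk) - bearing(xT, yT - h, xk, yk)) / (2 * h)
    return (dx, dy)

# Hypothetical geometry: target at (300, 400) m, BS at (1000, -200) m.
analytic = bearing_gradient(300.0, 400.0, 1000.0, -200.0)
numeric = numeric_gradient(300.0, 400.0, 1000.0, -200.0)
print(analytic, numeric)
```

The two gradients agree closely as long as the target is not on the branch cut of `atan2` as seen from the BS.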


The mean vector is $m_\beta = [\beta_1, \beta_2, \ldots, \beta_{N_b}]^T$ and the covariance matrix is $M_\beta = \sigma_\beta^2 I_{N_b}$, which does not depend on (xT , yT ). With these preliminary results, the Fisher information matrix of the AOA measurements is given by
                                                                    
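\[
\begin{aligned}
J_{aoa} &= \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2}
\begin{bmatrix} -\dfrac{y_T - y_k}{R_{k,T}(x_T, y_T)} \\[2mm] \dfrac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix}
\begin{bmatrix} -\dfrac{y_T - y_k}{R_{k,T}(x_T, y_T)} & \dfrac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix} \\
&= \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2}
\begin{bmatrix} -\sin(\beta_k) \\ \cos(\beta_k) \end{bmatrix}
\begin{bmatrix} -\sin(\beta_k) & \cos(\beta_k) \end{bmatrix}.
\end{aligned}
\tag{3.61}
\]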
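As an illustration of (3.61), the following sketch accumulates the 2×2 AOA Fisher information matrix for a small network and inverts it to obtain the Cramer-Rao bound. It is only a sketch under stated assumptions: the three BS positions, the target position and the 1 degree bearing standard deviation are hypothetical, and the helper names are not from the text.

```python
import math

def fisher_aoa(target, stations, sigma_beta):
    """2x2 Fisher information matrix J_aoa of (3.61)."""
    xT, yT = target
    j11 = j12 = j22 = 0.0
    for xk, yk in stations:
        r2 = (xT - xk) ** 2 + (yT - yk) ** 2      # R_{k,T}^2
        beta = math.atan2(yT - yk, xT - xk)       # true bearing beta_k
        s, c = math.sin(beta), math.cos(beta)
        w = 1.0 / (sigma_beta ** 2 * r2)
        j11 += w * s * s                          # (-sin)(-sin)
        j12 += -w * s * c                         # (-sin)(cos)
        j22 += w * c * c                          # (cos)(cos)
    return [[j11, j12], [j12, j22]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Hypothetical layout (meters) and bearing noise (radians).
stations = [(0.0, 0.0), (2000.0, 0.0), (0.0, 2000.0)]
target = (600.0, 800.0)
sigma_beta = math.radians(1.0)

J = fisher_aoa(target, stations, sigma_beta)
crb = inverse_2x2(J)
# Lower bound on the root-mean-square position error of an unbiased estimator.
rmse_bound = math.sqrt(crb[0][0] + crb[1][1])
print(rmse_bound)
```

Because each summand scales with 1/(σβ² R²k,T), halving σβ halves the bound on the position error, and BSs that are far from the target contribute little information.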

With the information matrix above, the Cramer-Rao bound (CRB) is easily calculated. In terms of the CRB, the ML estimator is the one that comes closest to the bound among all unbiased estimators. The ML algorithm minimizes a likelihood-type function (the negative log-likelihood up to scaling and an additive constant) of the following form

\[
L_{\Delta\beta}(x_T, y_T) = \sum_{k=1}^{N_b} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2.
\tag{3.62}
\]


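Then the necessary condition for $(x_T^*, y_T^*)$ to be the ML solution is
\[
\sum_{k=1}^{N_b} \frac{1}{R_{k,T}}
\begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix}
\left[ \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right]
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\tag{3.63}
\]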

No matter how many minimum points the nonlinear likelihood function may have, the true ML solution $(x_T^*, y_T^*)$ must be one of them, so the partial derivatives of $L_{\Delta\beta}(x_T, y_T)$ with respect to xT and yT vanish at $(x_T^*, y_T^*)$. Again this is a difficult nonlinear optimization problem, and multiple solutions may exist. Thus we turn our attention to an LS-type algorithm before solving for the ML solution.

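   Recall that the AOA measurements are given by $\hat{\beta}_k = \beta_k + \delta\beta_k$, or $\delta\beta_k = \hat{\beta}_k - \beta_k$. Hence $R_{k,T}\sin(\delta\beta_k) = R_{k,T}\sin(\hat{\beta}_k - \beta_k)$ and thus
\[
\begin{aligned}
R_{k,T}\sin(\delta\beta_k) &= R_{k,T}\sin(\hat{\beta}_k)\cos(\beta_k) - R_{k,T}\cos(\hat{\beta}_k)\sin(\beta_k) \\
&= \Delta x_k\sin(\hat{\beta}_k) - \Delta y_k\cos(\hat{\beta}_k),
\end{aligned}
\tag{3.64}
\]
where $\Delta x_k = x_T - x_k$, $\Delta y_k = y_T - y_k$, and $R_{k,T} = \sqrt{\Delta x_k^2 + \Delta y_k^2}$. It follows that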




\[
\varphi_k = -x_k\sin(\hat{\beta}_k) + y_k\cos(\hat{\beta}_k) = -x_T\sin(\hat{\beta}_k) + y_T\cos(\hat{\beta}_k) + R_{k,T}\sin(\delta\beta_k).
\tag{3.65}
\]


We can regard ϕk as a pseudo-measurement consisting of the real measurement data $\hat{\beta}_k$ and the known BS location (xk , yk ). For the term Rk,T sin(δβk ) on the right-hand side of (3.65), we argue that even though {sin(δβk )} are not Gaussian, they are close to Gaussian distributed provided that the variance $\sigma_\beta^2$ is adequately small. This follows from the fact that, with z = sin(δβ) [73],
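\[
p_Z(z) = \sum_{k=-\infty}^{\infty} \frac{ \exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) + 2k\pi \right]^2 \right\} + \exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) + (2k+1)\pi \right]^2 \right\} }{ \sqrt{2\pi\sigma_\beta^2 (1 - z^2)} }
\tag{3.66}
\]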

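for |z| ≤ 1 and pZ (z) = 0 for |z| > 1. Since $\sigma_\beta^2$ is sufficiently small, there holds
\[
p_Z(z) \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}} \exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) \right]^2 \right\} \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}}\, e^{-\frac{1}{2\sigma_\beta^2} z^2}
\tag{3.67}
\]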

for z ≈ 0. The above implies that Rk,T sin(δβk ) behaves like a GRV when δβk is very small. This can also be seen from the approximation sin(δβk ) ≈ δβk for very small δβk , so that sin(δβk ) and δβk have almost the same PDF. We would also like to argue that the probability of |δβ| ≥ π/2 is generically zero, since otherwise the angle of arrival would point in a completely wrong direction. Hence the PDF of δβ has a shape similar to the normal density but tends to zero at |δβ| = π/2 and vanishes beyond it, which implies that pZ (z) is close to Gaussian.

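   When δβ is exactly normal, the variance of sin(δβk ) can be computed in closed form as
\[
E\{\sin^2(\delta\beta_k)\} = \frac{1}{2}E\{1 - \cos(2\delta\beta_k)\} = \frac{1}{2} - \frac{1}{4}E\{e^{j2\delta\beta_k} + e^{-j2\delta\beta_k}\} = \frac{1}{2}\left(1 - e^{-2\sigma_\beta^2}\right) \approx \sigma_\beta^2
\tag{3.68}
\]
where the approximation holds when $\sigma_\beta^2$ is sufficiently small. Now the linear equations in (3.65) are of the form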
                                                                                  
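\[
\begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \vdots \\ \varphi_{N_b} \end{bmatrix}
=
\begin{bmatrix}
-\sin(\hat{\beta}_1) & \cos(\hat{\beta}_1) \\
-\sin(\hat{\beta}_2) & \cos(\hat{\beta}_2) \\
\vdots & \vdots \\
-\sin(\hat{\beta}_{N_b}) & \cos(\hat{\beta}_{N_b})
\end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+
\begin{bmatrix} R_{1,T}\sin(\delta\beta_1) \\ R_{2,T}\sin(\delta\beta_2) \\ \vdots \\ R_{N_b,T}\sin(\delta\beta_{N_b}) \end{bmatrix},
\tag{3.69}
\]

The noise vector on the right-hand side is denoted by
\[
\eta_2 = \begin{bmatrix} R_{1,T}\sin(\delta\beta_1) & R_{2,T}\sin(\delta\beta_2) & \cdots & R_{N_b,T}\sin(\delta\beta_{N_b}) \end{bmatrix}^T.
\]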


It has mean zero and a diagonal covariance matrix $W_2(R_{1,T})$ whose k-th diagonal element is
\[
R_{k,T}^2\left(1 - e^{-2\sigma_\beta^2}\right)/2 \approx R_{k,T}^2\sigma_\beta^2 = \left[(x_T - x_k)^2 + (y_T - y_k)^2\right]\sigma_\beta^2.
\tag{3.70}
\]
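The closed form in (3.68) and the small-variance approximation used in (3.70) can be checked numerically. The sketch below is illustrative only: the function names, the 2 degree bearing standard deviation and the integration settings are assumptions. It compares the closed form with a trapezoidal integration of sin²(u) against the Gaussian density and with the approximation σβ².

```python
import math

def sin_noise_variance(sigma_beta):
    """Closed form (3.68): E{sin^2(d)} for d ~ N(0, sigma_beta^2)."""
    return 0.5 * (1.0 - math.exp(-2.0 * sigma_beta ** 2))

def sin_noise_variance_numeric(sigma_beta, half_width=10.0, steps=20000):
    """Trapezoidal integration of sin^2(u) against the N(0, sigma_beta^2)
    density over +/- half_width standard deviations."""
    lower = -half_width * sigma_beta
    h = 2.0 * half_width * sigma_beta / steps
    norm = 1.0 / (sigma_beta * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps + 1):
        u = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = norm * math.exp(-u * u / (2.0 * sigma_beta ** 2))
        total += weight * math.sin(u) ** 2 * density
    return total * h

sigma_beta = math.radians(2.0)          # hypothetical bearing noise
exact = sin_noise_variance(sigma_beta)
numeric = sin_noise_variance_numeric(sigma_beta)
small_angle = sigma_beta ** 2           # approximation used in (3.70)
relative_gap = (small_angle - exact) / small_angle
print(exact, numeric, small_angle, relative_gap)
```

For bearing errors of a few degrees the relative gap between the exact variance and σβ² is on the order of σβ² itself, which supports replacing the exact expression by R²k,T σβ² in the weight matrix.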

With the Gaussian assumption on the noise vector and {ϕk } as pseudo-measurements, (3.69) has the form $\varphi = H_2\theta + \eta_2$, which leads to the objective function
\[
J_2 = \frac{1}{2}\left(\varphi - H_2\theta\right)^T W_2^{-1}(R_{1,T}) \left(\varphi - H_2\theta\right).
\tag{3.71}
\]
Minimization of J2 corresponds to the ML algorithm. The ML solution is given by
\[
\hat{\theta} = \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T})\,\varphi.
\tag{3.72}
\]


However, since $W_2^{-1}(R_{1,T})$ involves the unknown (xT , yT ) and $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, the above does not give the ML solution explicitly. It is interesting to notice that the weighted LS problem in this subsection is again a constrained LS-type problem. Indeed, by noting that


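\[
R_{k,T}^2 = x_k^2 + y_k^2 + x_T^2 + y_T^2 - 2(x_k x_T + y_k y_T) = R_k^2 + R_T^2 - 2(x_k x_T + y_k y_T),
\tag{3.73}
\]
we can multiply (3.72) from the left by $[\,x_k \;\; y_k\,]$ for $k = 2, \cdots, N_b$ to arrive at
\[
\rho_{k,T} := x_k x_T + y_k y_T = \begin{bmatrix} x_k & y_k \end{bmatrix} \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T})\,\varphi.
\tag{3.74}
\]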


In addition $R_{k,T}^2 = R_k^2 + R_T^2 - 2\rho_{k,T}$. Thus taking the norm square on both sides of (3.72) yields
\[
R_{1,T}^2 = \left\| \Phi_2(R_{1,T})\,\varphi \right\|^2, \qquad
\Phi_2(R_{1,T}) = \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T}).
\tag{3.75}
\]




Consequently we have Nb equations with Nb unknowns $\{R_{k,T}\}_{k=2}^{N_b}$ and $R_{1,T}$. Although these equations are nonlinear, they can be manipulated to obtain at least one set of solutions for the Nb unknowns. These solutions can then be substituted back into (3.72) to yield the ML solution (xT , yT ). We remark that for large Nb , the complexity of quasi-linear localization based on AOAs is much higher than that of the corresponding localization based on TDOAs. If additional information on TDOAs is available, however, the complexity can be reduced tremendously, as studied in the next subsection.
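The pseudo-linear model (3.69) and the weighted LS solution (3.72) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the constrained procedure based on (3.73)-(3.75): the function name, the BS layout and the target are hypothetical, and the unknown weights are handled by a simple iterative reweighting (unit weights first, then 1/R²k,T recomputed from the current estimate). The common factor σβ² cancels in (3.72) and is therefore omitted.

```python
import math

def aoa_pseudolinear_ls(stations, bearings_hat, iterations=5):
    """Weighted LS estimate of (xT, yT) from measured bearings via (3.69)/(3.72).
    Weights 1/R_{k,T}^2 are refreshed from the previous estimate (assumption)."""
    weights = [1.0] * len(stations)
    x = y = 0.0
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (xk, yk), bk, w in zip(stations, bearings_hat, weights):
            h1, h2 = -math.sin(bk), math.cos(bk)   # row of H2
            phi = xk * h1 + yk * h2                # pseudo-measurement (3.65)
            a11 += w * h1 * h1
            a12 += w * h1 * h2
            a22 += w * h2 * h2
            b1 += w * h1 * phi
            b2 += w * h2 * phi
        det = a11 * a22 - a12 * a12
        x = (a22 * b1 - a12 * b2) / det            # solve the 2x2 normal equations
        y = (a11 * b2 - a12 * b1) / det
        weights = [1.0 / max((x - xk) ** 2 + (y - yk) ** 2, 1e-12)
                   for xk, yk in stations]
    return x, y

# Hypothetical layout (meters) with noise-free bearings.
stations = [(0.0, 0.0), (2000.0, 0.0), (0.0, 2000.0)]
target = (600.0, 800.0)
bearings = [math.atan2(target[1] - yk, target[0] - xk) for xk, yk in stations]
estimate = aoa_pseudolinear_ls(stations, bearings)
print(estimate)
```

With noise-free bearings the linear system is consistent and the target is recovered up to rounding. With noisy bearings the reweighting only approximates the weights in (3.70), so the result is best viewed as an initial guess for the ML solution.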

   Before ending this subsection we present the Fisher information matrix associated with the LS-type problem as posed in (3.69). With the assumption of a Gaussian distribution for the noise vector η2 , the joint PDF has the expression
\[
\mathrm{PDF} = \frac{1}{\sqrt{(2\pi)^n \det[W_2(R_{1,T})]}} \exp\left\{ -\frac{1}{2}\left(\varphi - H_2\theta\right)^T W_2^{-1}(R_{1,T}) \left(\varphi - H_2\theta\right) \right\}
\tag{3.76}
\]

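where ϕ can be regarded as the pseudo-measurement vector. Hence H2 θ is the mean vector and W2 (R1,T ) is the covariance matrix. An application of the Slepian-Bangs formula gives the corresponding Fisher information matrix:
\[
\begin{aligned}
J_{aoa,LS} ={}& \sum_{k=1}^{n} \frac{2}{R_{k,T}^2}
\begin{bmatrix} \cos(\beta_k) \\ \sin(\beta_k) \end{bmatrix}
\begin{bmatrix} \cos(\beta_k) & \sin(\beta_k) \end{bmatrix} \\
&+ \sum_{k=1}^{n} \frac{1}{R_{k,T}^2\sigma_\beta^2}
\begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix}
\begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}
\end{aligned}
\tag{3.77}
\]
\[
\approx \sum_{k=1}^{n} \frac{1}{R_{k,T}^2\sigma_\beta^2}
\begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix}
\begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}
\tag{3.78}
\]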

where a sufficiently small $\sigma_\beta^2$ is assumed. It is interesting to observe that the above approximate expression is the same as Jaoa in (3.61) except that $\{\beta_k\}$ are replaced by $\{\hat{\beta}_k\}$.

3.2.4       Location based on both TDOA and AOA

After discussing the location techniques based on either TDOA or AOA measurements in the previous two subsections, we now assume that both AOAs and TDOAs are available for target localization. Although this requires more information and data, and consequently increases the cost of a location system, the improved accuracy may justify the expense. Hence it is meaningful to study the location method based on a combination of TDOA and AOA when redundant information is available and high location resolution is mandated. Assuming that the noises (δtk and δβk ) in measuring the TDOAs and AOAs are independent, the joint PDF is
                                                                   2                                2    
                          Nb    ˆ
                               ∆tk,1 − ∆tk,1 (xT , yT )                     Nb    ˆ
                                                                                  βk − βk (xT , yT ) 
                    
                  exp−                                                 −
                                                2                                          2           
                        k=2                   2σt                           k=1         2σβ
  p∆ (δt, δβ) =                              √ N                    √                                          (3.79)
                                                             N −1                 N
                                                (2π)   b −1 σ b         (2π)Nb σβ b
                                                             t

              = p∆t (δt)p∆β (δβ).

Because of the independence between $\{\delta t_k\}_{k=2}^{N_b}$ and $\{\delta\beta_k\}_{k=1}^{N_b}$, the Fisher information matrix has the expression
\[
J_{tdoa/aoa} = J_{tdoa} + J_{aoa},
\tag{3.80}
\]
where Jtdoa and Jaoa are the same as in (3.34) and (3.61), respectively. This can be easily shown [74] by

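\[
\begin{aligned}
J_{tdoa/aoa} &= E\left\{ \frac{\partial[\ln p_\Delta(\delta t, \delta\beta)]}{\partial x} \left( \frac{\partial[\ln p_\Delta(\delta t, \delta\beta)]}{\partial x} \right)^T \right\} \\
&= E\left\{ \left[ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right] \left[ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right]^T \right\} \\
&= E\left\{ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} \left( \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} \right)^T \right\} + E\left\{ \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \left( \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right)^T \right\} \\
&= J_{tdoa} + J_{aoa},
\end{aligned}
\tag{3.81}
\]
where the cross terms vanish because the two score vectors are independent and each has zero mean.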
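The practical meaning of (3.80) is that adding independent AOA measurements to TDOA measurements can only tighten the Cramer-Rao bound. The sketch below illustrates this with two 2×2 information matrices. The numerical entries are hypothetical placeholders chosen to be symmetric and positive definite, not values computed from (3.34) or (3.61).

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def crb_trace(j):
    """Trace of the CRB, i.e. the bound on the mean-square position error."""
    crb = inverse_2x2(j)
    return crb[0][0] + crb[1][1]

# Hypothetical Fisher information matrices (units of 1/m^2).
J_tdoa = [[4.0e-3, 1.0e-3], [1.0e-3, 2.0e-3]]
J_aoa = [[1.0e-3, -5.0e-4], [-5.0e-4, 3.0e-3]]

# Independence of the two noise sources makes the information additive (3.80).
J_joint = [[J_tdoa[i][j] + J_aoa[i][j] for j in range(2)] for i in range(2)]

bound_tdoa = crb_trace(J_tdoa)
bound_aoa = crb_trace(J_aoa)
bound_joint = crb_trace(J_joint)
print(bound_tdoa, bound_aoa, bound_joint)
```

For these placeholder values the joint bound is smaller than either individual bound, which is the general behavior whenever both information matrices are positive definite.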


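With respect to the joint PDF in (3.79), the corresponding likelihood-type function in this case has the form
\[
L(x_T, y_T) = \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \left[ c\Delta\hat{t}_{k,1} - R_{k,T}(x_T, y_T) + R_{1,T} \right]^2 + \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2.
\tag{3.82}
\]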

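The ML algorithm seeks the minimum of the above likelihood-type function, which corresponds to the maximum of the likelihood itself. The necessary condition for the optimum to be achieved at $(x_T^*, y_T^*)$ is:
\[
\begin{aligned}
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
={}& \sum_{k=1}^{N_b} \frac{1}{R_{k,T}}
\begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix}
\left[ \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right] \\
&+ \sum_{k=2}^{N_b}
\begin{bmatrix} \dfrac{x_k - x_T^*}{R_{k,T}} + \dfrac{x_T^*}{R_{1,T}} \\[3mm] \dfrac{y_k - y_T^*}{R_{k,T}} + \dfrac{y_T^*}{R_{1,T}} \end{bmatrix}
\left[ c\Delta\hat{t}_{k,1} - R_{k,T} + R_{1,T} \right].
\end{aligned}
\tag{3.83}
\]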

Newton-type algorithms can be applied to compute the ML solution. Clearly the solution to the above nonlinear equations is hard to compute, and a solution found this way may not be a global optimum of L(xT , yT ). An alternative is to use an LS-type algorithm as in the previous two subsections. One possible way is to compute the constrained LS solutions $(\hat{x}_T^{(TDOA)}, \hat{y}_T^{(TDOA)})$ and $(\hat{x}_T^{(AOA)}, \hat{y}_T^{(AOA)})$ based on TDOAs and AOAs separately, as in the previous subsections, and then combine the two as [53]
\[
\hat{x}_T = \gamma\,\hat{x}_T^{(AOA)} + (1 - \gamma)\,\hat{x}_T^{(TDOA)}, \qquad
\hat{y}_T = \gamma\,\hat{y}_T^{(AOA)} + (1 - \gamma)\,\hat{y}_T^{(TDOA)}
\tag{3.84}
\]


where 0 < γ < 1. Note that $R_{k,T} = R_{1,T} + c\Delta t_{k,1}$ can be used in (3.69) to avoid computing Nb unknowns with Nb equations. Indeed the noise terms in (3.69) have zero mean and variance
\[
E\{R_{k,T}^2\sin^2(\delta\beta_k)\} = E\{[R_{1,T} + c\Delta\hat{t}_{k,1} - c\delta t_k]^2\sin^2(\delta\beta_k)\} \approx \left[ (R_{1,T} + c\Delta\hat{t}_{k,1})^2 + c^2\sigma_t^2 \right]\sigma_\beta^2
\tag{3.85}
\]

if $\sigma_\beta^2$ is sufficiently small. Hence only one unknown $R_T$ is involved and the $\rho_{k,T}$ are all eliminated, which helps to simplify the computation of the LS-type solution to the target localization problem based on AOA measurements. However, determining the optimal value of γ is not easy. Hence we opt to compute the LS-type solution directly.

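   Since both AOAs and TDOAs are available, we have the following linear equations:
\[
\begin{bmatrix} a - bR_{1,T} \\ \varphi \end{bmatrix}
= \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}
\begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}.
\tag{3.86}
\]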

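Under the independence assumption for the noises η1 and η2 , we have
\[
E\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = 0, \qquad
W = E\left\{ \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \begin{bmatrix} \eta_1^T & \eta_2^T \end{bmatrix} \right\}
= \begin{bmatrix} W_1(R_{1,T}) & 0 \\ 0 & W_2(R_{1,T}) \end{bmatrix}
\tag{3.87}
\]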

where the kth diagonal element of W2 (R1,T ) is the same as in (3.85). By assuming that η1 and η2 are uncorrelated and Gaussian, the ML estimate of (xT , yT ) can be computed by minimizing the following objective function:
\[
\begin{aligned}
J_{1,2} ={}& \frac{1}{2}\left(a - bR_{1,T} - H_1\theta\right)^T W_1^{-1}(R_{1,T}) \left(a - bR_{1,T} - H_1\theta\right) \\
&+ \frac{1}{2}\left(\varphi - H_2\theta\right)^T W_2^{-1}(R_{1,T}) \left(\varphi - H_2\theta\right) = J_1 + J_2.
\end{aligned}
\tag{3.88}
\]

Taking the derivative of the cost function J1,2 with respect to θ, we have
\[
\frac{\partial J_{1,2}}{\partial\theta} = -\left(a - bR_{1,T} - H_1\theta\right)^T W_1^{-1} H_1 - \left(\varphi - H_2\theta\right)^T W_2^{-1} H_2.
\tag{3.89}
\]

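By letting $\partial J_{1,2}/\partial\theta = 0$, it can be easily shown that the minimizer of the cost function J1,2 is given by
\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= \left[ H_1^T W_1^{-1}(R_{1,T}) H_1 + H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1}
\left[ H_1^T W_1^{-1}(R_{1,T}) \left(a - bR_{1,T}\right) + H_2^T W_2^{-1}(R_{1,T})\,\varphi \right].
\tag{3.90}
\]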


Because the above solution involves the unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, we can again take the norm square of both sides to obtain an equation for R1,T first. After computing its solution, the value of R1,T can be substituted into (3.90) to obtain the solution to the weighted LS problem. Note that R1,T is a positive real root of some nonlinear equation. One of the positive real roots corresponds to the constrained LS solution, which provides an initial guess for the true (nonlinear) ML solution.

   We remark that the optimal solution in (3.90) is not a convex combination of the two separate LS-type solutions as in (3.84). Rather it is of the form
\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= \Gamma \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix}
+ (I - \Gamma) \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix}
\tag{3.91}
\]
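where Γ is a matrix. Specifically, the solution in (3.90) can be written as
\[
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= [A_1 + A_2]^{-1}[B_1 + B_2]
= \left(I + A_1^{-1}A_2\right)^{-1} \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix}
+ \left(I + A_2^{-1}A_1\right)^{-1} \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix}
\tag{3.92}
\]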

where
\[
\begin{aligned}
A_1 &= H_1^T W_1^{-1}(R_{1,T})\, H_1; \\
A_2 &= H_2^T W_2^{-1}(R_{1,T})\, H_2; \\
B_1 &= H_1^T W_1^{-1}(R_{1,T})\, [a - b R_{1,T}]; \\
B_2 &= H_2^T W_2^{-1}(R_{1,T})\, \phi.
\end{aligned}
\]


Hence $A_1^{-1}B_1$ and $A_2^{-1}B_2$ are the LS-type solutions based on TDOAs and AOAs, respectively. Now it is straightforward to show that

\[
\left(I + A_1^{-1}A_2\right)^{-1} + \left(I + A_2^{-1}A_1\right)^{-1}
= [A_1 + A_2]^{-1}A_1 + [A_1 + A_2]^{-1}A_2 = \Gamma + [I - \Gamma] = I.
\tag{3.93}
\]

Even though the LS solution in (3.90) is a matrix-weighted combination of the two separate LS solutions in (3.46) and (3.72), the unknown $R_{1,T}$ still has to be computed based on (3.90).
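
The weighting in (3.91)-(3.93) can be checked numerically. The following is a minimal Python sketch, not the thesis code: the matrices A1, A2 and vectors B1, B2 are random stand-ins (assumptions) for the quantities defined above, chosen symmetric positive definite so that all inverses exist.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    """Random symmetric positive definite matrix (stand-in for H^T W^{-1} H)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

# Hypothetical stand-ins for A1, A2, B1, B2; not data from the thesis.
A1, A2 = random_spd(2), random_spd(2)
B1, B2 = rng.standard_normal(2), rng.standard_normal(2)

I = np.eye(2)
Gamma = np.linalg.inv(I + np.linalg.solve(A1, A2))    # (I + A1^{-1} A2)^{-1}
Gamma_c = np.linalg.inv(I + np.linalg.solve(A2, A1))  # (I + A2^{-1} A1)^{-1}

p_tdoa = np.linalg.solve(A1, B1)             # TDOA-only LS-type solution
p_aoa = np.linalg.solve(A2, B2)              # AOA-only LS-type solution
p_joint = np.linalg.solve(A1 + A2, B1 + B2)  # joint solution, first form in (3.92)
p_combined = Gamma @ p_tdoa + Gamma_c @ p_aoa

weights_sum_to_identity = np.allclose(Gamma + Gamma_c, I)  # identity (3.93)
combination_matches = np.allclose(p_joint, p_combined)     # second form in (3.92)
print(weights_sum_to_identity, combination_matches)
```

Both checks hold for any invertible A1, A2 with A1 + A2 invertible; the positive definite choice here only makes that easy to guarantee.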

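     Finally the Fisher information matrix associated with the linear model in (3.86) is
\[
\begin{aligned}
P_{\mathrm{tdoa/aoa\text{-}fim,LS}} ={}& \sum_{k=2}^{N_b} \frac{1}{\sigma_t^2(\alpha_k^2 + 2\sigma_t^2)}
\begin{bmatrix} 2x_k/c^2 + b_k\cos(\beta_1) \\ 2y_k/c^2 + b_k\sin(\beta_1) \end{bmatrix}
\begin{bmatrix} 2x_k/c^2 + b_k\cos(\beta_1) \\ 2y_k/c^2 + b_k\sin(\beta_1) \end{bmatrix}^T \\
&+ \sum_{k=2}^{N_b} \frac{8\alpha_k^2}{c^2(\alpha_k^2 + 2\sigma_t^2)^2}
\begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix} \\
&+ \sum_{k=1}^{N_b} \frac{1}{R_{k,T}^2\,\sigma_\beta^2}
\begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix}
\begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}
\end{aligned}
\tag{3.94}
\]
under the uncorrelated Gaussian assumption and sufficiently small $\sigma_t^2$ and $\sigma_\beta^2$.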


3.3       Constrained Least-square Optimization
As shown in (3.46), the weighted LS solution $\hat{\theta}$ is constrained by
\[
R_{1,T}^2 = \left\| \Phi_1 [a - b R_{1,T}] \right\|_2^2,
\tag{3.95}
\]


from which $R_{1,T}$ can be solved. If real solutions $R_{1,T}$ exist, they can be substituted back into $J_1$ in (3.45) to select the optimal $R_{1,T}$, from which the optimal solution $\hat{\theta}$ is obtained. However, (3.95) may not admit a real solution $R_{1,T}$ because of noise in the observations. More specifically, writing $\Phi$ for $\Phi_1$, (3.95) is equivalent to the quadratic equation


\[
(b^T\Phi^T\Phi b - 1)R_{1,T}^2 - 2a^T\Phi^T\Phi b\,R_{1,T} + a^T\Phi^T\Phi a = 0,
\tag{3.96}
\]


which admits a real solution if and only if
\[
(a^T\Phi^T\Phi b)^2 + a^T\Phi^T\Phi a - (a^T\Phi^T\Phi a)(b^T\Phi^T\Phi b) \ge 0.
\tag{3.97}
\]


That is, (3.95) admits a real solution $R_{1,T}$ if and only if (3.97) holds. Simulation in [54] shows that the location estimate in (3.46) is very accurate if condition (3.97) holds; otherwise the location estimate is far away from the true location. The question is what to do when (3.97) does not hold, which is generically the case because of noise in the TDOA and AOA measurements.
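
The test in (3.97) is simply the discriminant of the quadratic (3.96). A minimal Python sketch follows; the function name `range_roots` and the toy inputs are illustrative assumptions, and the sketch assumes the leading coefficient $b^T\Phi^T\Phi b - 1$ is nonzero.

```python
import numpy as np

def range_roots(a, b, Phi):
    """Return the left side of (3.97) and the real roots of (3.96), if any.

    Assumes b^T Phi^T Phi b != 1 so that (3.96) is a genuine quadratic.
    """
    G = Phi.T @ Phi
    aa, ab, bb = a @ G @ a, a @ G @ b, b @ G @ b
    disc = ab**2 + aa - aa * bb          # left-hand side of (3.97)
    if disc < 0:
        return disc, np.array([])        # no real R_{1,T} exists
    roots = (ab + np.array([1.0, -1.0]) * np.sqrt(disc)) / (bb - 1.0)
    return disc, roots

# Toy data (hypothetical): real roots exist.
disc_ok, roots_ok = range_roots(np.array([3.0, 0.0]), np.array([0.5, 0.0]), np.eye(2))
# Toy data (hypothetical): a orthogonal to b with large b, no real root.
disc_bad, roots_bad = range_roots(np.array([1.0, 0.0]), np.array([0.0, 2.0]), np.eye(2))
print(disc_ok, roots_ok, disc_bad, roots_bad)
```

In the first toy case the roots are 2 and −6, and both satisfy $R^2 = \|a - bR\|^2$; in the second the discriminant is negative, which is the situation the constrained formulation below is meant to handle.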

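   Let us examine (3.45) again by rewriting $J_1$ as
\[
J_1 = \frac{1}{2}
\left( a - \begin{bmatrix} H_1 & b \end{bmatrix} \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix} \right)^T
W_1^{-1}
\left( a - \begin{bmatrix} H_1 & b \end{bmatrix} \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix} \right),
\tag{3.98}
\]
where $p_T = [\, \hat{x}_T \;\; \hat{y}_T \,]^T$. The nonlinear estimation problem searches for $p_T$ and $R_{1,T}$ such that $J_1$ is minimized, subject to the constraint $R_{1,T} = \|p_T\|$. Denote $\Sigma = W_1$ and
\[
A = \begin{bmatrix} H_1 & b \end{bmatrix}, \quad
\phi = a, \quad
\theta = \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix}, \quad
Q = \begin{bmatrix} -I_2 & 0 \\ 0 & 1 \end{bmatrix}.
\]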
Then we have the following more general constrained LS optimization problem:
\[
\min_{\theta^T Q \theta = 0} J_1, \qquad J_1 = \frac{1}{2}(A\theta - \phi)^T \Sigma^{-1} (A\theta - \phi).
\tag{3.99}
\]


We now develop a solution algorithm for this constrained LS optimization problem. Assume that Σ is positive definite, A has full column rank, and Q is nonsingular with both positive and negative eigenvalues, i.e., Q is indefinite. We employ a Lagrange multiplier to develop the solution algorithm. Let λ be real and consider
\[
J = \frac{1}{2}\left[ (A\theta - \phi)^T \Sigma^{-1} (A\theta - \phi) + \lambda\,\theta^T Q \theta \right].
\tag{3.100}
\]

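Then the necessary condition for optimality yields
\[
A^T\Sigma^{-1}[A\theta - \phi] + \lambda Q\theta = 0
\;\Leftrightarrow\;
\theta = [A^T\Sigma^{-1}A + \lambda Q]^{-1}A^T\Sigma^{-1}\phi.
\tag{3.101}
\]
An optimal solution needs to satisfy the constraint $\theta^T Q \theta = 0$, leading to
\[
\phi^T\Sigma^{-1}A\,[A^T\Sigma^{-1}A + \lambda Q]^{-1}\,Q\,[A^T\Sigma^{-1}A + \lambda Q]^{-1}A^T\Sigma^{-1}\phi = 0.
\tag{3.102}
\]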


The solution algorithm hinges on computing the real roots λ of the above equation; there can be more than one such real root. We employ the result of simultaneous diagonalization. Because $A^T\Sigma^{-1}A$ is symmetric positive definite (since $\Sigma = \Sigma^T > 0$ and A has full column rank) and $Q = Q^T$, there exists a nonsingular matrix S such that $A^T\Sigma^{-1}A = S D_\Sigma S^T$ and $Q = S D_Q S^T$, where $D_\Sigma$ and $D_Q$ are both diagonal. By Sylvester's law of inertia, $D_\Sigma$ and $D_Q$ have the same inertia as $A^T\Sigma^{-1}A$ and Q, respectively. It follows that (3.102) is equivalent to
\[
(S^{-1}A^T\Sigma^{-1}\phi)^T (\lambda I + D_\Sigma D_Q^{-1})^{-1} D_Q^{-1} (\lambda I + D_\Sigma D_Q^{-1})^{-1} (S^{-1}A^T\Sigma^{-1}\phi) = 0.
\tag{3.103}
\]


Let $D_Q^{-1} = \mathrm{diag}(q_1, q_2, \ldots, q_l)$, with $l \times l$ the size of Q. Then it has the same number of negative and positive elements as $D = D_\Sigma D_Q^{-1} = \mathrm{diag}(d_1, d_2, \ldots, d_l)$ by the positivity of Σ and $D_\Sigma$. In fact, $q_i d_i > 0$. The matrices S and D can be obtained by the eigenvalue decomposition $A^T\Sigma^{-1}AQ^{-1} = SDS^{-1}$. Let $v_i$ be the i-th element of $S^{-1}A^T\Sigma^{-1}\phi$. Then (3.103) is converted into the following:
\[
(S^{-1}A^T\Sigma^{-1}\phi)^T (\lambda I + D_\Sigma D_Q^{-1})^{-1} D_Q^{-1} (\lambda I + D_\Sigma D_Q^{-1})^{-1} (S^{-1}A^T\Sigma^{-1}\phi)
= \sum_{i=1}^{l} \frac{q_i v_i^2}{(\lambda + d_i)^2} = 0.
\tag{3.104}
\]

We comment that the above has real roots, which can be seen by examining the summation near $\lambda \approx -d_i$ and using the fact that the $\{q_i\}$ take both positive and negative values but not zero (recall the assumption on Q). However, there are only finitely many real λ values satisfying (3.104), which are denoted by $\{\lambda_k\}$. Now by (3.101),

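\[
\begin{aligned}
A\theta - \phi &= \left[ A(A^T\Sigma^{-1}A + \lambda_k Q)^{-1}A^T\Sigma^{-1} - I \right]\phi \\
&= \left[ AQ^{-1}(\lambda_k I + A^T\Sigma^{-1}AQ^{-1})^{-1}A^T\Sigma^{-1} - I \right]\phi \\
&= \left[ (\lambda_k I + AQ^{-1}A^T\Sigma^{-1})^{-1}AQ^{-1}A^T\Sigma^{-1} - I \right]\phi \\
&= -\lambda_k (\lambda_k I + AQ^{-1}A^T\Sigma^{-1})^{-1}\phi \\
&= -\lambda_k \Sigma (\lambda_k \Sigma + AQ^{-1}A^T)^{-1}\phi.
\end{aligned}
\tag{3.105}
\]
Substituting the above into the performance index J in (3.100) leads to
\[
2J = \lambda_k^2\, \phi^T (\lambda_k \Sigma + AQ^{-1}A^T)^{-1} \Sigma (\lambda_k \Sigma + AQ^{-1}A^T)^{-1} \phi.
\tag{3.106}
\]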

Let $\lambda_k^{opt}$ be the value that minimizes J over $\{\lambda_k\}$. Then, in light of (3.101), the optimal θ is obtained as
\[
\theta_{opt} = [A^T\Sigma^{-1}A + \lambda_k^{opt} Q]^{-1}A^T\Sigma^{-1}\phi.
\tag{3.107}
\]

To facilitate the MATLAB programming of the root computation in simulation, we can convert (3.104) to
\[
\sum_{i=1}^{l} q_i v_i^2 \prod_{k \ne i} (\lambda + d_k)^2 = 0.
\tag{3.108}
\]
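
The steps (3.101)-(3.108) can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the MATLAB code used in the simulations: the function name `constrained_ls` is hypothetical, and the simultaneous diagonalization is computed by a Cholesky factorization followed by a symmetric eigendecomposition (a substitution for the eigendecomposition of $A^T\Sigma^{-1}AQ^{-1}$ mentioned above), which normalizes $D_\Sigma = I$ so that $d_i = q_i$.

```python
import numpy as np

def constrained_ls(A, Sigma, phi, Q, tol=1e-8):
    """Minimize 0.5*(A@th-phi)^T Sigma^{-1} (A@th-phi) subject to th^T Q th = 0.

    Returns (J, lambda, theta) for the stationary point with the smallest J,
    or None if no real root of (3.108) is found.
    """
    Si_A = np.linalg.solve(Sigma, A)      # Sigma^{-1} A
    M = A.T @ Si_A                        # A^T Sigma^{-1} A, positive definite
    g = Si_A.T @ phi                      # A^T Sigma^{-1} phi (Sigma symmetric)

    # Simultaneous diagonalization with D_Sigma = I:
    # M = S S^T and Q = S diag(lam_q) S^T, where S = L U.
    L = np.linalg.cholesky(M)
    Li = np.linalg.inv(L)
    lam_q, U = np.linalg.eigh(Li @ Q @ Li.T)
    q = 1.0 / lam_q                       # diagonal of D_Q^{-1}
    d = q.copy()                          # D = D_Sigma D_Q^{-1} = D_Q^{-1} here
    v = U.T @ (Li @ g)                    # S^{-1} A^T Sigma^{-1} phi

    # Polynomial (3.108): sum_i q_i v_i^2 prod_{k != i} (lambda + d_k)^2.
    poly = np.zeros(1)
    for i in range(len(q)):
        term = np.array([q[i] * v[i] ** 2])
        for k in range(len(q)):
            if k != i:
                factor = np.array([1.0, d[k]])
                term = np.polymul(term, np.polymul(factor, factor))
        poly = np.polyadd(poly, term)

    roots = np.roots(poly)
    real_roots = roots[np.abs(roots.imag) <= tol * (1.0 + np.abs(roots))].real

    best = None
    for lam in real_roots:
        try:
            theta = np.linalg.solve(M + lam * Q, g)   # (3.101)
        except np.linalg.LinAlgError:
            continue
        r = A @ theta - phi
        J = 0.5 * r @ np.linalg.solve(Sigma, r)
        if best is None or J < best[0]:
            best = (J, lam, theta)                    # (3.107)
    return best

# Noise-free toy problem (hypothetical data): theta_true lies on the cone.
rng = np.random.default_rng(1)
theta_true = np.array([3.0, 4.0, 5.0])    # p_T = (3, 4), R = 5
Q = np.diag([-1.0, -1.0, 1.0])
A = rng.standard_normal((8, 3))
phi = A @ theta_true
J_opt, lam_opt, theta_opt = constrained_ls(A, np.eye(8), phi, Q)
print(theta_opt)
```

In the noise-free case the unconstrained LS solution already satisfies the constraint, so λ = 0 is a root and the true θ is recovered. With noisy data the selected root is generally nonzero, and one may additionally want to discard candidates with a negative last component, since that component represents a distance; the source does not state this check, so it is left out here.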


The solution algorithm above is developed for location estimation with only TDOA measurements available. If both TDOA and AOA measurements are collected, as discussed in the previous section, the extra redundancy indicates an improved accuracy. According to (3.86), the problem can be formulated as a similar constrained LS optimization problem. Denote
\[
\Sigma = \begin{bmatrix} W_1 & 0 \\ 0 & W_2 \end{bmatrix}; \quad
A = \begin{bmatrix} H_1 & b \\ H_2 & 0 \end{bmatrix}; \quad
\theta = \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix}; \quad
\phi = \begin{bmatrix} a \\ \varphi \end{bmatrix}; \quad
Q = \begin{bmatrix} -I_2 & 0 \\ 0 & 1 \end{bmatrix}.
\tag{3.109}
\]

Then we can use the same Lagrange multiplier method to give a solution.
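
Assembling the stacked quantities in (3.109) is mechanical. A minimal Python sketch is given below; the function name `stack_tdoa_aoa` and the array shapes in the example are assumptions for illustration, and the inputs stand for the TDOA quantities (H1, b, W1, a) and AOA quantities (H2, W2, φ) defined earlier in the chapter.

```python
import numpy as np

def stack_tdoa_aoa(H1, b, W1, a, H2, W2, varphi):
    """Build Sigma, A, phi and Q of (3.109) from the TDOA and AOA blocks."""
    n1, n2 = H1.shape[0], H2.shape[0]
    Sigma = np.block([[W1, np.zeros((n1, n2))],
                      [np.zeros((n2, n1)), W2]])      # block-diagonal weighting
    A = np.block([[H1, b.reshape(-1, 1)],
                  [H2, np.zeros((n2, 1))]])           # AOA rows do not involve R_{1,T}
    phi = np.concatenate([a, varphi])
    Q = np.diag([-1.0, -1.0, 1.0])                    # encodes R_{1,T}^2 = ||p_T||^2
    return Sigma, A, phi, Q

# Shape check with placeholder arrays (hypothetical sizes: 3 TDOA rows, 4 AOA rows).
Sigma, A, phi, Q = stack_tdoa_aoa(
    np.ones((3, 2)), np.ones(3), np.eye(3), np.ones(3),
    np.ones((4, 2)), np.eye(4), np.ones(4))
print(Sigma.shape, A.shape, phi.shape, Q.shape)
```

The resulting arrays can be passed to any solver for the constrained problem (3.99), since the structure of the problem is unchanged.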


3.4     Simulations

In this section, we present a set of simulation results that demonstrate the performance of our proposed estimation algorithm.

   In the simulation, there are nine base stations: BS1 at the origin and the other eight equally spaced around a circle centered at BS1. In a real WiMax system, the base stations may not be located exactly on a circle. This layout is chosen simply for ease of presentation; it is not required by our algorithm, which is applicable to any geographical distribution of any number of base stations. To test the accuracy of our location method, ten positions for the mobile user are chosen, distributed around a smaller circle. This assumption about the MS route is likewise made only for ease of demonstration. The configuration is shown in Figure 3.5.

   Figure 3.5: Base stations and mobile user locations (x and y in meters, axes scaled by 10^4; base stations BS1 to BS9 and mobile positions MS1 to MS10)


   The base stations are at BS1 = [0, 0]^T, BS2 = [32000, 0]^T, BS3 = [22627, 22627]^T, BS4 = [0, 32000]^T, BS5 = [−22627, 22627]^T, BS6 = [−32000, 0]^T, BS7 = [−22627, −22627]^T, BS8 = [0, −32000]^T, BS9 = [22627, −22627]^T, all in meters. For each MS position, 2000 different data sets are run, and the MS location is obtained by averaging over all 2000 estimates.
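
The listed coordinates are consistent with BS1 at the origin and BS2 to BS9 placed at 45-degree steps on a circle of radius 32000 m. A short Python sketch that regenerates them is given below; the function name `base_station_layout` is hypothetical.

```python
import numpy as np

def base_station_layout(radius=32000.0, n_ring=8):
    """BS1 at the origin, followed by n_ring stations equally spaced on a circle."""
    angles = 2.0 * np.pi * np.arange(n_ring) / n_ring
    ring = radius * np.column_stack([np.cos(angles), np.sin(angles)])
    return np.vstack([np.zeros((1, 2)), ring])

bs = base_station_layout()      # rows 0..8 correspond to BS1..BS9
bs_rounded = np.round(bs)       # rounded to whole meters, as in the text
print(bs_rounded)
```

Rounding 32000 cos(45°) ≈ 22627.4 to whole meters gives the value 22627 used for the diagonal stations.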

   In the experiments, our location algorithm is simulated for TDOA data only and for a combination of AOA and TDOA data. In Figure 3.6, the green line is the result from the combination of AOA and TDOA data when SNRtdoa = 20 dB and SNRaoa = 20 dB. It almost merges with the blue line, which represents the real MS positions and is therefore invisible in the figure. This shows the high accuracy of the estimation algorithm proposed in this thesis. With the same SNRtdoa = 20 dB, the cyan line is the estimation result from the TDOA data only. It can be seen that there is a small deviation from the real positions. Intuitively, with the extra information from the AOA measurements, the result in the green line is expected to be closer to the real positions. From the Fisher information matrices calculated in the previous sections, the Cramer-Rao bound for the combined TDOA and AOA data should be smaller than that for TDOA data only.

   Figure 3.6: Location estimation with TDOA-only and AOA+TDOA data (x and y in meters, axes scaled by 10^4; legend: BS, Known, TDOA, AOA+TDOA)

   To take a closer look at the performance of the proposed algorithm, we calculate the approximate mean and standard deviation of the estimation error, i.e., the distance between the real MS position and the estimated position, from a sample of 2000 data points. In Figure 3.7, the average estimation error is less than 4 meters for all ten MS locations when the TDOA data is of high SNR. To study the effect of SNR on the performance of the proposed location algorithm, the position MS2 is selected at random, and the mean and standard deviation of the estimation error are plotted against SNR in Figure 3.8. At low SNR the estimate is not accurate enough, because our assumption about the measurement noise variance is no longer valid.

   FCC regulations require that, for 67% of E911 calls, wireless service providers provide an estimated location with a location error below 100 m. As shown in Figure 3.9, the location error is below 100 m for 98% of the time with SNRtdoa = 40 dB, which is well above the FCC requirement.
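
The error statistics and the 67% check described above can be computed as in the following Python sketch. The function name `location_error_stats`, the MS position, and the 30 m noise level are assumptions for illustration; the synthetic estimates do not reproduce the thesis results.

```python
import numpy as np

def location_error_stats(estimates, true_pos, threshold=100.0):
    """Mean and sample standard deviation of the distance error, and the
    fraction of trials whose error is below the threshold (in meters)."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(true_pos, dtype=float)
    errors = np.linalg.norm(diff, axis=1)
    return errors.mean(), errors.std(ddof=1), np.mean(errors < threshold)

# Synthetic illustration only: 2000 trials around a hypothetical MS position.
rng = np.random.default_rng(0)
true_pos = np.array([4000.0, 2000.0])
estimates = true_pos + 30.0 * rng.standard_normal((2000, 2))

mean_err, std_err, frac_ok = location_error_stats(estimates, true_pos)
meets_fcc = frac_ok >= 0.67     # FCC: error below 100 m for 67% of calls
print(mean_err, std_err, frac_ok, meets_fcc)
```

Plotting the fraction below the threshold against the threshold itself gives a curve of the kind shown in Figure 3.9.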

   Figure 3.7: Location estimation performance (TDOA; mean and standard deviation of the error in meters versus mobile station position 1 to 10)

   Figure 3.8: Effect of SNR on estimation accuracy (TDOA; mean and standard deviation of the error in meters versus SNR in dB, from 20 to 60 dB)

   Figure 3.9: Outage curve for location accuracy (AOA+TDOA; 1 − outage in % versus location error in meters, with the FCC requirement marked)

   The above figures demonstrate that the proposed algorithm can provide an accurate estimate of the MS location. It also meets the FCC requirement for outdoor network-based wireless location.


3.5     Chapter Summary

In this chapter, WiMax networks, the evolution of the IEEE standard, and its applications are introduced, and the outdoor/indoor wireless location technologies based on measurements of TOAs, TDOAs, AOAs and amplitudes are reviewed.

   With measurements of TDOA and AOA available, we present a constrained LS-type algorithm to estimate the target location. The proposed method differs from the commonly used ML algorithm, though the latter is heavily preferred in some applications for its superior performance. Because of the large number of observation data and the additive measurement noise, maximizing the likelihood function involves a heavy computational load, and it does not even guarantee that the optimal estimate is obtained, because of the existence of local minima. Under the assumption of zero-mean additive Gaussian noise with a very small variance, the location estimation problem is formulated into a quasi-linear form, which is solvable by the LS algorithm. The assumption is usually valid, as in [54]. Therefore, our method retains the preferable properties of the ML algorithm in the sense that it approaches the Cramer-Rao bound with a large sample of observation data. More importantly, the computational complexity is reduced by the LS algorithm. As shown in this chapter, the LS algorithm also involves the constraint $\|\hat{\theta}\| = R_{1,T}$. The target location can only be obtained by substituting the intermediate LS solution into the constraint and solving the resultant quadratic equation, which brings complexity back to the solution. Hence the Lagrange multiplier is explored to solve the above constrained LS optimization problem. The simulation results show that our scheme is effective in location estimation.
Chapter 4

Conclusions

This dissertation, in the first part, addresses the problem of channel estimation for MIMO-OFDM systems. It starts from the matrix representation of the signal model of MIMO-OFDM systems, which clearly describes the relation of signals in the frequency domain and time domain and expresses operations like adding and removing the CP as matrix products. From the resulting MIMO-OFDM signal model, a pilot-tone based channel estimation is proposed to estimate the fast time-varying and frequency-selective fading channel via the least-squares method. Least-squares is selected for its low complexity, though other methods such as MMSE and ML may produce better estimation performance. To further reduce the computational complexity, the pilot tone matrix is designed as a unitary matrix, which saves the matrix inversion in the standard LS solution. The pilot tone matrix is designed in a simple way: Nt disjoint pilot tone sets are placed in one OFDM block on each transmit antenna. Each pilot tone set has L pilot tones which are equally spaced and equally powered. With the pilot tones chosen according to our design, they comprise a unitary matrix. For a simple 2 × 2 case, Alamouti’s orthogonal structure is exploited, and the design can be readily extended to a configurable MIMO-OFDM system with any number of transmit and receive antennas. For a fixed power of pilot tones, our design can also be proved optimal in the sense of achieving the minimum MSE of channel estimation. Compared with related pilot tone designs in the literature, our channel estimation method differs in its ability to estimate fast time-varying wireless channels, since pilot tones are inserted into each OFDM block, and in its explicit relation with space-frequency code design, which can benefit the channel estimation in return. In seeking a robust channel estimator with lower complexity for MIMO-OFDM systems, we will look at the following aspects in the future.


   • Less overhead loss: It is worth noting that the use of pilot symbols for channel estimation decreases the spectrum efficiency of the wireless communication system. There is a trade-off between data throughput and estimation accuracy. It is of interest to investigate a scheme with even fewer pilot tones in each OFDM block by exploiting some statistical properties of the wireless channel itself. Intuitively, the best balance between overhead loss and estimation reliability is achieved if we can adaptively change the number of pilot tones depending on the channel condition through some feedback information.


   • Joint channel estimation and CFO correction: Usually when we design the channel estimator, we assume that the OFDM system is perfectly synchronized and there is no carrier frequency offset at all. Likewise, some CFO compensation algorithms assume that the channel is known at the receiver. It would be beneficial to combine channel estimation and CFO compensation into an integrated algorithm, since the performance of either of the two individual algorithms can be degraded when its assumption fails in real-world OFDM systems. There is already some research work in this area [34, 35], but more intensive investigation is still needed.


   We also still have to consider the data rate loss caused by the pilot-tone overhead within each OFDM block. We are currently working on this issue, with the goal of using a pilot-tone sequence shorter than the channel length by exploiting its diversity in the time domain.

   In the second part of this dissertation, wireless location on WiMax networks is studied. Similar to the location technology applied to cellular networks, the application scenario of locating the mobile user by using signal parameters received at the antenna towers is considered. Location estimation methods based on TDOA, AOA and a combination of TDOA and AOA are presented. Under the assumption that the measurement noise is zero-mean additive Gaussian noise with very small variance, the location estimation problem is formulated into a quasi-linear form. The simple LS algorithm can then be used to solve the estimation problem, provided that the noise term in the quasi-linear form is Gaussian. In theory, the ML algorithm can be directly utilized to estimate the target location, since the probability density function of the observation data is known under our assumption. However, direct use of the ML algorithm proves infeasible because of the difficulty of finding the real roots of a quadratic equation. An alternative to the ML algorithm is required, which should drastically reduce the complexity of the ML algorithm while providing close performance. Our proposed method is such an alternative; it is essentially a constrained LS-type optimization technique. The approximation of the noise term in the quasi-linear form by a Gaussian random variable is also proved in this thesis under the assumption above. Hence it is concluded that the proposed method can estimate the target location very accurately, provided that the size of the observation data is large enough and the equivalent SNR is high. To solve the constrained LS-type optimization problem, the Lagrange multiplier method is used, because direct use of the constraint condition may lead to the same level of complexity for the algorithm, and positive real roots may not even exist for the quadratic equation obtained by substituting the intermediate LS solution into the constraint. Finally, extensive simulation studies have demonstrated the effectiveness of our proposed algorithm.

   For future work on the wireless location problem, the following aspects are open for research.

   • Large variance: The approximation of the ML algorithm by the constrained LS-type optimization depends on the assumption that the measurement noise variance is very small, which is usually true. Further research on the case of measurement noise with relatively large variance will improve the robustness of the proposed algorithm.

   • Velocity estimation: In the thesis, the target is considered stationary by assuming it is moving at a low speed. If the FDOA (frequency difference of arrival) of the received signal is available, then the velocity of the target can be estimated too. This will extend the range of applications of the proposed algorithm.
Bibliography

 [1] Richard Van Nee and Ramjee Prasad, OFDM For Wireless Multimedia Commu-
     nications, Artech House Publishers, Norwood MA, 2000.
 [2] R. W. Chang, “Synthesis of band-limited orthogonal signals for multichannel
     data,” BSTJ., pp. 1775-1797, Dec. 1966.
 [3] B. R. Saltzberg, “Performance of an efficient parallel data transmission system,”
     IEEE Trans. on Comm. Tech., pp. 805-811, Dec. 1967.
 [4] S. B. Weinstein and P. M. Ebert, “Data transmission by frequency-division multi-
     plexing using the discrete Fourier transform,” IEEE Trans. on Commun., COM-
     19(5), pp. 628-634, Oct. 1971.
 [5] L.J. Cimini, Jr., “Analysis and simulation of a digital mobile channel using or-
     thogonal frequency division multiplexing,” IEEE Trans. on Communications.,
     vol. 33, pp. 665-675, July 1985.
 [6] A. Peled and A. Ruiz, “Frequency domain data transmission using reduced com-
     putational complexity algorithms,” In Proc. IEEE ICASSP, pp. 964-967, Denver,
     CO, 1980.
 [7] A. Vahlin and N. Holte, “Optimal finite duration pulses for OFDM,” IEEE Trans.
     Commun., 44(1), pp. 10-14, Jan. 1996.
 [8] B. Le Floch, M. Alard and C. Berrou, “Coded orthogonal frequency-division
     multiplexing,” Proc. IEEE, 83(6), pp. 982-996, Jun. 1995.
 [9] T. Pollet, M. Van Bladel and M. Moeneclaey, “BER sensitivity of OFDM systems
     to carrier frequency offset and Wiener phase noise,” IEEE Trans. on Comm., Vol.
     43, No. 2/3/4, pp. 191-193, Feb.-Apr., 1995.
[10] P. H. Moose, “A technique for orthogonal frequency division multiplexing fre-
     quency offset correction,” IEEE Trans. on Comm., Vol. 42, No. 10, pp. 2908-
     2914, Oct., 1994.
[11] T. M. Schmidl and D. C. Cox, “Robust frequency and timing synchronization
     for OFDM,” IEEE Trans. on Comm., Vol. 45, No. 12, pp. 1613-1621, Dec., 1997.



[12] R. D. J. van Nee, “OFDM codes for peak-to-average power reduction and er-
     ror correction,” IEEE Global Telecommunications Conference, London, pp. 740-
     744, Nov., 1996.

[13] J. A. Davis and J. Jedwab, “Peak-to-average power control in OFDM, Golay
     complementary sequences and Reed-Muller codes,” HP Laboratories Technical
     Report, HPL-97-158, Dec., 1997.

[14] A. Tarighat and A. H. Sayed, “MIMO OFDM receivers for systems with IQ
     imbalances,” IEEE Transactions on Signal Processing, vol. 53, no. 9, pp. 3583-
     3596, Sep. 2005.

[15] A. Tarighat, R. Bagheri, and A. H. Sayed, “Compensation schemes and perfor-
     mance analysis of IQ imbalances in OFDM receivers,” IEEE Transactions on
     Signal Processing, vol. 53, no. 8, pp. 3257-3268, Aug. 2005.

[16] S. Alamouti, “A simple transmit diversity technique for wireless communica-
     tions,” IEEE J. Select. Areas Communication, vol. 16, pp. 1451-1458, Oct., 1998.

[17] G. J. Foschini, “Layered space-time architecture for wireless communication in
     a fading environment when using multi-element antennas,” Bell Labs. Tech. J.,
     pp. 41-59, Autumn, 1996.

[18] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from
     orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45, pp. 1456-1467, July
     1999.

[19] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data
     rate wireless communications: Performance criterion and code construction,”
     IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, March 1998.

[20] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna
     communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol.
     45, pp. 139-157, Jan. 1999.

[21] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading
     environment when using multiple antennas,” Wireless Pers. Commun., vol. 6,
     no. 3, pp. 311-335, Mar. 1998.

[22] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Euro. Trans. Tele-
     commun., vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.

[23] A. Wittneben, “A new bandwidth efficient transmit antenna modulation diver-
     sity scheme for linear digital modulation,” Proc. ICC, pp. 1630-1634, 1993.

[24] Jan Mietzner and Peter A. Hoeher, “Boosting the performance of wireless com-
     munication systems: theory and practice of multiple-antenna techniques,” IEEE
     Communications Magazine, no. 10, pp. 40-47, Oct. 2004.

[25] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna
     communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol.
     45, no. 1, pp. 139-157, 1999.

[26] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold: A
     geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans.
     Inform. Theory, vol. 48, no. 2, pp. 359-383, Feb. 2002.

[27] I. Barhumi, G. Leus and M. Moonen, “Optimal training design for MIMO OFDM
     systems in mobile wireless channels,” IEEE Trans. Signal Processing, vol. 51, No.
     6, pp. 1615-1624, Jun. 2003.

[28] Allert van Zelst and Tim C. W. Schenk, “Implementation of a MIMO OFDM-
     based Wireless LAN system,” IEEE Trans. Signal Processing, vol. 52, No. 2, pp.
     483-494, Feb. 2004.

[29] X. Li, H. Huang, G. J. Foschini and R. A. Valenzuela, “Effects of iterative detec-
     tion and decoding on the performance of BLAST,” IEEE Proc. Global Telecom-
     mun. Conf., vol. 2, No. 2, pp. 1061-1066, 2000.

[30] A. Salvekar, S. Sandhu, Q. Li, M. Vuong and X. Qian, “Multiple-Antenna Tech-
     nology in WiMax Systems,” Intel Technology Journal, vol. 8, No. 3, [online]:
     http://www.intel.com/technology/itj/2004/volume08issue03, Aug. 2004.

[31] Hongwei Yang, “A road to future broadband wireless access: MIMO-OFDM-
     Based air interface,” IEEE Communications Magazine, Vol. 43, No. 1, pp. 53-60,
     Jan. 2005.

[32] H. Bölcskei, M. Borgmann and A. J. Paulraj, “Impact of the propagation envi-
     ronments on the performance of space-frequency coded MIMO-OFDM,” IEEE
     J. Select. Areas Commun., vol. 21, No. 3, pp. 427-439, Apr. 2003.

[33] H. Bölcskei and A. J. Paulraj, “Space-frequency coded broadband OFDM sys-
     tems,” Proc. IEEE WCNC, pp. 1-6, Chicago, IL, Sep. 2000.

[34] X. Ma, H. Kobayashi and S. C. Schwartz, “Joint frequency offset and channel
     estimation for OFDM,” Proc. of Global Telecommun. Conf., pp. 15-19, Dec.
     2003.

[35] P. Stoica and O. Besson, “Training sequence design for frequency offset and
     frequency-selective channel estimation,” IEEE Trans. on Commun., vol. 51, No.
     11, pp. 1910-1917, Nov. 2003.

[36] Nima Khajehnouri and Ali H. Sayed, “Adaptive angle of arrival estimation for
     multiuser wireless location systems,” Fifth IEEE Workshop on Signal Processing
     Advances in Wireless Communications, Lisboa, Portugal, July 11-14, 2004.

[37] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer
     (PHY) Specifications - Amendment 1: High-speed Physical Layer in the 5 GHz
     Band, IEEE Standard 802.11a-1999.

[38] M. Brookes, “Matrix Reference Manual [online],” available:
     http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/.

[39] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer
     (PHY) Specifications - Amendment 1: High-speed Physical Layer in the 5 GHz
     Band, IEEE Standard 802.11a-1999.

[40] Part 16: Air Interface for Fixed Broadband Wireless Access Systems—
     Amendment 2: Medium Access Control Modifications and Additional Physical
     Layer Specifications for 2-11 GHz, IEEE Standard 802.16a-2003.

[41] Digital broadcasting systems for television, sound and data services. European
     Telecommunications Standard, prETS 300 744 (Draft, version 0.0.3), Apr. 1996.

[42] H. Sampath, S. Talwar, J. Tellado, V. Erceg and A. Paulraj, “A fourth-generation
     MIMO-OFDM broadband wireless system: design, performance and field trial
     results,” IEEE Communications Magazine, No. 9, pp. 143-149, Sep., 2002.

[43] Justin Chuang and Nelson Sollenberger, “Beyond 3G: Wideband wireless data
     access based on OFDM and dynamic packet assignment,” IEEE Communications
     Magazine, No. 7, pp. 78-87, Jul., 2000.

[44] Z. Liu, G. Giannakis, S. Barbarossa, and A. Scaglione, “Transmit-antennae space-
     time block coding for generalized OFDM in the presence of unknown multipath,”
     IEEE J. Select. Areas Commun., vol. 19, no. 7, pp. 1352-1364, Jul. 2001.

[45] S. Yatawatta and A. P. Petropulu, “Blind channel estimation in MIMO OFDM
     systems,” IEEE Trans. Signal Processing, submitted,
     http://www.ece.drexel.edu/CSPL/publications/ssp03sa-rod.pdf

[46] H. Bölcskei, R. W. Heath Jr. and A. Paulraj, “Blind channel identification and
     equalization in OFDM-based multiantenna systems,” IEEE Trans. Signal Pro-
     cessing, vol. 50, No. 1, pp. 96-109, Jan. 2002.

[47] Y. Li, N. Seshadri and S. Ariyavisitakul, “Channel estimation for OFDM systems
     with transmitter diversity in mobile wireless channels,” IEEE J. Select. Areas
     Commun., vol. 17, pp. 461-471, March 1999.

[48] Y. Li, “Simplified channel estimation for OFDM systems with multiple transmit
     antennas,” IEEE Trans. Wireless Communications, vol. 1, No. 1, pp. 67-75, Jan.
     2002.

[49] R. Negi and J. Cioffi, “Pilot tone selection for channel estimation in a mobile
     OFDM system,” IEEE Trans. Consumer Electronics, vol. 44, No. 3, pp. 1122-1128,
     August 1998.

[50] G. L. Stüber, J. R. Barry, S. W. McLaughlin, Y. Li, M. A. Ingram and T. G.
     Pratt, “Broadband MIMO-OFDM wireless communications,” Proceedings of the
     IEEE, vol. 92, No. 2, pp. 271-294, Feb. 2004.

[51] W. C. Jakes, Microwave Mobile Communications, John Wiley and Sons, New
     York, 1974.

[52] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” in
     Proc. RADC, Spectral Estimation Workshop, Rome, NY, pp. 243-258.

[53] A. H. Sayed, A. Tarighat, and N. Khajehnouri, “Network-based wireless loca-
     tion,” IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 24-40, July 2005.

[54] K. C. Ho and Wenwei Xu, “An accurate algebraic solution for moving source lo-
     cation using TDOA and FDOA measurements,” IEEE Trans. Signal Processing,
     vol. 52, no. 9, pp. 2453-2463, Sep. 2004.

[55] “Wireless location technologies and service [online],” available:
     http://www.3gamericas.org/English/

[56] PELORUS Group. Report on wireless location-based markets. Technical Report,
     2001

[57] In-Stat/MDR. Location-based services: Finding their place in the market.
     Technical Report, Feb. 2003

[58] A. H. Sayed and N. R. Yousef, Wireless location. Wiley Encyclopedia of Telecom-
     munications, J. Proakis, editor, John Wiley & Sons, NY, 2003

[59] FCC Docket No. 94-102. Revision of the Commission’s rules to ensure compatibility
     with enhanced 911 emergency calling systems. Technical Report RM-8143, July
     1996.

[60] State of New Jersey. Report on the New Jersey wireless enhanced 911 terms:
     The first 100 days. Technical Report, Jun. 1997

[61] M. Yunos, J. Zeyu Gao and S. Shim, Wireless advertising’s challenges and op-
     portunities. IEEE Computer Magazine, vol. 36, No. 5, pp. 30-37, May, 2003

[62] Telecommunications Industry Association. The CDMA2000 ITU-R RTT Candi-
     date Submission V0.18, Jul. 1998.

[63] J. J. Caffery and G. L. Stuber, “Overview of radiolocation in CDMA cellular
     systems,” IEEE Communications Magazine, vol. 36, No. 4, pp. 38-45, Apr. 1998.

[64] H. Krim and M. Viberg, “Two decades of array signal processing research: The
     parametric approach,” IEEE Signal Processing Magazine, vol. 13, No. 4, pp.
     67-94, Jul. 1996.

[65] T. Ojanpera and R. Prasad, Wideband CDMA for third generation mobile com-
     munications. Artech House, Boston, MA 1998.

[66] R. Prasad, W. Mohr and W. Konhauser, Third generation mobile communica-
     tions. Artech House, Boston, MA 2000.

[67] P. Bahl and V. N. Padmanabhan, “RADAR: an in-building RF-based user location
     and tracking system,” Proc. IEEE Conference INFOCOM, Vol. 2, pp. 775-784,
     Tel Aviv, March 2000.

[68] T. Roos, P. Myllymaki and H. Tirri, “A statistical modeling approach to location
     estimation,” IEEE Trans. On Mobile Computing, Vol. 1, No. 1, pp. 59-69, Jan.
     2002.

[69] M. Youssef, A. Agrawala and A. U. Shankar, “WLAN location determination via
     clustering and probability distributions,” Proc. IEEE Conference PerCom, pp.
     143-150, March 2003.

[70] G. H. Golub and C. F. Van Loan, “Matrix Computations”, 2nd Edition, Balti-
     more: The Johns Hopkins University Press, 1989.

[71] John G. Proakis, “Digital Communications”, 4th Edition, Prentice Hall, New
     Jersey, 2000

[72] Jerry M. Mendel, “Lessons in estimation theory for signal processing, commu-
     nications and control,” 2nd Edition, Prentice Hall PTR, Englewood Cliffs, New
     Jersey, March 1995.

[73] Athanasios Papoulis and S. Unnikrishna Pillai, “Probability, Random Variables
     and Stochastic Processes,” 4th Edition, McGraw-Hill, Dec. 2001.

[74] P. Stoica, and R. Moses, “Introduction to Spectral Analysis.” Upper Saddle
     River, NJ: Prentice Hall, 1997.
Vita

Zhongshan Wu was born in Anhui, China, on December 4, 1974. He received his

bachelor of science degree in electrical engineering from Northeastern University in

July 1996. In spring 2000, he entered the graduate program in the Department of

Electrical and Computer Engineering at Louisiana State University. He got his master

of science degree in electrical engineering in December 2001. Now he is a candidate

for the degree of doctor of philosophy in electrical engineering.




Wu dis

  • 1. MIMO-OFDM COMMUNICATION SYSTEMS: CHANNEL ESTIMATION AND WIRELESS LOCATION A Dissertation Submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College in partial fulfillment of the requirements for the degree of Doctor of Philosophy in The Department of Electrical and Computer Engineering by Zhongshan Wu B.S., Northeastern University, China, 1996 M.S., Louisiana State University, US, 2001 May 2006
  • 3. Acknowledgments Throughout my six years at LSU, I have many people to thank for helping to make my experience here both enriching and rewarding. First and foremost, I wish to thank my advisor and committee chair, Dr. Guoxiang Gu. I am grateful to Dr. Gu for his offering me such an invaluable chance to study here, for his being a constant source of research ideas, insightful discussions and inspiring words in times of needs and for his unique attitude of being strict with academic research which will shape my career forever. My heartful appreciation also goes to Dr. Kemin Zhou whose breadth of knowledge and perspectiveness have instilled in me great interest in bridging theoretical research and practical implementation. I would like to thank Dr. Shuangqing Wei for his fresh talks in his seminar and his generous sharing research resource with us. I am deeply indebted to Dr. John M. Tyler for his taking his time to serve as my graduate committee member and his sincere encouragement. For providing me with the mathematical knowledge and skills imperative to the work in this dissertation, I would like to thank my minor professor, Dr. Peter Wolenski for his precious time. For all my EE friends, Jianqiang He, Bin Fu, Nike Liu, Xiaobo Li, Rachinayani iii
  • 4. Kumar Phalguna and Shuguang Hao, I cherish all the wonderful time we have to- gether. Through it all, I owe the greatest debt to my parents and my sisters. Especially my father, he will be living in my memory for endless time. Zhongshan Wu October, 2005 iv
  • 5. Contents Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Figures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Notation and Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix List of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi 1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1.1 OFDM System Model . . . . . . . . . . . . . . . . . . . . . . 4 1.2 Dissertation Contributions . . . . . . . . . . . . . . . . . . . . . . . . 24 1.3 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . 27 2 MIMO-OFDM Channel Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.2 System Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.2.1 Signal Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.2.2 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . . 40 2.3 Channel Estimation and Pilot-tone Design . . . . . . . . . . . . . . . 46 2.3.1 LS Channel Estimation . . . . . . . . . . . . . . . . . . . . . . 46 2.3.2 Pilot-tone Design . . . . . . . . . . . . . . . . . . . . . . . . . 48 2.3.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . 53 2.4 An Illustrative Example and Concluding Remarks . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . 54 2.4.1 Comparison With Known Result . . . . . . . . . . . . . . . . 54 2.4.2 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . 59 v
  • 6. 3 Wireless Location for OFDM-based Systems . . . . . . . . . . . . . . . . . . . . . . 62 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.1.1 Overview of WiMax . . . . . . . . . . . . . . . . . . . . . . . 62 3.1.2 Overview to Wireless Location System . . . . . . . . . . . . . 65 3.1.3 Review of Data Fusion Methods . . . . . . . . . . . . . . . . . 70 3.2 Least-square Location based on TDOA/AOA Estimates . . . . . . . . 78 3.2.1 Mathematical Preparations . . . . . . . . . . . . . . . . . . . 78 3.2.2 Location based on TDOA . . . . . . . . . . . . . . . . . . . . 83 3.2.3 Location based on AOA . . . . . . . . . . . . . . . . . . . . . 94 3.2.4 Location based on both TDOA and AOA . . . . . . . . . . . . 100 3.3 Constrained Least-square Optimization . . . . . . . . . . . . . . . . . 105 3.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 3.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 vi
  • 7. List of Figures 1.1 Comparison between conventional FDM and OFDM . . . . . . . . . . 7 1.2 Graphical interpretation of OFDM concept . . . . . . . . . . . . . . . 9 1.3 Spectra of (a) an OFDM subchannel (b) an OFDM symbol . . . . . . 10 1.4 Preliminary concept of DFT . . . . . . . . . . . . . . . . . . . . . . . 11 1.5 Block diagram of a baseband OFDM transceiver . . . . . . . . . . . . 13 1.6 (a) Concept of CP; (b) OFDM symbol with cyclic extension . . . . . 16 2.1 Nt × Nr MIMO-OFDM System model . . . . . . . . . . . . . . . . . 34 2.2 The concept of pilot-based channel estimation . . . . . . . . . . . . . 43 2.3 Pilot placement with Nt = Nr = 2 . . . . . . . . . . . . . . . . . . . . 52 2.4 Symbol error rate versus SNR with Doppler shift=5 Hz . . . . . . . . 56 2.5 Symbol error rate versus SNR with Doppler shift=40 Hz . . . . . . . 57 2.6 Symbol error rate versus SNR with Doppler shift=200 Hz . . . . . . . 57 2.7 Normalized MSE of channel estimation based on optimal pilot-tone design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 2.8 Normalized MSE of channel estimation based on preamble design . . 58 3.1 Network-based wireless location technology (outdoor environments) . 67 vii
  • 8. 3.2 TOA/TDOA data fusion using three BSs . . . . . . . . . . . . . . . . 70 3.3 AOA data fusion with two BSs . . . . . . . . . . . . . . . . . . . . . 74 3.4 Magnitude-based data fusion in WLAN networks . . . . . . . . . . . 77 3.5 Base stations and mobile user locations . . . . . . . . . . . . . . . . . 110 3.6 Location estimation with TDOA-only and AOA+TDOA data . . . . 112 3.7 Location estimation performance . . . . . . . . . . . . . . . . . . . . 113 3.8 Effect of SNR on estimation accuracy . . . . . . . . . . . . . . . . . . 113 3.9 Outrage curve for location accuracy . . . . . . . . . . . . . . . . . . . 114 viii
  • 9. Notation and Symbols AM×N : M-row N-column matrix A−1 : Inverse of A Tr(A): Trace of A, Tr(A) = i Aii AT : Transpose of A A∗ : Complex conjugate transpose of A IN : Identity matrix of size N × N ix
  • 10. List of Acronyms MIMO multiple input and multiple outut OFDM orthogonal frequency division multiplexing LS least square MS mobile station TDOA time difference of arrival AOA angle of arrival WiMax worldwide interoperability for microwave access ML maximum-likelihood AWGN additive white Gaussian noise WMAN wireless metropolitan area network ICI inter-carrier interference ISI inter-symbol interference FFT fast Fourier transform WLAN wireless local area network CP cyclic prefix BER bit error rate MMSE minimum mean squared error GPS global positioning system WiFi wireless fidelity x
  • 11. Abstract In this new information age, high data rate and strong reliability features our wire- less communication systems and is becoming the dominant factor for a successful deployment of commercial networks. MIMO-OFDM (multiple input multiple output- orthogonal frequency division multiplexing), a new wireless broadband technology, has gained great popularity for its capability of high rate transmission and its robust- ness against multi-path fading and other channel impairments. A major challenge to MIMO-OFDM systems is how to obtain the channel state in- formation accurately and promptly for coherent detection of information symbols and channel synchronization. In the first part, this dissertation formulates the channel estimation problem for MIMO-OFDM systems and proposes a pilot-tone based esti- mation algorithm. A complex equivalent baseband MIMO-OFDM signal model is pre- sented by matrix representation. By choosing L equally-spaced and equally-powered pilot tones from N sub-carriers in one OFDM symbol, a down-sampled version of the original signal model is obtained. Furthermore, this signal model is transformed into a linear form solvable for the LS (least-square) estimation algorithm. Based on the resultant model, a simple pilot-tone design is proposed in the form of a unitary xi
  • 12. matrix, whose rows stand for different pilot-tone sets in the frequency domain and whose columns represent distinct transmit antennas in the spatial domain. From the analysis and synthesis of the pilot-tone design in this dissertation, our estimation algorithm can reduce the computational complexity inherited in MIMO systems by the fact that the pilot-tone matrix is essentially a unitary matrix, and is proven an optimal channel estimator in the sense of achieving the minimum MSE (mean squared error) of channel estimation for a fixed power of pilot tones. In the second part, this dissertation addresses the wireless location problem in WiMax (worldwide interoperability for microwave access) networks, which is mainly based on the MIMO-OFDM technology. From the measurement data of TDOA (time difference of arrival), AOA (angle of arrival) or a combination of those two, a quasi- linear form is formulated for an LS-type solution. It is assumed that the observation data is corrupted by a zero-mean AWGN (additive white Gaussian noise) with a very small variance. Under this assumption, the noise term in the quasi-liner form is proved to hold a normal distribution approximately. Hence the ML (maximum-likelihood) estimation and the LS-type solution are equivalent. But the ML estimation technique is not feasible here due to its computational complexity and the possible nonexistence of the optimal solution. Our proposed method is capable of estimating the MS loca- tion very accurately with a much less amount of computations. A final result of the MS (mobile station) location estimation, however, cannot be obtained directly from the LS-type solution without bringing in another independent constraint. To solve xii
  • 13. this problem, the Lagrange multiplier is explored to find the optimal solution to the constrained LS-type optimization problem. xiii
  • 14. Chapter 1 Introduction Wireless technologies have evolved remarkably since Guglielmo Marconi first demon- strated radio’s ability to provide continuous contact with ships sailing in the English channel in 1897. New theories and applications of wireless technologies have been developed by hundreds and thousands of scientists and engineers through the world ever since. Wireless communications can be regarded as the most important devel- opment that has an extremely wide range of applications from TV remote control and cordless phones to cellular phones and satellite-based TV systems. It changed people’s life style in every aspect. Especially during the last decade, the mobile radio communications industry has grown by an exponentially increasing rate, fueled by the digital and RF (radio frequency) circuits design, fabrication and integration tech- niques and more computing power in chips. This trend will continue with an even greater pace in the near future. The advances and developments in the technique field have partially helped to realize our dreams on fast and reliable communicating “any time any where”. But we 1
  • 15. 2 are expecting to have more experience in this wireless world such as wireless Internet surfing and interactive multimedia messaging so on. One natural question is: how can we put high-rate data streams over radio links to satisfy our needs? New wireless broadband access techniques are anticipated to answer this question. For example, the coming 3G (third generation) cellular technology can provide us with up to 2Mbps (bits per second) data service. But that still does not meet the data rate required by multimedia media communications like HDTV (high-definition television) and video conference. Recently MIMO-OFDM systems have gained considerable attentions from the leading industry companies and the active academic community [28, 30, 42, 50]. A collection of problems including channel measurements and modeling, channel es- timation, synchronization, IQ (in phase-quadrature)imbalance and PAPR (peak-to- average power ratio) have been widely studied by researchers [48, 11, 14, 15, 13]. Clearly all the performance improvement and capacity increase are based on accurate channel state information. Channel estimation plays a significant role for MIMO- OFDM systems. For this reason, it is the first part of my dissertation to work on channel estimation of MIMO-OFDM systems. The maturing of MIMO-OFDM technology will lead it to a much wider variety of applications. WMAN (wireless metropolitan area network) has adopted this technol- ogy. Similar to current network-based wireless location technique [53], we consider the wireless location problem on the WiMax network, which is based on MIMO-OFDM technology. The work in this area contributes to the second part of my dissertation.
  • 16. 3 1.1 Overview OFDM [5] is becoming a very popular multi-carrier modulation technique for trans- mission of signals over wireless channels. It converts a frequency-selective fading channel into a collection of parallel flat fading subchannels, which greatly simpli- fies the structure of the receiver. The time domain waveform of the subcarriers are orthogonal (subchannel and subcarrier will be used interchangeably hereinafter), yet the signal spectral corresponding to different subcarriers overlap in frequency domain. Hence, the available bandwidth is utilized very efficiently in OFDM systems without causing the ICI (inter-carrier interference). By combining multiple low-data-rate sub- carriers, OFDM systems can provide a composite high-data-rate with a long symbol duration. That helps to eliminate the ISI (inter-symbol interference), which often occurs along with signals of a short symbol duration in a multipath channel. Simply speaking, we can list its pros and cons as follows [31]. Advantage of OFDM systems are: • High spectral efficiency; • Simple implementation by FFT (fast Fourier transform); • Low receiver complexity; • Robustability for high-data-rate transmission over multipath fading channel • High flexibility in terms of link adaptation;
  • 17.
    • Low complexity multiple access schemes such as orthogonal frequency division multiple access.
Disadvantages of OFDM systems are:
    • Sensitivity to frequency offsets, timing errors and phase noise;
    • Relatively higher peak-to-average power ratio compared to single-carrier systems, which tends to reduce the power efficiency of the RF amplifier.
1.1.1 OFDM System Model
OFDM technology is widely used in two types of working environments, i.e., a wired environment and a wireless environment. When used to transmit signals through wires like twisted wire pairs and coaxial cables, it is usually called DMT (digital multi-tone). For instance, DMT is the core technology for all the xDSL (digital subscriber line) systems, which provide high-speed data service via existing telephone networks. However, in a wireless environment such as a radio broadcasting system or a WLAN (wireless local area network), it is referred to as OFDM. Since we aim at performance enhancement for wireless communication systems, we use the term OFDM throughout this thesis. Furthermore, we only use the term MIMO-OFDM when explicitly addressing OFDM systems combined with multiple antennas at both ends of a wireless link. The history of OFDM dates all the way back to the mid-1960s, when Chang [2] published a paper on the synthesis of bandlimited orthogonal signals for multichannel
  • 18. data transmission. He presented a new principle of transmitting signals simultaneously over a bandlimited channel without ICI and ISI. Right after the publication of Chang’s paper, Saltzberg [3] demonstrated the performance of efficient parallel data transmission systems in 1967, where he concluded that “the strategy of designing an efficient parallel system should concentrate on reducing crosstalk between adjacent channels than on perfecting the individual channels themselves”. His conclusion has proven far-sighted today in digital baseband signal processing to combat the ICI. Through the development of OFDM technology, there have been two remarkable contributions which transformed the original “analog” multicarrier system into today’s digitally implemented OFDM. The use of the DFT (discrete Fourier transform) to perform baseband modulation and demodulation was the first milestone, published by Weinstein and Ebert [4] in 1971. Their method eliminated the banks of subcarrier oscillators and coherent demodulators required by frequency-division multiplexing and hence reduced the cost of OFDM systems. Moreover, DFT-based frequency-division multiplexing can be completely implemented in digital baseband, not by bandpass filtering, for highly efficient processing. The FFT, a fast algorithm for computing the DFT, further reduces the number of arithmetic operations from $N^2$ to $N \log N$ ($N$ is the FFT size). Recent advances in VLSI (very large scale integration) technology have made high-speed, large-size FFT chips commercially available. In their paper [4], Weinstein and Ebert used a guard interval between consecutive symbols and the
  • 19. raised-cosine windowing in the time domain to combat the ISI and the ICI. But their system could not keep perfect orthogonality between subcarriers over a time-dispersive channel. This problem was first tackled by Peled and Ruiz [6] in 1980 with the introduction of the CP (cyclic prefix) or cyclic extension. They creatively filled the empty guard interval with a cyclic extension of the OFDM symbol. If the CP is longer than the impulse response of the channel, the ISI can be eliminated completely. Furthermore, the CP makes the channel effectively perform a cyclic convolution, which implies orthogonality between subcarriers over a time-dispersive channel. Though this introduces an energy loss proportional to the length of the CP when the CP part of the received signal is removed, the zero ICI generally compensates for the loss. This is the second major contribution to OFDM systems. With OFDM systems finding more applications, the requirements for better performance are becoming higher. Hence more research effort is being poured into the investigation of OFDM systems. Pulse shaping [7, 8], from an interference point of view, is beneficial for OFDM systems since the spectrum of an OFDM signal can be shaped to be better localized in frequency. Synchronization [9, 10, 11] in the time domain and in the frequency domain renders OFDM systems robust against timing errors, phase noise, sampling frequency errors and carrier frequency offsets. For coherent detection, channel estimation [46, 49, 48] provides accurate channel state information to enhance the performance of OFDM systems. Various effective techniques, such as clipping and peak windowing, are exploited to reduce the relatively high PAPR [12, 13].
  • 20. The principle of OFDM is to divide a single high-data-rate stream into a number of lower-rate streams that are transmitted simultaneously over some narrower subchannels. Hence it is not only a modulation (frequency modulation) technique, but also a multiplexing (frequency-division multiplexing) technique. Before we mathematically describe the transmitter-channel-receiver structure of OFDM systems, a couple of graphical intuitions will make it much easier to understand how OFDM works. OFDM starts with the “O”, i.e., orthogonal. That orthogonality distinguishes OFDM from conventional FDM (frequency-division multiplexing) and is the source of all the advantages of OFDM. The difference between OFDM and conventional FDM is illustrated in Figure 1.1.
[Figure 1.1: Comparison between conventional FDM and OFDM. Panel (a): channels Ch1 to Ch5 of conventional FDM, power versus frequency. Panel (b): overlapping channels Ch1 to Ch5 of OFDM, power versus frequency, with the saving of bandwidth marked.]
It can be seen from Figure 1.1 that, in order to implement conventional parallel data transmission by FDM, a guard band must be introduced between the different
  • 21. carriers to eliminate the interchannel interference. This leads to an inefficient use of the rare and expensive spectrum resource. Hence it stimulated the search for an FDM scheme with overlapping multicarrier modulation in the mid-1960s. To realize the overlapping multicarrier technique, however, we need to get rid of the ICI, which means that we need perfect orthogonality between the different modulated carriers. The word “orthogonality” implies that there is a precise mathematical relationship between the frequencies of the individual subcarriers in the system. In OFDM systems, assume that the OFDM symbol period is $T_{sym}$; then the minimum subcarrier spacing is $1/T_{sym}$. Under this strict mathematical constraint, integrating the product of the received signal and any one of the subcarriers $f_{sub}$ over one symbol period $T_{sym}$ extracts that subcarrier $f_{sub}$ only, because the integral of the product of $f_{sub}$ and any other subcarrier over $T_{sym}$ is zero. That indicates no ICI in the OFDM system while achieving almost 50% bandwidth savings. In the sense of multiplexing, we refer to Figure 1.2 to illustrate the concept of OFDM. Every $T_{sym}$ seconds, a total of $N$ complex-valued numbers $S_k$ from different QAM/PSK (quadrature amplitude modulation/phase shift keying) constellation points are used to modulate $N$ different complex carriers centered at frequencies $f_k$, $1 \le k \le N$. The composite signal is obtained by summing up all the $N$ modulated carriers. It is worth noting that OFDM achieves frequency-division multiplexing by baseband processing rather than by bandpass filtering. Indeed, as shown in Figure 1.3, the individual spectra have a sinc shape. Even though they are not bandlimited, each
  • 22. [Figure 1.2: Graphical interpretation of OFDM concept. Each symbol $S_k$ modulates a complex carrier $e^{j 2\pi f_k t}$ to give $s_k(t)$, $k = 1, \ldots, N$; the modulated carriers are summed to form the OFDM symbol.]
  • 23. subcarrier can still be separated from the others since orthogonality guarantees that the interfering sincs have nulls at the frequency where the sinc of interest has a peak.
[Figure 1.3: Spectra of (a) an OFDM subchannel (b) an OFDM symbol]
The use of the IDFT (inverse discrete Fourier transform), instead of local oscillators, was an important breakthrough in the history of OFDM. It is an indispensable part of OFDM systems today. It transforms the data from the frequency domain to the time domain. Figure 1.4 shows the preliminary concept of the DFT used in an OFDM system. When the DFT of a time-domain signal is computed, the frequency-domain results are a function of the sampling period $T$ and the number of sample points $N$. The fundamental frequency of the DFT is equal to $\frac{1}{NT}$ (1/total sample time). Each frequency represented in the DFT is an integer multiple of the fundamental frequency. The maximum frequency that can be represented by a time-domain signal sampled at rate $\frac{1}{T}$ is $f_{max} = \frac{1}{2T}$, as given by the Nyquist sampling theorem. This frequency is located in the center of the DFT points. The IDFT performs exactly the opposite operation to the DFT. It takes a signal defined by frequency components and converts them to a time-domain signal. The time duration of the IDFT time signal is equal to $NT$. In
  • 24. essence, the IDFT and the DFT are a reversible pair. It is not necessary to require that the IDFT be used at the transmitter side. It is perfectly valid to use the DFT at the transmitter and then the IDFT at the receiver side.
[Figure 1.4: Preliminary concept of DFT. Top: time-domain signal $s(t)$ with sample period $T$ and total duration $NT$. Bottom: spectrum $S(f)$ with frequency points at $0, 1/NT, 2/NT, \ldots, (N-1)/NT$.]
After the graphical description of the basic principles of OFDM, such as orthogonality, frequency modulation and multiplexing, and the use of the DFT in baseband processing, it is time to look in more detail at the signals flowing between the blocks of an OFDM system and their mathematical relations. At this point, we employ the following assumptions for the OFDM system we consider.
    • a CP is used;
    • the channel impulse response is shorter than the CP, in terms of their respective lengths;
  • 25.
    • there is perfect synchronization between the transmitter and the receiver;
    • channel noise is additive, white and complex Gaussian;
    • the fading is slow enough for the channel to be considered constant during the transmission of one OFDM symbol.
For a tractable analysis of OFDM systems, we follow the common practice of using a simplified mathematical model. Though the first OFDM system was implemented in analogue technology, here we choose to investigate a discrete-time model of OFDM step by step, since digital baseband synthesis is widely exploited in today’s OFDM systems. Figure 1.5 shows a block diagram of a baseband OFDM modem which is based on the PHY (physical layer) of IEEE standard 802.11a [37]. Before describing the mathematical model, we define the symbols and notation used in this dissertation. Capital and lower-case letters denote signals in the frequency domain and in the time domain respectively. An arrow bar indicates a vector, and a boldface letter without an arrow bar represents a matrix. The notation is summarized in the following table.
  • 26. [Figure 1.5: Block diagram of a baseband OFDM transceiver. Transmitter blocks: binary input data, channel coding, interleaving, QAM mapping, pilot insertion, S/P, IFFT (TX), P/S, add CP, DAC, RF TX. Receiver blocks: RF RX, ADC, remove CP, S/P, FFT (RX), P/S, detection, demapping, deinterleaving, decoding, binary output data, together with channel estimation and timing and synchronization. Labeled signals: $\vec{S}(m)$, $\vec{s}(m)$, $\vec{u}(m)$, $\vec{r}(m)$, $\vec{y}(m)$, $\vec{Y}(m)$.]
  • 27.
    $\mathbf{A}_{p \times q}$             $p \times q$ matrix
    $\vec{a}$                             column vector
    $\mathbf{I}_p$                        $p \times p$ identity matrix
    $\mathbf{0}$                          zero matrix
    $\mathrm{diag}(\vec{a})$              diagonal matrix with $\vec{a}$'s elements on the diagonal
    $\mathbf{A}^T$                        transpose of $\mathbf{A}$
    $\mathbf{A}^*$                        complex conjugate of $\mathbf{A}$
    $\mathbf{A}^H$                        Hermitian of $\mathbf{A}$
    $\mathrm{tr}(\mathbf{A})$             trace of $\mathbf{A}$
    $\mathrm{rank}(\mathbf{A})$           rank of $\mathbf{A}$
    $\det(\mathbf{A})$                    determinant of $\mathbf{A}$
    $\mathbf{A} \otimes \mathbf{B}$       Kronecker product of $\mathbf{A}$ and $\mathbf{B}$
As shown in Figure 1.5, the input serial binary data are first processed by a data scrambler, and then channel coding is applied to improve the BER (bit error rate) performance of the system. The encoded data stream is further interleaved to reduce the burst symbol error rate. Depending on channel conditions such as fading, different base modulation modes such as BPSK (binary phase shift keying), QPSK (quadrature phase shift keying) and QAM are adaptively used to boost the data rate. The modulation mode can be changed even during the transmission of data frames. The resulting complex numbers are grouped into column vectors which have the same number of elements as the FFT size, $N$. For simplicity of presentation and ease of understanding, we choose to use matrices and vectors to describe the mathematical model. Let $\vec{S}(m)$ represent the $m$-th OFDM symbol in
  • 28. the frequency domain, i.e.,
\[ \vec{S}(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix}_{N \times 1}, \]
where $m$ is the index of OFDM symbols. We assume that the complex-valued elements $\{S(mN), S(mN+1), \ldots, S(mN+N-1)\}$ of $\vec{S}(m)$ are zero-mean and uncorrelated random variables whose sample space is the signal constellation of the base modulation (BPSK, QPSK and QAM). To achieve the same average power for all different mappings, each element of $\vec{S}(m)$ is multiplied by a normalization factor $K_{MOD}$ [37] such that the average power of the different mappings is normalized to unity. To obtain the time-domain samples, as shown by the IDFT block in Figure 1.5, an IFFT (inverse fast Fourier transform) operation is represented by a matrix multiplication. Let $\mathbf{F}_N$ be the $N$-point DFT matrix whose $(p, q)$-th element is $e^{-j\frac{2\pi}{N}(p-1)(q-1)}$. The resulting time-domain samples $\vec{s}(m)$ can be described by
\[ \vec{s}(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}_{N \times 1} = \frac{1}{N}\,\mathbf{F}_N^H\,\vec{S}(m). \tag{1.1} \]
Compared to the costly and complicated modulation and multiplexing of conventional FDM systems, OFDM systems easily implement them by using the FFT in baseband processing. To combat the multipath delay spread in wireless channels, the time-domain samples $\vec{s}(m)$ are cyclically extended by copying the last $N_g$ samples and pasting them to the front, as shown in Figure 1.6(a) [6].
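The matrix form of the IDFT in (1.1) can be checked numerically. The following is a minimal sketch in plain Python, assuming a toy size of N = 4 subcarriers and QPSK symbols; the helper names (`dft_matrix`, `hermitian`, `matvec`, `ofdm_modulate`) are illustrative and do not come from the dissertation. A practical modem would use an FFT routine rather than an explicit matrix.

```python
import cmath

def dft_matrix(n):
    """N-point DFT matrix F_N; with 0-based p, q the entry is exp(-j*2*pi*p*q/N)."""
    return [[cmath.exp(-2j * cmath.pi * p * q / n) for q in range(n)]
            for p in range(n)]

def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[p][q].conjugate() for p in range(len(a))]
            for q in range(len(a[0]))]

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(a_pq * x_q for a_pq, x_q in zip(row, x)) for row in a]

def ofdm_modulate(S):
    """Time-domain samples s = (1/N) F_N^H S, as in (1.1)."""
    n = len(S)
    FH = hermitian(dft_matrix(n))
    return [v / n for v in matvec(FH, S)]

# Assumed toy example: N = 4 subcarriers carrying QPSK points.
K_MOD = 1 / 2 ** 0.5          # scales QPSK points (+-1 +-1j) to unit average power
S = [K_MOD * c for c in (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)]
s = ofdm_modulate(S)

# Applying the DFT matrix to s recovers the frequency-domain symbols.
S_back = matvec(dft_matrix(len(S)), s)
```

The round trip works because $\mathbf{F}_N \mathbf{F}_N^H = N\,\mathbf{I}_N$, which is why the factor $1/N$ appears in (1.1).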
  • 29. [Figure 1.6: (a) Concept of CP: the last $N_g$ samples of the $N$-sample symbol are copied to the front; (b) OFDM symbol with cyclic extension, showing the guard time (CP) and the FFT integration time.]
Let $\vec{u}(m)$ denote the cyclically extended OFDM symbol as
\[ \vec{u}(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix}_{N_{tot} \times 1} = \begin{bmatrix} \mathrm{CP} \\ \vec{s}(m) \end{bmatrix}, \]
where $N_{tot} = N + N_g$ is the length of $\vec{u}(m)$. In matrix form, the CP insertion can be readily expressed as the product of an $N_{tot} \times N$ matrix $\mathbf{A}_{CP}$ and $\vec{s}(m)$. By direct computation, it holds that
\[ \vec{u}(m) = \mathbf{A}_{CP}\,\vec{s}(m), \tag{1.2} \]
where
\[ \mathbf{A}_{CP} = \begin{bmatrix} \mathbf{0} & \mathbf{I}_{N_g} \\ \mathbf{I}_{N-N_g} & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{N_g} \end{bmatrix}_{(N+N_g) \times N}. \]
One of the challenges of harsh wireless channels is the multipath delay spread. If the delay spread is relatively large compared to the symbol duration, then a delayed copy of a previous symbol will overlap the current one, which implies severe ISI. To
  • 30. eliminate the ISI almost completely, a CP is introduced for each OFDM symbol, and the length of the CP, $N_g$, must be chosen to be no shorter than the experienced delay spread $L$, i.e., $N_g \ge L$. In addition, the CP is capable of maintaining the orthogonality among subcarriers, which implies zero ICI. This is because the OFDM symbol is cyclically extended, which ensures that the delayed replicas of the OFDM symbol always have an integer number of cycles within the FFT interval, as long as the delay is smaller than the CP. This is clearly illustrated in Figure 1.6(b). No matter where the FFT window starts, provided that it is within the CP, there will always be one or two complete cycles within the FFT integration time for the upper and lower symbols in the figure, respectively. In the IEEE 802.11a standard [37], $N_g$ is at least 16. The obtained OFDM symbol (including the CP) $\vec{u}(m)$, as shown in Figure 1.5, must be converted to the analogue domain by a DAC (digital-to-analog converter) and then up-converted for RF transmission, since it is currently not practical to generate the OFDM symbol directly at RF rates. To remain in the discrete-time domain, the OFDM symbol could be up-sampled and added to a discrete carrier frequency. This carrier could be an IF (intermediate frequency) whose sample rate can be handled by current technology. It could then be converted to analog and increased to the final transmit frequency using analog frequency conversion methods. Alternatively, the OFDM modulation could be immediately converted to analog and directly increased to the desired RF transmit frequency. Either way has its advantages and disadvantages. Cost, power consumption and complexity must be taken into consideration for the selected technique.
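The CP insertion in (1.2), and the claim that a delayed replica still fits an integer number of cycles in the FFT window, can be sketched as follows. This is a minimal illustration in plain Python, assuming toy sizes N = 8 and N_g = 3 and using the integers 0..7 as stand-in samples; the helpers `cp_matrix`, `matvec` and `fft_window` are illustrative names, not part of the dissertation.

```python
def cp_matrix(n, ng):
    """A_CP of size (n + ng) x n: the last ng rows of I_N stacked on top of I_N."""
    rows = []
    for i in range(n + ng):
        # Index of the sample of s that is copied to position i of u.
        src = n - ng + i if i < ng else i - ng
        rows.append([1 if j == src else 0 for j in range(n)])
    return rows

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def fft_window(u, n, ng, delay):
    """The n samples inside the FFT window for a path delayed by `delay` <= ng."""
    return u[ng - delay: ng - delay + n]

N, NG = 8, 3                      # assumed toy sizes
s = list(range(N))                # stand-in time-domain samples 0..7
u = matvec(cp_matrix(N, NG), s)   # u == [5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7]

# A replica delayed by 2 samples appears as a cyclic shift of s,
# so no samples from a neighbouring symbol enter the window.
delayed = fft_window(u, N, NG, delay=2)
```

Because every delayed replica within the CP is a cyclic shift of the same symbol, the channel acts on the windowed samples as a cyclic convolution, which is the property used later in the derivation.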
  • 31. The RF signal is transmitted over the air. The wireless channel is assumed in this thesis to be a quasi-static frequency-selective Rayleigh fading channel [71]. This means that the channel remains constant during the transmission of one OFDM symbol. Suppose that the multipath channel can be modeled by a discrete-time baseband equivalent $(L-1)$th-order FIR (finite impulse response) filter with filter taps $\{h_0, h_1, \ldots, h_l, \ldots, h_{L-1}\}$. It is further assumed that the channel impulse response, i.e., the equivalent FIR filter taps, are independent zero-mean complex Gaussian random variables with variance of $\frac{1}{2}P_l$ per dimension. The ensemble $\{P_0, \ldots, P_l, \ldots, P_{L-1}\}$ is the PDP (power delay profile) of the channel, and usually the total power of the PDP is normalized to 1 as the unit average channel attenuation. Denote the CIR (channel impulse response) vector $\vec{h}_m$ as
\[ \vec{h}_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix}_{L \times 1}, \]
where the subscript $m$ is kept to imply that the channel may vary from one OFDM symbol to the next. Then the complex baseband equivalent received signal can be represented by a discrete-time convolution as
\[ r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\,u(mN_{tot}+n-l) + v(mN_{tot}+n), \tag{1.3} \]
where $mN_{tot}+n$ means the $n$-th received sample during the $m$-th OFDM symbol and $0 \le n \le N_{tot}-1$. The term $v(mN_{tot}+n)$ represents the complex AWGN at the $(mN_{tot}+n)$-th time sample with zero mean and variance of $\frac{1}{2}\sigma_v^2$ per dimension. Hence, the expected SNR (signal-to-noise ratio) per received signal is $\rho = \frac{1}{\sigma_v^2}$. In
  • 32. order to allow parallel processing by the DFT block in Figure 1.5, we will rewrite equation (1.3) in matrix form. First we define
\[ \vec{r}(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix}; \qquad \vec{v}(m) = \begin{bmatrix} v(mN_{tot}) \\ \vdots \\ v(mN_{tot}+N_{tot}-1) \end{bmatrix}, \tag{1.4} \]
and
\[ \mathbf{h}_{m,Toep} = \begin{bmatrix} h_{0,m} & & & & \\ \vdots & \ddots & & & \\ h_{L-1,m} & \cdots & h_{0,m} & & \\ & \ddots & & \ddots & \\ & & h_{L-1,m} & \cdots & h_{0,m} \end{bmatrix}; \qquad \mathbf{h}^{(c)}_{m,Toep} = \begin{bmatrix} 0 & \cdots & h_{L-1,m} & \cdots & h_{1,m} \\ \vdots & & & \ddots & \vdots \\ & & & & h_{L-1,m} \\ \vdots & & & & \vdots \\ 0 & \cdots & \cdots & \cdots & 0 \end{bmatrix}, \tag{1.5} \]
where both matrices are of size $N_{tot} \times N_{tot}$ and all entries not shown are zero. Then it is straightforward to obtain the following input-output relationship with regard to the channel:
\[ \vec{r}(m) = \mathbf{h}_{m,Toep}\,\vec{u}(m) + \mathbf{h}^{(c)}_{m,Toep}\,\vec{u}(m-1) + \vec{v}(m). \tag{1.6} \]
It is easy to see in (1.6) that the first $L-1$ terms of $\vec{r}(m)$, i.e., $\{r(mN_{tot}), \ldots, r(mN_{tot}+L-2)\}$, will be affected by the ISI term $\mathbf{h}^{(c)}_{m,Toep}\,\vec{u}(m-1)$, since the Toeplitz and upper triangular matrix $\mathbf{h}^{(c)}_{m,Toep}$ has non-zero entries in the first $L-1$ rows. In order to remove the ISI term, we transform the $N_{tot} \times 1$ vector $\vec{r}(m)$ into an $N \times 1$ vector $\vec{y}(m)$ by simply cutting off the first $N_g$ possibly ISI-affected elements. For complete elimination of ISI, $N_g \ge L$ must be satisfied. This is the reverse of the cyclic extension implemented at the transmitter side. Consistently, this transformation
  • 33. can also be expressed as a matrix-vector product
\[ \vec{y}(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = \mathbf{A}_{DeCP}\,\vec{r}(m), \tag{1.7} \]
where
\[ \mathbf{A}_{DeCP} = \begin{bmatrix} \mathbf{0} & \mathbf{I}_N \end{bmatrix}_{N \times N_{tot}}. \]
As shown in Figure 1.5, the ISI-free received signal $\vec{y}(m)$ is demodulated by the FFT and hence converted back to the frequency-domain received signal $\vec{Y}(m)$. It is described by
\[ \vec{Y}(m) = \begin{bmatrix} Y(mN) \\ \vdots \\ Y(mN+N-1) \end{bmatrix} = \mathbf{F}_N\,\vec{y}(m). \tag{1.8} \]
After obtaining the received signal $\vec{Y}(m)$, symbol detection can be implemented if the channel state information is known or can be estimated by some channel estimation algorithm. The detected symbol will pass through a series of reverse operations, corresponding to the encoding, interleaving and mapping at the transmitter side, to retrieve the input binary information. Following the signal flow from the transmitted signal $\vec{S}(m)$ to the received signal $\vec{Y}(m)$, a simple relationship between them can be expressed as
\[ \vec{Y}(m) = \mathbf{H}_{m,diag}\,\vec{S}(m) + \vec{V}(m), \tag{1.9} \]
where the diagonal matrix $\mathbf{H}_{m,diag}$ is
\[ \mathbf{H}_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix}; \qquad H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\,e^{-j\frac{2\pi}{N}kl}, \quad 0 \le k \le N-1, \]
  • 34. and $\vec{V}(m)$ is the complex AWGN in the frequency domain. This simple transmitter-and-receiver structure is well known in the literature [42, 46, 48, 49], and it is an important reason for the wide application of OFDM systems. The transmitted signal can be easily extracted by simply dividing by the channel frequency response of the specific subcarrier. Hence it eliminates the need for a complicated equalizer at the receiver side. In this thesis, we do not jump directly to this known conclusion, for two reasons. First, following the baseband block diagram in Figure 1.5, we use a matrix form of presentation to describe the input-output relationship of each block. This gives us a clear and thorough understanding of all the signal processing within the OFDM system. It is a different view from those in the literature, which can be summarized by the fact that the discrete Fourier transform of a cyclic convolution (of IDFT($\vec{S}(m)$) and $\vec{h}_m$) in the time domain leads to a product of the frequency responses ($\vec{S}(m)$ and DFT($\vec{h}_m$)) of the two convolved terms. Second, this provides a base for our channel estimator design in the following chapter. Next, the simple relation in (1.9) is shown by going through the signal flow backwards from
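  • 35. $\vec{Y}(m)$ to $\vec{S}(m)$ that
\[ \begin{aligned} \vec{Y}(m) &= \mathbf{F}_N\,\vec{y}(m) = \mathbf{F}_N\left(\mathbf{A}_{DeCP}\,\vec{r}(m)\right) \\ &= \mathbf{F}_N\left\{\mathbf{A}_{DeCP}\left[\mathbf{h}_{m,Toep}\,\vec{u}(m) + \mathbf{h}^{(c)}_{m,Toep}\,\vec{u}(m-1) + \vec{v}(m)\right]\right\} \\ &= \mathbf{F}_N\left[\mathbf{A}_{DeCP}\,\mathbf{h}_{m,Toep}\,\vec{u}(m) + \mathbf{A}_{DeCP}\,\vec{v}(m)\right] \\ &= \mathbf{F}_N\left[\mathbf{A}_{DeCP}\,\mathbf{h}_{m,Toep}\,\mathbf{A}_{CP}\,\vec{s}(m) + \mathbf{A}_{DeCP}\,\vec{v}(m)\right] \\ &= \mathbf{F}_N\left[\mathbf{A}_{DeCP}\,\mathbf{h}_{m,Toep}\,\mathbf{A}_{CP}\,\tfrac{1}{N}\mathbf{F}_N^H\,\vec{S}(m) + \mathbf{A}_{DeCP}\,\vec{v}(m)\right] \\ &= \tfrac{1}{N}\,\mathbf{F}_N\left[\mathbf{A}_{DeCP}\,\mathbf{h}_{m,Toep}\,\mathbf{A}_{CP}\right]\mathbf{F}_N^H\,\vec{S}(m) + \mathbf{F}_N\left(\mathbf{A}_{DeCP}\,\vec{v}(m)\right) \\ &= \tfrac{1}{N}\left[\mathbf{F}_N\,\mathbf{h}_{m,Cir}\,\mathbf{F}_N^H\right]\vec{S}(m) + \vec{V}(m), \end{aligned} \tag{1.10} \]
where the ISI term drops out in the third line because $\mathbf{A}_{DeCP}$ removes the rows in which $\mathbf{h}^{(c)}_{m,Toep}$ is non-zero, $\vec{V}(m) = \mathbf{F}_N(\mathbf{A}_{DeCP}\,\vec{v}(m))$, and $\mathbf{h}_{m,Cir} = \mathbf{A}_{DeCP}\,\mathbf{h}_{m,Toep}\,\mathbf{A}_{CP}$ is an $N \times N$ circulant matrix with some special properties. It is parameterized as
\[ \mathbf{h}_{m,Cir} = \begin{bmatrix} h_{0,m} & 0 & \cdots & 0 & h_{L-1,m} & \cdots & h_{1,m} \\ h_{1,m} & h_{0,m} & 0 & \cdots & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & & \ddots & h_{L-1,m} \\ h_{L-1,m} & \cdots & h_{1,m} & h_{0,m} & 0 & \cdots & 0 \\ 0 & h_{L-1,m} & \cdots & h_{1,m} & h_{0,m} & \ddots & \vdots \\ \vdots & \ddots & \ddots & & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & h_{L-1,m} & \cdots & h_{1,m} & h_{0,m} \end{bmatrix}_{N \times N}. \tag{1.11} \]
As stated in [38], an $N \times N$ circulant matrix has some important properties:
    • All $N \times N$ circulant matrices have the same eigenvectors, which are the columns of $\mathbf{F}_N^H$, where $\mathbf{F}_N$ is the $N$-point FFT matrix;
    • The corresponding eigenvalues $\{\lambda_1, \cdots, \lambda_N\}$ are the FFT of the first column of the circulant matrix.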
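The two circulant-matrix properties listed above can be verified numerically. The sketch below, in plain Python, assumes a toy size N = 8 and three hypothetical channel taps; `circulant`, `dft` and `residual` are illustrative helper names. It checks that each column of $\mathbf{F}_N^H$ is an eigenvector of the circulant matrix, with the DFT of the first column as the eigenvalue.

```python
import cmath

def circulant(first_col):
    """Circulant matrix whose (p, q) entry is first_col[(p - q) mod N]."""
    n = len(first_col)
    return [[first_col[(p - q) % n] for q in range(n)] for p in range(n)]

def dft(x):
    """Direct N-point DFT: X[k] = sum_l x[l] * exp(-j*2*pi*k*l/N)."""
    n = len(x)
    return [sum(x[l] * cmath.exp(-2j * cmath.pi * k * l / n) for l in range(n))
            for k in range(n)]

N = 8                                   # assumed toy FFT size
h = [1.0, 0.4 - 0.2j, 0.2j]             # hypothetical channel taps, L = 3
first_col = h + [0] * (N - len(h))      # [h_0, ..., h_{L-1}, 0, ..., 0]
C = circulant(first_col)                # plays the role of h_{m,Cir} in (1.11)
H = dft(first_col)                      # candidate eigenvalues H_0 .. H_{N-1}

def residual(k):
    """Largest entry of |C f_k - H[k] f_k|, with f_k the k-th column of F_N^H."""
    f_k = [cmath.exp(2j * cmath.pi * p * k / N) for p in range(N)]
    Cf = [sum(C[p][q] * f_k[q] for q in range(N)) for p in range(N)]
    return max(abs(Cf[p] - H[k] * f_k[p]) for p in range(N))

max_residual = max(residual(k) for k in range(N))   # numerically zero
```

The first row of `C` reads $[h_0, 0, \ldots, 0, h_2, h_1]$, matching the wrap-around structure of (1.11).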
  • 36. The first column of the circulant matrix $\mathbf{h}_{m,Cir}$ is $[h_{0,m}, \ldots, h_{L-1,m}, 0, \ldots, 0]^T$. Hence, the eigenvalues of $\mathbf{h}_{m,Cir}$ are
\[ \begin{bmatrix} H_{0,m} \\ H_{1,m} \\ \vdots \\ H_{N-1,m} \end{bmatrix} = \mathbf{F}_N \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ \mathbf{0}_{(N-L) \times 1} \end{bmatrix}. \]
Taking the eigenvalue decomposition of $\mathbf{h}_{m,Cir}$, we have
\[ \mathbf{h}_{m,Cir} = \frac{1}{N}\,\mathbf{F}_N^H \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix} \mathbf{F}_N. \tag{1.12} \]
Substituting (1.12) into (1.10) and using $\mathbf{F}_N \mathbf{F}_N^H = N\,\mathbf{I}_N$ shows that (1.9) is true. The simple model in (1.9) is widely exploited for theoretical research. It is, however, based on all of the assumptions we made at the beginning of this section. In practical OFDM systems, much research effort has gone into keeping the system as close to this model as possible. Perfect synchronization in the time domain and the frequency domain is the most challenging subject. The orthogonality can be easily destroyed by a few factors such as the Doppler shift resulting from the relative movement between the transmitter and the receiver, the frequency mismatch between the oscillators at the two ends, large timing errors and phase noise. Meanwhile, accurate channel state information is critical for reducing the BER and improving the system performance. Hence, joint channel estimation and synchronization with low complexity is an active research area for current OFDM systems. As long as the orthogonality is maintained, OFDM is a simple and efficient multicarrier data transmission technique.
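The whole signal flow from (1.1) to (1.9) can be exercised end to end in a few lines. The following is a minimal noise-free sketch in plain Python, assuming toy values N = 16 and N_g = 4, a hypothetical three-tap channel, and random QPSK symbols; the function names `transmit`, `channel` and `receive` are illustrative. It is meant only to show that, under the assumptions of this section, each received subcarrier equals the transmitted symbol scaled by one channel coefficient.

```python
import cmath
import random

def dft(x):
    """Direct N-point DFT, the operation of F_N."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT with the 1/N factor, i.e. (1/N) F_N^H X as in (1.1)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * i / n) for k in range(n)) / n
            for i in range(n)]

def transmit(S, ng):
    """IDFT followed by CP insertion, as in (1.1) and (1.2)."""
    s = idft(S)
    return s[-ng:] + s

def channel(u_prev, u_cur, h):
    """Noise-free form of (1.3); early samples reach back into the previous symbol."""
    stream = u_prev + u_cur
    off = len(u_prev)
    return [sum(h[l] * stream[off + n - l] for l in range(len(h)))
            for n in range(len(u_cur))]

def receive(r, ng):
    """CP removal followed by the DFT, as in (1.7) and (1.8)."""
    return dft(r[ng:])

random.seed(0)
N, NG = 16, 4                          # assumed toy sizes
h = [1.0, 0.4 - 0.2j, 0.2j]            # hypothetical taps, L = 3 <= NG
qpsk = [c / 2 ** 0.5 for c in (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)]
S_prev = [random.choice(qpsk) for _ in range(N)]   # previous symbol (ISI source)
S_cur = [random.choice(qpsk) for _ in range(N)]    # symbol under test

r = channel(transmit(S_prev, NG), transmit(S_cur, NG), h)
Y = receive(r, NG)
H = dft(h + [0] * (N - len(h)))        # channel frequency response H_k

# (1.9) without noise: Y[k] = H[k] * S[k], so one division per subcarrier
# recovers the transmitted symbol.
max_err = max(abs(Y[k] - H[k] * S_cur[k]) for k in range(N))
S_hat = [Y[k] / H[k] for k in range(N)]
```

The previous symbol does not affect `Y` at all, because all of its leakage falls inside the discarded CP; this is the numerical counterpart of the ISI term vanishing in (1.10).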
  • 37. 1.2 Dissertation Contributions
In the first part, this dissertation addresses one of the most fundamental problems in MIMO-OFDM communication system design, i.e., fast and reliable channel estimation. Using pilot symbols, a MIMO-OFDM channel estimator is proposed that is capable of estimating the time-dispersive and frequency-selective fading channel. Our contributions in this dissertation are as follows.
    • Great Simplicity: For an $N_t \times N_r$ MIMO ($N_t$: number of transmit antennas, $N_r$: number of receive antennas) system, the complexity of any signal processing algorithm at the physical layer is usually increased by a factor of $N_t N_r$. Hence, simplicity plays an important role in the system design. We propose a pilot tone design for MIMO-OFDM channel estimation in which $N_t$ disjoint sets of pilot tones are placed on one OFDM block at each transmit antenna. Each pilot tone set has $L$ ($L$: channel length) pilot tones which are equally spaced and equally powered. The pilot tones from different transmit antennas comprise a unitary matrix, and a simple least squares (LS) estimation of the MIMO channel is then easily implemented by taking advantage of the unitarity of the pilot tone matrix. There is no need to compute the inverse of a large matrix, which is usually required by the LS algorithm. In contrast to some other channel estimation methods, which are simplified by assuming that there are only a few dominant paths among $L$ of them
  • 38. and then neglecting the remaining weaker paths in the channel, our method estimates the full channel information with a reduced complexity.
    • Estimation of Fast Time-varying Channel: In a highly mobile environment, such as a mobile user in a vehicle traveling at more than 100 km/h, the wireless channel may change within one or a small number of symbols. But the information packet could contain hundreds of data symbols or even more. In the literature [50] there are some preamble designs in which the wireless channel is only estimated at the preamble part of a whole data packet and is assumed to be constant during the transmission of the remaining data part. Different from the preamble design, our scheme distributes the pilot symbols of the preamble over each OFDM block for channel estimation. Since the pilot tones are placed on each OFDM block, the channel state information can be estimated accurately and quickly, no matter how fast the channel condition is varying.
    • Link to SFC (space-frequency code): Usually channel estimation and space-frequency code design of MIMO-OFDM systems are taken as two independent subjects, especially for those algorithms generalized from their counterparts in the SISO (single-input single-output) case. Some researchers [48, 50] propose orthogonal structures for pilot tone design and try to reduce the complexity of computation. However, each
  • 39. individual structure is isolated and it is not easy to generalize these structures to a MIMO system with any number of transmit antennas and receive antennas. In this dissertation, the orthogonal pilot tone matrix we propose is indeed a space-frequency code. The row direction of the matrix stands for different pilot tone sets in the frequency domain, and the column direction represents the individual transmit antennas in the spatial domain. It can be readily extended to an $N_t \times N_r$ MIMO system by constructing an $N_t \times N_t$ orthogonal matrix. With this explicit relation to space-frequency codes, the design of the pilot tone matrix for MIMO-OFDM channel estimation can be conducted from a broader perspective, and the two subjects can shed light on each other.
In the second part of this dissertation, we contribute the formulation of location estimation as a constrained LS-type optimization problem. As surveyed in [53], there are different methods for location estimation based on measurements of TOA, TDOA, AOA and amplitude. There are two problems which have not been given full attention and may increase the complexity of the algorithm. One problem is that only an intermediate solution can first be obtained by solving the LS estimation problem. That is, the intermediate solution is still a function of the unknown target location, and extra constraints are needed to get the final target estimate. Though such a constraint exists, solving the resulting quadratic equation may end up with no real positive root. Another problem is that it is unclear how the measurement noise variance affects the estimation accuracy. Intuitively, a small variance is always preferred.
  • 40. In our proposed algorithm, the constrained LS-type optimization problem is solved by using a Lagrange multiplier. It is also pointed out that the noise variance is closely related to the equivalent SNR. For example, in the case of TDOA, the equivalent SNR is the ratio of the time for a signal traveling from the target to the $k$-th base station over the noise variance. A smaller noise variance then indicates a higher SNR, which leads to more accurate location estimation. The formulation as a constrained LS-type optimization has its advantages. First, it achieves a performance close to that of the ML algorithm, provided that the assumption about the measurement noise variance is satisfied. Second, it inherits the simplicity of the LS algorithm.
1.3 Organization of the Dissertation
This dissertation is organized as follows. In Chapter 1, the principle of OFDM is illustrated through instructive figures and the signal model of OFDM systems is described in detail by matrix representation. A review of research on channel estimation for OFDM systems is also covered in Chapter 1. Chapter 2 mainly focuses on pilot tone based channel estimation for MIMO-OFDM systems. It ends with intensive computer simulations of different estimation algorithms and of the effects of some key OFDM parameters on estimator performance. Chapter 3 is devoted to wireless location on the WiMax network. A constrained LS-type optimization problem is formulated under a mild assumption and is solved by using the Lagrange multiplier method. Finally, this dissertation is summarized in Chapter 5, which also suggests some open research subjects.
  • 41. Chapter 2
MIMO-OFDM Channel Estimation
2.1 Introduction
With the ever-increasing number of wireless subscribers and their seemingly “greedy” demands for high-data-rate services, radio spectrum has become an extremely rare and invaluable resource for all the countries in the world. Efficient use of radio spectrum requires that modulated carriers be placed as close together as possible without causing any ICI and be capable of carrying as many bits as possible. Optimally, the bandwidth of each carrier would be adjacent to its neighbors, so there would be no wasted bands. In practice, a guard band must be placed between neighboring carriers to provide a guard space where a shaping filter can attenuate a neighboring carrier’s signal. These guard bands are a waste of spectrum. In order to transmit high-rate data, short symbol periods must be used. The symbol period $T_{sym}$ is the inverse of the baseband data rate $R$ ($R = 1/T_{sym}$), so as $R$ increases, $T_{sym}$ must decrease. In a multipath environment, however, a shorter symbol period leads to an increased degree of ISI, and thus performance loss. OFDM addresses both of these problems with its
  • 42. unique modulation and multiplexing technique. OFDM divides the high-rate stream into parallel lower-rate data streams and hence prolongs the symbol duration, thus helping to eliminate ISI. It also allows the bandwidths of the subcarriers to overlap without ICI as long as the modulated carriers are orthogonal. OFDM is therefore considered a good candidate modulation technique for broadband access in very dispersive environments [42, 43]. However, relying solely on OFDM technology to improve the spectral efficiency gives us only a partial solution. At the end of the 1990s, seminal work by Foschini and Gans [21] and, independently, by Telatar [22] showed that there is another way to achieve high data rates over wireless channels: the use of multiple antennas at both ends of the wireless link, often referred to as MA (multiple antenna) or MIMO in the literature [21, 22, 17, 16, 25, 26]. The MIMO technique does not require any bandwidth expansion or any extra transmission power. Therefore, it provides a promising means to increase the spectral efficiency of a system. In his paper about the capacity of multi-antenna Gaussian channels [22], Telatar showed that, given a wireless system employing $N_t$ TX (transmit) antennas and $N_r$ RX (receive) antennas, the maximum data rate at which error-free transmission over a fading channel is theoretically possible is proportional to the minimum of $N_t$ and $N_r$ (provided that the $N_t N_r$ transmission paths between the TX and RX antennas are statistically independent). Hence huge throughput gains may be achieved by adopting $N_t \times N_r$ MIMO systems compared to conventional $1 \times 1$ systems that use a single antenna at
  • 43. both ends of the link with the same requirement of power and bandwidth. With multiple antennas, a new domain, namely the spatial domain, is exploited, in contrast to existing systems that use only the time and frequency domains. Now let us return to the earlier question: what can be done to enhance the data rate of a wireless communication system? The combination of MIMO systems with OFDM technology provides a promising candidate for next-generation fixed and mobile wireless systems [42]. In practice, however, coherent detection requires accurate channel state information, in terms of the channel impulse response (CIR) or the channel frequency response (CFR), to guarantee the diversity gains and the projected increase in data rate. The channel state information can be obtained through two types of methods. One is blind channel estimation [44, 45, 46], which exploits the statistical information of the channel and certain properties of the transmitted signals. The other is training-based channel estimation, which relies on training data sent by the transmitter and known a priori at the receiver. Although the former has the advantage of incurring no overhead, it is only applicable to slowly time-varying channels because it needs a long data record. Our work in this thesis focuses on the training-based channel estimation method, since we aim at mobile wireless applications where the channels are fast time-varying. The conventional training-based method [47, 48, 50] estimates the channel by first sending a sequence of OFDM symbols, the so-called preamble, which is composed of known training symbols.
  • 44. Then the channel state information is estimated from the received signals corresponding to the known training OFDM symbols, prior to any data transmission in a packet. The channel is hence assumed to be constant until the next sequence of training OFDM symbols. A drastic performance degradation then arises if this method is applied to fast time-varying channels. In [49], optimal pilot-tone selection and placement were presented to aid channel estimation of single-input/single-output (SISO) systems. The idea behind our work is to estimate the time-varying channel using a set of pilot-tones within each OFDM block, rather than a sequence of training blocks ahead of a data packet. However, direct generalization of the channel estimation algorithm in [49] to MIMO-OFDM systems involves the inversion of a high-dimensional matrix [47] because of the increased number of transmit and receive antennas. This entails high complexity and makes the approach infeasible for wireless communications over highly mobile channels, which becomes a bottleneck for applications to broadband wireless communications. The goal of this chapter is to design a low-complexity channel estimator with comparable accuracy. The complexity bottleneck of channel estimation in MIMO-OFDM systems has been studied by two different approaches. The first one shortens the sequence of training symbols to the length of the MIMO channel, as described in [50], leading to an orthogonal structure for preamble design. Its drawback lies in the increased overhead due to the extra training OFDM blocks. The second one is the simplified channel estimation algorithm proposed in [48], which achieves optimum channel
  • 45. estimation and also avoids the matrix inversion. However, its construction of the pilot-tones is not explicit in terms of space-time codes (STC). We are motivated by both approaches in searching for a new pilot-tone design. Our contribution in this chapter is the unification of the known results of [48, 50]: the simplified channel estimation algorithm is generalized to explicit orthogonal space-frequency codes (SFC) that inherit the same computational advantage as in [48, 50], while eliminating their respective drawbacks. In addition, the drastic performance degradation that occurs in [48, 50] is avoided by our pilot-tone design, since the channel is estimated at each block. In fact, we formulate the channel estimation problem in the frequency domain, and the CFR is parameterized by the pilot-tones in a form that is convenient for the design of SFC. As a result, a unitary matrix composed of pilot-tones from each transmit antenna can be readily constructed. It is interesting to observe that the LS algorithm based on SFC in this chapter parallels that for conventional OFDM systems with a single transmit/receive antenna. The use of multiple transmit/receive antennas offers more design freedom, which provides further improvements in estimation performance.

2.2 System Description

The block diagram of a MIMO-OFDM system [27, 28] is shown in Figure 2.1. Basically, the MIMO-OFDM transmitter has $N_t$ parallel transmission paths, each very similar to a single-antenna OFDM system: each branch performs serial-to-parallel conversion, pilot insertion, $N$-point IFFT and cyclic extension before the final TX signals are up-converted to RF and transmitted. It is worth noting that
  • 46. the channel encoder and the digital modulation can, in some spatial multiplexing systems [28, 29], also be done per branch rather than jointly over all the $N_t$ branches. The receiver must first estimate and correct any symbol timing error and frequency offset, e.g., by using training symbols in the preamble as standardized in [37]. Subsequently, the CP is removed and an $N$-point FFT is performed per receiver branch. In this thesis, the proposed channel estimation algorithm is based on single-carrier processing, which implies that MIMO detection has to be done per OFDM subcarrier. Therefore, the received signals of subcarrier $k$ are routed to the $k$-th MIMO detector to recover all the $N_t$ data signals transmitted on that subcarrier. Next, the transmitted symbols of each TX antenna are combined and output for subsequent operations such as digital demodulation and decoding. Finally, all the input binary data are recovered with a certain BER. As a MIMO signalling technique, $N_t$ different signals are transmitted simultaneously over $N_t \times N_r$ transmission paths, and each of the $N_r$ received signals is a combination of all the $N_t$ transmitted signals and the distorting noise. This brings the diversity gain that enhances system capacity, as desired. Meanwhile, compared to the SISO system, it complicates the system design with regard to channel estimation and symbol detection because of the greatly increased number of channel coefficients.

2.2.1 Signal Model

To find the signal model of the MIMO-OFDM system, we can follow the same approach as in the SISO case. Because of the increased number of antennas, the signal
  • 47. [Figure 2.1: Nt × Nr MIMO-OFDM system model. Block labels, transmitter: data source, channel encoder, digital modulator, MIMO encoder; per branch 1 to Nt: S/P, IFFT, CP, P/S. Block labels, receiver: timing and frequency synchronization; per branch 1 to Nr: S/P, De-CP, FFT, P/S; channel estimation, MIMO decoder, digital demodulator, channel decoder, data sink.]
  • 48. dimension is changed. For instance, the transmitted signal on the $k$-th subcarrier in a MIMO system is an $N_t \times 1$ vector, instead of a scalar as in the SISO case. For brevity of presentation, the same notation is used for both the SISO and MIMO cases, but it is explicitly defined in each case. There are $N_t$ transmit antennas and hence, on each of the $N$ subcarriers, $N_t$ modulated signals are transmitted simultaneously. Denote by $S(m)$ and $S(mN+k)$ the $m$-th modulated OFDM symbol in the frequency domain and its $k$-th modulated subcarrier, respectively:
\[
S(m) = \begin{bmatrix} S(mN) \\ \vdots \\ S(mN+N-1) \end{bmatrix}, \qquad
S(mN+k) = \begin{bmatrix} S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k) \end{bmatrix}, \tag{2.1}
\]
where $S_j(mN+k)$ represents the $k$-th modulated subcarrier of the $m$-th OFDM symbol transmitted by the $j$-th antenna. It is normalized by a factor $K_{MOD}$ so that the average power is unity for all the mappings. Taking the IFFT of $S(m)$ as a baseband modulation, the resulting time-domain samples can be expressed as
\[
s(m) = \begin{bmatrix} s(mN) \\ \vdots \\ s(mN+N-1) \end{bmatrix}
= \frac{1}{N}\left(F_N^H \otimes I_{N_t}\right) S(m), \qquad
s(mN+n) = \begin{bmatrix} s_1(mN+n) \\ \vdots \\ s_{N_t}(mN+n) \end{bmatrix}. \tag{2.2}
\]
Here the IFFT is a block-wise operation, since each modulated subcarrier is a column vector, and the generalized $N N_t$-point IFFT matrix is a Kronecker product of $F_N^H$ and $I_{N_t}$. This is just a mathematical expression. In real OFDM systems, the generalized IFFT operation is still performed by $N_t$ parallel $N$-point IFFTs. To
  • 49. eliminate the ISI and the ICI, a CP of length $N_g$ ($N_g \ge L$) is prepended to the time-domain samples of each branch. The resulting OFDM symbol $u(m)$ is denoted as
\[
u(m) = \begin{bmatrix} u(mN_{tot}) \\ \vdots \\ u(mN_{tot}+N_{tot}-1) \end{bmatrix}, \qquad
u(mN_{tot}+n) = \begin{bmatrix} u_1(mN_{tot}+n) \\ \vdots \\ u_{N_t}(mN_{tot}+n) \end{bmatrix}. \tag{2.3}
\]
In matrix form,
\[
u(m) = A_{CP}\, s(m), \tag{2.4}
\]
where
\[
A_{CP} = \begin{bmatrix} 0 & I_{N_g} \\ I_{N-N_g} & 0 \\ 0 & I_{N_g} \end{bmatrix} \otimes I_{N_t}.
\]
The time-domain samples denoted by $u(m)$ may be converted directly to RF for transmission, or be up-converted to IF first and then transmitted over the wireless MIMO channel. For the MIMO channel, we assume in this thesis that the MIMO-OFDM system operates in a frequency-selective Rayleigh fading environment and that the communication channel remains constant during a frame transmission, i.e., quasi-static fading. Suppose that the channel impulse response can be recorded with $L$ time instances, i.e., time samples. Then the multipath fading channel between the $j$-th TX and the $i$-th RX antenna can be modeled by a discrete-time complex baseband equivalent FIR filter of order $L-1$ with filter coefficients $h_{ij}(l,m)$, with $l \in \{0, \ldots, L-1\}$ and integer $m > 0$. As assumed in the SISO case, these CIR coefficients $\{h_{ij}(0,m), \ldots, h_{ij}(L-1,m)\}$ are independent complex zero-mean Gaussian RVs with variance $\frac{1}{2}P_l$ per dimension. The total power of the channel power delay
  • 50. profile $\{P_0, \ldots, P_{L-1}\}$ is normalized to be $\sigma_c^2 = 1$. Let $h_m$ be the CIR matrix and denote by $h_{l,m}$ the $l$-th matrix-valued CIR coefficient:
\[
h_m = \begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \end{bmatrix}; \qquad
h_{l,m} = \begin{bmatrix} h_{11}(l,m) & \cdots & h_{1N_t}(l,m) \\ \vdots & \ddots & \vdots \\ h_{N_r 1}(l,m) & \cdots & h_{N_r N_t}(l,m) \end{bmatrix}. \tag{2.5}
\]
In addition, we assume that those $N_t N_r$ geographically co-located multipath channels are independent in an environment full of scattering. From an information-theoretic point of view [21, 22], this guarantees the capacity gain of MIMO systems. For practical MIMO-OFDM systems, it imposes a lower limit on the distance between the multiple antennas of a portable receiver unit. If the channels are correlated, the diversity gain of the MIMO system is reduced and hence system performance is degraded. At the receive side, an $N_r$-dimensional complex baseband equivalent receive signal can be obtained by a matrix-based discrete-time convolution as
\[
r(mN_{tot}+n) = \sum_{l=0}^{L-1} h_{l,m}\, u(mN_{tot}+n-l) + v(mN_{tot}+n), \tag{2.6}
\]
where
\[
r(mN_{tot}+n) = \begin{bmatrix} r_1(mN_{tot}+n) \\ \vdots \\ r_{N_r}(mN_{tot}+n) \end{bmatrix}, \qquad
v(mN_{tot}+n) = \begin{bmatrix} v_1(mN_{tot}+n) \\ \vdots \\ v_{N_r}(mN_{tot}+n) \end{bmatrix}.
\]
Note that $v_i(mN_{tot}+n)$ is assumed to be complex AWGN with zero mean and variance $\frac{1}{2}\sigma_v^2$ per dimension. Therefore, the expected signal-to-noise ratio (SNR) per receive antenna is $N_t/\sigma_v^2$. In order to have a fair comparison with SISO systems, the power
  • 51. per TX antenna should be scaled down by a factor of $N_t$. By stacking the received samples at discrete time instances, $r(m)$ can be described by
\[
r(m) = \begin{bmatrix} r(mN_{tot}) \\ \vdots \\ r(mN_{tot}+N_{tot}-1) \end{bmatrix}. \tag{2.7}
\]
To combat the ISI, the first $N_g N_r$ elements of $r(m)$ must be removed completely. The resulting ISI-free OFDM symbol $y(m)$ is
\[
y(m) = \begin{bmatrix} y(mN) \\ \vdots \\ y(mN+N-1) \end{bmatrix} = A_{DeCP}\, r(m), \tag{2.8}
\]
where $A_{DeCP} = \begin{bmatrix} 0 & I_N \end{bmatrix} \otimes I_{N_r}$. Because $u(m)$ is a cyclic extension of $s(m)$, the cyclic discrete-time convolution is valid, and the relation between $s(m)$ and $y(m)$ can be expressed as
\[
y(m) = h_{m,Cir}\, s(m) + A_{DeCP}\, v(m), \tag{2.9}
\]
where $h_{m,Cir}$ is an $N N_r \times N N_t$ block circulant matrix. In general, an $N N_r \times N N_t$ block circulant matrix is fully defined by its first block column, of size $N N_r \times N_t$. In our case, $h_{m,Cir}$ is determined by
\[
\begin{bmatrix} h_{0,m} \\ \vdots \\ h_{L-1,m} \\ 0_{(N-L)N_r \times N_t} \end{bmatrix}
\]
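The equivalence behind (2.9) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it uses a scalar (SISO) channel for brevity, small illustrative sizes (`N = 8`, `Ng = 3`, a 3-tap CIR) and no noise, all of which are assumptions made only for this example. It prepends a CP, applies a linear convolution, removes the CP, and compares the result with a circulant-matrix product.

```python
import random

def add_cp(s, ng):
    """Prepend the last ng samples of s as a cyclic prefix (the role of A_CP)."""
    return s[-ng:] + s

def linear_channel(u, h):
    """Noise-free linear convolution of the samples u with the CIR h, as in (2.6)."""
    return [sum(h[l] * u[n - l] for l in range(len(h)) if n - l >= 0)
            for n in range(len(u))]

def circulant(h, n):
    """n x n circulant matrix whose first column is h padded with zeros."""
    col = h + [0] * (n - len(h))
    return [[col[(r - c) % n] for c in range(n)] for r in range(n)]

random.seed(0)
N, Ng = 8, 3                                   # illustrative sizes only
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(3)]
s = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

r = linear_channel(add_cp(s, Ng), h)           # received samples, CP included
y = r[Ng:]                                     # CP removal (the role of A_DeCP)

C = circulant(h, N)
y_circ = [sum(C[row][c] * s[c] for c in range(N)) for row in range(N)]

max_err = max(abs(a - b) for a, b in zip(y, y_circ))
print("max |y - C s| =", max_err)
```

The two results agree up to rounding as long as the CP is at least as long as the channel memory, which is why the condition on $N_g$ matters.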
  • 52. Finally, taking the FFT of $y(m)$ at the receiver, we obtain the frequency-domain MIMO-OFDM baseband signal model
\[
\begin{aligned}
Y(m) &= (F_N \otimes I_{N_r})\, y(m) = (F_N \otimes I_{N_r})\left(h_{m,Cir}\, s(m) + A_{DeCP}\, v(m)\right) \\
&= \frac{1}{N}(F_N \otimes I_{N_r})\, h_{m,Cir}\, (F_N^H \otimes I_{N_t})\, S(m) + (F_N \otimes I_{N_r})\, A_{DeCP}\, v(m) \\
&= H_{m,diag}\, S(m) + V(m).
\end{aligned} \tag{2.10}
\]
In the above expression, $V(m)$ represents the frequency-domain noise, which is i.i.d. (independent and identically distributed) zero-mean complex Gaussian with variance $\frac{1}{2}\sigma_v^2$ per dimension, and $H_{m,diag}$ is a block diagonal matrix given by
\[
H_{m,diag} = \begin{bmatrix} H_{0,m} & & \\ & \ddots & \\ & & H_{N-1,m} \end{bmatrix}.
\]
The $k$-th diagonal block is the frequency response of the MIMO channel at the $k$-th subcarrier and can be shown to be $H_{k,m} = \sum_{l=0}^{L-1} h_{l,m}\, e^{-j\frac{2\pi}{N}kl}$. So for that subcarrier, we may write the model in the simpler form
\[
Y(mN+k) = H_{k,m}\, S(mN+k) + V(mN+k), \tag{2.11}
\]
where
\[
H_{k,m} = \begin{bmatrix} H_{11}(k,m) & \cdots & H_{1N_t}(k,m) \\ \vdots & \ddots & \vdots \\ H_{N_r 1}(k,m) & \cdots & H_{N_r N_t}(k,m) \end{bmatrix}.
\]
This leads to a flat-fading signal model per subcarrier, which is similar to the SISO signal model except that $H_{k,m}$ is an $N_r \times N_t$ matrix.
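The per-subcarrier model (2.11) can be illustrated with a small noise-free experiment. This is a sketch under assumed toy dimensions (`N = 8`, `L = 3`, `Nt = Nr = 2`) with randomly drawn taps and symbols; the helper names `dft`, `idft` and `cfr` are hypothetical and introduced only here. It applies a per-branch IFFT, a cyclic MIMO convolution and a per-branch FFT, then compares each received subcarrier with $H_{k,m} S(mN+k)$.

```python
import cmath
import random

def dft(x):
    """Unnormalised N-point DFT, matching F_N in the text."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT with the 1/N factor, matching (1/N) F_N^H in (2.2)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def cgauss():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

random.seed(1)
N, L, Nt, Nr = 8, 3, 2, 2                      # toy sizes, assumed for illustration

# h[l][i][j]: l-th tap between TX antenna j and RX antenna i.
h = [[[cgauss() for _ in range(Nt)] for _ in range(Nr)] for _ in range(L)]
# S[j][k]: frequency-domain symbol on subcarrier k of TX antenna j.
S = [[cgauss() for _ in range(N)] for _ in range(Nt)]
s = [idft(S[j]) for j in range(Nt)]            # Nt parallel N-point IFFTs

# Noise-free cyclic convolution, i.e. the signal part of (2.9).
y = [[sum(h[l][i][j] * s[j][(n - l) % N] for l in range(L) for j in range(Nt))
      for n in range(N)] for i in range(Nr)]
Y = [dft(y[i]) for i in range(Nr)]             # Nr parallel N-point FFTs

def cfr(k):
    """H_{k,m}: sum over l of h_{l,m} exp(-j 2 pi k l / N), an Nr x Nt matrix."""
    return [[sum(h[l][i][j] * cmath.exp(-2j * cmath.pi * k * l / N) for l in range(L))
             for j in range(Nt)] for i in range(Nr)]

max_err = 0.0
for k in range(N):
    H = cfr(k)
    for i in range(Nr):
        predicted = sum(H[i][j] * S[j][k] for j in range(Nt))
        max_err = max(max_err, abs(Y[i][k] - predicted))
print("max |Y - H S| over all subcarriers =", max_err)
```

Each subcarrier sees only its own $N_r \times N_t$ matrix, so detection and estimation can proceed tone by tone.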
  • 53. 2.2.2 Preliminary Analysis

Based on assumptions such as perfect synchronization and block fading, we end up with a compact and simple signal model for both the single-antenna OFDM and the MIMO-OFDM systems. It is an idealized model: in a noise-free scenario, the received signal on the $k$-th subcarrier is simply the product (a matrix product in the MIMO case) of the transmitted signal on the $k$-th subcarrier and the discrete-time channel frequency response at the $k$-th subcarrier. Noise in the frequency domain can also be modeled as an additive term. For channel estimation in OFDM systems, this model remains valid, since we assume there is no ICI. For channel estimation of MIMO-OFDM systems, it is appropriate to estimate the channel in the time domain rather than in the frequency domain, because there are fewer parameters in the impulse response ($N_t N_r L$ coefficients) than in the frequency response ($N_t N_r N$ coefficients). Given the limited amount of training data that can be sent to estimate a fast time-varying channel, limiting the number of parameters to be estimated increases the accuracy of the estimation. This is the thrust of the estimation technique in this thesis. The estimation algorithm we propose is based on pilot tones, namely known data in the frequency domain. Since the OFDM signal model in (2.11) is also in the frequency domain, it is necessary to find the relation between the CFR and the CIR. Discrete-time Fourier transform is a perfect tool we
  • 54. can use to describe the relation. It is shown as
\[
H_m = F_{NN_r} \begin{bmatrix} h_m \\ 0_{(N-L)N_r \times N_t} \end{bmatrix},
\]
where
\[
H_m = \begin{bmatrix} H_{0,m} \\ \vdots \\ H_{N-1,m} \end{bmatrix}; \qquad F_{NN_r} = F_N \otimes I_{N_r}.
\]
Since the channel length $L$ is less than the FFT size $N$, only the first $LN_r$ columns of the FFT matrix $F_{NN_r}$ are involved in the calculation. This gives another form of the relation:
\[
H_m = F_{NN_r}(:, 1\!:\!N_r L)\, h_m, \tag{2.12}
\]
where $F_{NN_r}(:, 1\!:\!N_r L)$ is the $N N_r \times N_r L$ submatrix of $F_{NN_r}$ consisting of its first $N_r L$ columns. $F_{NN_r}(:, 1\!:\!N_r L)$ is a “tall” matrix and its left inverse exists, which implies that (2.12) is an overdetermined system. To determine $h_m$, we can simply multiply both sides of the equation on the left by the left inverse of $F_{NN_r}(:, 1\!:\!N_r L)$. This requires full knowledge of the channel frequency response matrix $H_m$, which is in fact not necessary. If we know $L$ of the $N$ matrices $\{H_{0,m}, \ldots, H_{N-1,m}\}$, then $h_m$ can be calculated. For example, in the SISO case, if we know the channel frequency response at any $L$ subcarriers $\{H_{k_1,m}, \ldots, H_{k_L,m}\}$, then the channel impulse response $h(m)$ can be uniquely determined. This is the basis for pilot-tone based channel estimation of OFDM systems. Pilot-tones are the selected subcarriers over which the training data are sent. The question then arises as to which tones should be used as pilot-tones, and what impact the pilot-tone selection has on the quality of estimation.
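The SISO statement above can be tried on a toy case. The sketch below is illustrative only: it assumes `N = 16` and `L = 4`, and it picks $L$ equally spaced subcarriers (offset `p`, spacing `M = N/L`) purely because that choice gives a closed-form inverse. It samples the CFR of a random $L$-tap channel on those tones and recovers the CIR with an $L$-point inverse DFT followed by a phase correction.

```python
import cmath
import random

random.seed(2)
N, L = 16, 4               # toy FFT size and channel length (assumed values)
M = N // L                 # spacing between the selected subcarriers
p = 1                      # offset of the selected subcarrier set, 0 <= p < M

h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(L)]

def cfr(k):
    """Channel frequency response at subcarrier k of the N-point grid."""
    return sum(h[l] * cmath.exp(-2j * cmath.pi * k * l / N) for l in range(L))

# Only L of the N frequency samples are observed: k = p, p + M, ..., p + (L-1)M.
H_sub = [cfr(p + q * M) for q in range(L)]

# L-point inverse DFT of the samples, then removal of the phase exp(-j 2 pi p l / N).
h_hat = [cmath.exp(2j * cmath.pi * p * l / N) / L *
         sum(H_sub[q] * cmath.exp(2j * cmath.pi * q * l / L) for q in range(L))
         for l in range(L)]

max_err = max(abs(a - b) for a, b in zip(h, h_hat))
print("max |h - h_hat| =", max_err)
```

With noise present the recovery is no longer exact, and the choice of tones then affects how much the noise is amplified.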
  • 55. Cioffi’s paper [49] was the first to address this issue, showing that one should choose sets of equally spaced tones as pilot-tones in order to avoid the noise enhancement that occurs when the channel impulse response is interpolated from the frequency response. Assume that $N = ML$ with integer $M > 1$. This is a realistic assumption, since the OFDM block size $N$ is often chosen to be 128, 256 or even larger, and the channel length of a MIMO-OFDM channel is usually not greater than 30. For the typical urban (TU) delay profile [47] with RMS delay $\tau_{rms} = 1.06\,\mu s$, the channel length is $L = \tau_{rms} \times 20\,\mathrm{MHz} + 1 \approx 23$ in an 802.11a system with a bandwidth of 20 MHz. In systems like DVB-T and WiMax [40, 41], $N$ is much larger still. Since $N = ML$, there are $M$ equally sized pilot-tone sets. Define
\[
H_m^{(p)} = \begin{bmatrix} H_{p,m} \\ \vdots \\ H_{p+(L-1)M,m} \end{bmatrix}, \qquad
W_N^{(p)} = \begin{bmatrix} 1 & & & \\ & W_N^{p} & & \\ & & \ddots & \\ & & & W_N^{p(L-1)} \end{bmatrix} \otimes I_{N_r}, \tag{2.13}
\]
where $p$ is any integer such that $0 \le p \le M-1$ and $W_N = e^{-j\frac{2\pi}{N}}$. Clearly $H_m^{(p)}$ is the $p$-th down-sampled version of $H_m$, and $W_N^{(p)}$ simply acts as a shift operator of order $p$. The CFR matrix $H_m$ can be decomposed into $M$ disjoint down-sampled submatrices $\{H_m^{(p)}\}_{p=0}^{M-1}$, each composed of $L$ equally spaced CFR sample matrices. It can be verified by straightforward calculation that
\[
H_m^{(p)} = F_{LN_r}\, W_N^{(p)}\, h_m, \qquad p = 0, 1, \ldots, M-1, \tag{2.14}
\]
where $F_{LN_r} = F_L \otimes I_{N_r}$ is an $LN_r \times LN_r$ DFT matrix. This indicates that the channel state information represented by $h_m$ can be obtained from a down-sampled version of $H_m$, i.e.,
  • 56. $H_m^{(p)}$, which only requires us to probe the unknown channel frequency response with some training data on the selected $p$-th pilot-tone set. The procedure of pilot-tone based channel estimation is illustrated in Figure 2.2.

[Figure 2.2: The concept of pilot-based channel estimation. Diagram labels: $S^{(P)}(m)$, $h_C(m)$, $S^{(P)}(m)h_C(m)$, $V^{(P)}(m)$, $Y^{(P)}(m)$.]

It is also true that
\[
H_m^{(p)}(:, i) = F_{LN_r}\, W_N^{(p)}\, h_m(:, i), \tag{2.15}
\]
where $H_m^{(p)}(:, i)$ and $h_m(:, i)$ are the $i$-th columns of $H_m^{(p)}$ and $h_m$ respectively, and $1 \le i \le N_t$. Having discussed the relation between the CIR $h_m$ and the $p$-th down-sampled CFR $H_m^{(p)}$, we return to the input-output relationship of the MIMO-OFDM system
\[
Y(mN+k) = H_{k,m}\, S(mN+k) + V(mN+k), \tag{2.16}
\]
where
\[
Y(mN+k) = \begin{bmatrix} Y_1(mN+k) \\ \vdots \\ Y_{N_r}(mN+k) \end{bmatrix}; \quad
S(mN+k) = \begin{bmatrix} S_1(mN+k) \\ \vdots \\ S_{N_t}(mN+k) \end{bmatrix}; \quad
V(mN+k) = \begin{bmatrix} V_1(mN+k) \\ \vdots \\ V_{N_r}(mN+k) \end{bmatrix}
\]
  • 57. are the received signal, the transmitted signal and the noise term, respectively, as defined in the previous section; they are repeated here for convenience. To obtain a form that is useful for pilot-tone based channel estimation, we have to manipulate the expression in (2.16) so that the transmitted signal and the CFR terms exchange their positions in the product. Equation (2.16) can be equivalently rewritten as
\[
Y(mN+k) = S_1(mN+k)\, H_{k,m}(:, 1) + \cdots + S_{N_t}(mN+k)\, H_{k,m}(:, N_t) + V(mN+k). \tag{2.17}
\]
Basically, we transform the product of a matrix and a vector into a sum of products of a scalar and a vector. The noise term remains unchanged. This transformation applies to the $k$-th subcarrier. To consider all the $N$ subcarriers, we need to stack the $\{Y(mN+k)\}$ and the $\{H_{k,m}(:, i)\}$ and construct a block diagonal matrix from the $\{S_i(mN+k)\}$. It can be shown that
\[
Y(m) = S_{diag,1}(m)\, H_m(:, 1) + \cdots + S_{diag,N_t}(m)\, H_m(:, N_t) + V(m), \tag{2.18}
\]
where $Y(m)$ and $V(m)$ are the received signal and the noise term, respectively, given by
\[
Y(m) = \begin{bmatrix} Y(mN) \\ \vdots \\ Y(mN+N-1) \end{bmatrix}; \quad
V(m) = \begin{bmatrix} V(mN) \\ \vdots \\ V(mN+N-1) \end{bmatrix}; \quad
H_m(:, i) = \begin{bmatrix} H_{0,m}(:, i) \\ \vdots \\ H_{N-1,m}(:, i) \end{bmatrix},
\]
and
\[
S_{diag,i}(m) = \begin{bmatrix} S_i(mN) & & \\ & \ddots & \\ & & S_i(mN+N-1) \end{bmatrix} \otimes I_{N_r}; \qquad 1 \le i \le N_t.
\]
The dimensions of the above column vectors and matrices are very large; for instance, $Y(m)$ is an $N N_r \times 1$ column vector. The computational load, however, is
  • 58. not changed compared to the expression in (2.10), since $S_{diag,i}(m)$ is a block diagonal matrix. As proved in [49], pilot-tones should be equally powered and equally spaced to achieve the minimum mean squared error (MSE) of channel estimation. Let $\{S_i(mN+p), S_i(mN+M+p), \ldots, S_i(mN+(L-1)M+p)\}$ represent a set of $L$ pilot-tones with index $p$, transmitted simultaneously with the other $N-L$ data signals in the $m$-th block from the $i$-th antenna. One pilot-tone is thus placed every $M$ subcarriers in one OFDM block. Hence we can also form a down-sampled version of equation (2.18) by selecting one element every $M$ subcarriers. Since we assume that there is no ICI, we can neglect the data symbols that are transmitted together with the pilot symbols. We only consider the $p$-th set of pilot-tones on the $p$-th, the $(p+M)$-th, ..., and the $(p+(L-1)M)$-th subcarriers, and likewise for the received signals. This gives
\[
Y^{(p)}(m) = S^{(p)}_{diag,1}(m)\, H_m^{(p)}(:, 1) + \cdots + S^{(p)}_{diag,N_t}(m)\, H_m^{(p)}(:, N_t) + V^{(p)}(m), \tag{2.19}
\]
where
\[
Y^{(p)}(m) = \begin{bmatrix} Y(mN+p) \\ \vdots \\ Y(mN+(L-1)M+p) \end{bmatrix}; \quad
V^{(p)}(m) = \begin{bmatrix} V(mN+p) \\ \vdots \\ V(mN+(L-1)M+p) \end{bmatrix}; \quad
H_m^{(p)}(:, i) = \begin{bmatrix} H_{p,m}(:, i) \\ \vdots \\ H_{(L-1)M+p,m}(:, i) \end{bmatrix},
\]
and
\[
S^{(p)}_{diag,i}(m) = \begin{bmatrix} S_i(mN+p) & & \\ & \ddots & \\ & & S_i(mN+(L-1)M+p) \end{bmatrix} \otimes I_{N_r}; \qquad 1 \le i \le N_t
\]
are all the $p$-th down-sampled versions. Equation (2.19) gives the relation between $Y^{(p)}(m)$ and $H_m^{(p)}(:, i)$. To estimate the channel in the time domain, we need
  • 59. to relate $Y^{(p)}(m)$ explicitly to $h_m$. Plugging (2.15) into (2.19) yields
\[
Y^{(p)}(m) = S^{(p)}_{diag,1}(m)\, F_{LN_r} W_N^{(p)}\, h_m(:, 1) + \cdots + S^{(p)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p)}\, h_m(:, N_t) + V^{(p)}(m). \tag{2.20}
\]
One set of pilot-tones is not adequate to estimate the unknowns $\{h_m(:, 1), \ldots, h_m(:, N_t)\}$. This differs from the SISO case, in which any one of the $M$ pilot-tone sets can be used to estimate the channel. For MIMO-OFDM channel estimation, we need at least $N_t$ disjoint sets of pilot-tones, indexed by $\{p_1, p_2, \ldots, p_{N_t}\}$. It is assumed that $N = ML$, and hence there are in total $M = N/L$ different sets. This imposes a constraint on the selection of the FFT size $N$ for MIMO systems, i.e., $N \ge N_t L$. This observation tallies with the result in [48]. In practice, the selection of $N$ determines the number of subcarriers used in the system. For systems like WLAN and WiMax [39, 40], $N$ is not very large, because a larger $N$ means narrower subcarrier spacing, which may cause severe ICI. Furthermore, those systems often operate in low-SNR environments.

2.3 Channel Estimation and Pilot-tone Design

2.3.1 LS Channel Estimation

Assume that we have $N_t$ disjoint sets of pilot-tones. Then we have the following observation equations:
\[
\begin{aligned}
Y^{(p_1)}(m) &= S^{(p_1)}_{diag,1}(m)\, F_{LN_r} W_N^{(p_1)}\, h_m(:, 1) + \cdots + S^{(p_1)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_1)}\, h_m(:, N_t) + V^{(p_1)}(m) \\
&\;\;\vdots \\
Y^{(p_{N_t})}(m) &= S^{(p_{N_t})}_{diag,1}(m)\, F_{LN_r} W_N^{(p_{N_t})}\, h_m(:, 1) + \cdots + S^{(p_{N_t})}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_{N_t})}\, h_m(:, N_t) + V^{(p_{N_t})}(m)
\end{aligned} \tag{2.21}
\]
To use the LS (least squares) method for channel estimation, we usually put those observation equations into a matrix form. LS is a well-known method and widely used for
  • 60. estimation. We choose LS rather than other methods, such as MMSE channel estimation, for its simplicity of implementation. In matrix form, the observation equations become
\[
Y^{(P)}(m) = S^{(P)}(m)\, h_C(m) + V^{(P)}(m), \tag{2.22}
\]
where
\[
Y^{(P)}(m) = \begin{bmatrix} Y^{(p_1)}(m) \\ \vdots \\ Y^{(p_{N_t})}(m) \end{bmatrix}; \quad
h_C(m) = \begin{bmatrix} h_m(:, 1) \\ \vdots \\ h_m(:, N_t) \end{bmatrix}; \quad
V^{(P)}(m) = \begin{bmatrix} V^{(p_1)}(m) \\ \vdots \\ V^{(p_{N_t})}(m) \end{bmatrix},
\]
and
\[
S^{(P)}(m) = \begin{bmatrix}
S^{(p_1)}_{diag,1}(m)\, F_{LN_r} W_N^{(p_1)} & \cdots & S^{(p_1)}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
S^{(p_{N_t})}_{diag,1}(m)\, F_{LN_r} W_N^{(p_{N_t})} & \cdots & S^{(p_{N_t})}_{diag,N_t}(m)\, F_{LN_r} W_N^{(p_{N_t})}
\end{bmatrix}.
\]
In the above expression, $S^{(P)}(m)$ is an $N_t N_r L \times N_t N_r L$ square matrix, composed of $N_t^2$ pilot-tone block matrices $\{S^{(p_i)}_{diag,j}(m)\}_{i,j=1}^{N_t}$. At each transmit antenna, $N_t$ sets of pilot-tones are transmitted with the same indices $\{p_1, p_2, \ldots, p_{N_t}\}$. Assume that $N_t \le M = N/L$. It can also be seen that the total number of unknown CIR parameters $N_t N_r L$ cannot be greater than the total number of received signals $N N_r$, i.e., $N_t N_r L \le N N_r \Leftrightarrow N_t L \le N \Leftrightarrow N_t \le N/L$. The standard solution for the LS channel estimates [50] is
\[
\hat{h}_{C,LS}(m) = \left[(S^{(P)}(m))^H S^{(P)}(m)\right]^{-1} (S^{(P)}(m))^H\, Y^{(P)}(m). \tag{2.23}
\]
The matrix $S^{(P)}(m)$ is very large: it has $N_t^2 N_r^2 L^2$ elements. Computing the inverse of such a large matrix is undesirable. Therefore, an intuitive solution is to design the square matrix $S^{(P)}(m)$ such that $(S^{(P)}(m))^H S^{(P)}(m) =$
  • 61. $S^{(P)}(m)(S^{(P)}(m))^H = a I_{N_t N_r L}$, $a \in \mathbb{R}^+$, or equivalently such that $\frac{1}{\sqrt{a}} S^{(P)}(m)$ is a unitary matrix. Then the LS channel estimates can be easily obtained as
\[
\hat{h}_{C,LS}(m) = \frac{1}{a}(S^{(P)}(m))^H\, Y^{(P)}(m) = h_C(m) + \frac{1}{a}(S^{(P)}(m))^H\, V^{(P)}(m). \tag{2.24}
\]

2.3.2 Pilot-tone Design

In order to have a simple and efficient LS algorithm for channel estimation, the square matrix $S^{(P)}(m)$ has to be designed deliberately. In this section, the design is illustrated by a theorem and an example. The preamble design discussed in [50] adopted Tarokh’s approach [18] to space-time block code construction. It can be related to orthogonal design, to which our pilot-tone design also has a connection. In each of the first $N_t$ training blocks of a frame, a group of at least $L$ pilot-tones is equally spaced and all the other tones are set to zero. LS channel estimates can then be obtained from the known pilot-tones. The channel is assumed to be unchanged for the rest of the frame. In a mobile environment, however, we cannot guarantee that the channel state information estimated at the $m$-th block still holds at the $(m+N_t)$-th block. Hence the preamble design in [50] is not suitable for fast time-varying channels. In addition to this common disadvantage, the training sequences designed in [48] have to satisfy a condition called local orthogonality. It requires that the $N_t$ different training sequences of length $N$ be orthogonal over the minimum set of elements for any starting position. The pilot design proposed in this chapter aims to remove both the disadvantage and the constraint mentioned above. It actually has its
  • 62. roots in Table I of [16], but it is not implemented in the space and time domains. Instead, it is accomplished in the space and frequency domains. We explicitly connect pilot-tone design with space-frequency coding, which gives more insight into the design. Denote by $E_P$ the fixed total power of all the pilot-tones at each transmit antenna. Then the power allocated to each pilot-tone is $\frac{E_P}{N_t L}$, since the pilot-tones are all equally spaced and equally powered. In some systems, the power of the pilot-tones could be larger than the power of the data symbols for a better estimation of the wireless channel. We assume in our work that the pilot-tones and the data are all equally normalized, such that the average power is the same for all the mappings. Our pilot-tone design is stated in the following theorem.

Theorem 2.1 Let $S^{(p_i)}_{diag,j}(m) = \alpha_{p_i,j}\, I_{LN_r}$, $|\alpha_{p_i,j}| = \sqrt{\frac{E_P}{N_t L}}$, $i, j = 1, 2, \ldots, N_t$. Then $\frac{1}{\sqrt{E_P}} S^{(P)}(m)$ is a unitary matrix if
\[
S^{(P)}_{SFC}(m) = \sqrt{\frac{L}{E_P}} \begin{bmatrix}
S^{(p_1)}_{diag,1}(m) & \cdots & S^{(p_1)}_{diag,N_t}(m) \\
\vdots & \ddots & \vdots \\
S^{(p_{N_t})}_{diag,1}(m) & \cdots & S^{(p_{N_t})}_{diag,N_t}(m)
\end{bmatrix}
\]
is a unitary matrix.
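Theorem 2.1 can be checked numerically on a small instance. The sketch below is only an illustration under assumed values: `N = 8`, `L = 2`, `Nt = 2`, a single receive antenna (so that every block is `L x L`), `EP = 3.0`, and pilot-set indices 0 and 1. The coefficients `alpha` are taken from a scaled 2-point DFT matrix, which is one possible choice that makes $S^{(P)}_{SFC}(m)$ unitary. The helpers `matmul`, `herm` and `W` are hypothetical names used only here.

```python
import cmath

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def herm(A):
    """Conjugate transpose."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

N, L, Nt, EP = 8, 2, 2, 3.0        # toy values; Nr = 1 is assumed throughout
pilots = [0, 1]                    # assumed pilot-set indices p_1, p_2 (0 <= p < N/L)

# Unnormalised L-point DFT matrix, so that F F^H = L I.
F = [[cmath.exp(-2j * cmath.pi * r * c / L) for c in range(L)] for r in range(L)]

def W(p):
    """W_N^(p) = diag(1, W_N^p, ..., W_N^{p(L-1)}) with W_N = exp(-j 2 pi / N)."""
    return [[cmath.exp(-2j * cmath.pi * p * r / N) if r == c else 0j
             for c in range(L)] for r in range(L)]

# Equal-magnitude coefficients; the 2x2 matrix sqrt(L/EP) * alpha is unitary.
amp = (EP / (Nt * L)) ** 0.5
alpha = [[amp * cmath.exp(-2j * cmath.pi * i * j / Nt) for j in range(Nt)]
         for i in range(Nt)]

# Assemble S^(P): block (i, j) is alpha[i][j] * F_L * W^(p_i), as in (2.22).
S = [[0j] * (Nt * L) for _ in range(Nt * L)]
for i in range(Nt):
    FW = matmul(F, W(pilots[i]))
    for j in range(Nt):
        for r in range(L):
            for c in range(L):
                S[i * L + r][j * L + c] = alpha[i][j] * FW[r][c]

G = matmul(S, herm(S))             # should equal EP times the identity
max_dev = max(abs(G[r][c] - (EP if r == c else 0))
              for r in range(Nt * L) for c in range(Nt * L))
print("max deviation of S S^H from EP * I =", max_dev)
```

Because the Gram matrix is a scaled identity, the inverse in (2.23) reduces to a division by $E_P$ in this instance.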
  • 63. Proof.
\[
\begin{aligned}
S^{(P)}(m) &= \begin{bmatrix}
F_{LN_r}\, S^{(p_1)}_{diag,1}(m)\, W_N^{(p_1)} & \cdots & F_{LN_r}\, S^{(p_1)}_{diag,N_t}(m)\, W_N^{(p_1)} \\
\vdots & \ddots & \vdots \\
F_{LN_r}\, S^{(p_{N_t})}_{diag,1}(m)\, W_N^{(p_{N_t})} & \cdots & F_{LN_r}\, S^{(p_{N_t})}_{diag,N_t}(m)\, W_N^{(p_{N_t})}
\end{bmatrix} \\
&= \begin{bmatrix}
F_{LN_r} W_N^{(p_1)}\, S^{(p_1)}_{diag,1}(m) & \cdots & F_{LN_r} W_N^{(p_1)}\, S^{(p_1)}_{diag,N_t}(m) \\
\vdots & \ddots & \vdots \\
F_{LN_r} W_N^{(p_{N_t})}\, S^{(p_{N_t})}_{diag,1}(m) & \cdots & F_{LN_r} W_N^{(p_{N_t})}\, S^{(p_{N_t})}_{diag,N_t}(m)
\end{bmatrix} \\
&= \mathbf{F}_{LN_r}\, W_N^{(P)} \sqrt{\frac{E_P}{L}}\, S^{(P)}_{SFC}(m),
\end{aligned}
\]
where
\[
\mathbf{F}_{LN_r} = \begin{bmatrix} F_{LN_r} & & \\ & \ddots & \\ & & F_{LN_r} \end{bmatrix}, \qquad
W_N^{(P)} = \begin{bmatrix} W_N^{(p_1)} & & \\ & \ddots & \\ & & W_N^{(p_{N_t})} \end{bmatrix}.
\]
It is easy to see that $\mathbf{F}_{LN_r}\mathbf{F}_{LN_r}^H = \mathbf{F}_{LN_r}^H\mathbf{F}_{LN_r} = L\, I_{N_t N_r L}$ and that $W_N^{(P)}$ is a unitary matrix. Hence $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_P\, I_{N_t N_r L}$. This completes the proof. □

Note that the theorem assumes that the $L$ pilot-tones within each of the $N_t$ pilot-tone sets are identical, i.e., each $S^{(p_i)}_{diag,j}(m)$ is a scaled identity matrix. This is what allows the factors to be reordered in the proof: in general, the product $AB$ of two square matrices is not equal to $BA$, but $A_{n \times n} B_{n \times n} = B_{n \times n} A_{n \times n}$ holds if $B$ is a scaled identity matrix. This assumption greatly simplifies the pilot-tone design for a MIMO-OFDM system with a large number of transmit antennas, since it reduces the problem to the design of a square orthogonal matrix. Hence we are more interested in the design
  • 64. of $S^{(P)}_{SFC}(m)$. First we consider a simple example with 2 transmit antennas and 2 receive antennas, i.e., $N_t = N_r = 2$ in the previous equations. Assume the channel length is $L = 4$. Following Theorem 2.1, we use Alamouti’s structure [16]
\[
\begin{bmatrix} x & y \\ -y^* & x^* \end{bmatrix}, \qquad |x|^2 + |y|^2 = \frac{E_P}{4}, \quad x, y \in \mathbb{C}.
\]
The above leads to the design
\[
S^{(P)}_{SFC}(m) = \sqrt{\frac{4}{E_P}} \begin{bmatrix}
S^{(p_1)}_{diag,1}(m) & S^{(p_1)}_{diag,2}(m) \\
S^{(p_2)}_{diag,1}(m) & S^{(p_2)}_{diag,2}(m)
\end{bmatrix}, \tag{2.25}
\]
where
\[
S^{(p_1)}_{diag,1}(m) = x I_8, \quad S^{(p_1)}_{diag,2}(m) = y I_8, \quad
S^{(p_2)}_{diag,1}(m) = -y^* I_8, \quad S^{(p_2)}_{diag,2}(m) = x^* I_8.
\]
The placement of the pilot-tones in this example is shown in Figure 2.3. In the figure, the red and purple square boxes represent the first and the second pilot-tone sets for TX antenna 1, respectively, and the green and light blue boxes represent those for TX antenna 2. The pilot-tones are all equally spaced, and the same color within each set indicates that they carry the same pilot symbol. In this example there are 16 pilot-tones in total, and they are easily allocated to the two TX antennas by our proposed method. The square matrix $S^{(P)}_{SFC}(m)$ is actually a space-frequency code. The column direction corresponds to the TX antennas, namely the spatial domain; the row direction corresponds to the different pilot-tone sets, namely the frequency domain. Hence our design makes explicit the connection between conventional pilot-tone design and space-frequency code design [32, 33] aimed at performance enhancement. When we have more than 2 transmit antennas, i.e., $N_t \ge 3$, it is also very easy
  • 65. [Figure 2.3: Pilot placement with Nt = Nr = 2. Panels for Tx_1 and Tx_2 over the m-th, (m+1)-th and (m+2)-th OFDM symbols; subcarrier axis marked 1, 8, 16, 24, 32. Legend: 1st pilot set @ Tx_1, 2nd pilot set @ Tx_1, 1st pilot set @ Tx_2, 2nd pilot set @ Tx_2, data.]
  • 66. to design an $N_t N_r L \times N_t N_r L$ unitary matrix $S^{(P)}_{SFC}(m)$. Based on the assumption in Theorem 2.1 that all the pilot-tones within one set are the same, the design of $S^{(P)}_{SFC}(m)$ can be simplified to the design of an $N_t \times N_t$ unitary matrix $S$, and the dimension of the design problem is reduced from $N_t N_r L$ to $N_t$:
\[
S = \sqrt{\frac{L}{E_P}} \begin{bmatrix}
\alpha_{p_1,1} & \cdots & \alpha_{p_1,N_t} \\
\vdots & \ddots & \vdots \\
\alpha_{p_{N_t},1} & \cdots & \alpha_{p_{N_t},N_t}
\end{bmatrix}_{N_t \times N_t}.
\]
Choose $\alpha_{p_i,j} = \sqrt{\frac{E_P}{L N_t}}\, e^{-\tilde{j}\frac{2\pi}{N_t} ij}$, $\forall i, j \in \{1, 2, \ldots, N_t\}$, $\tilde{j} = \sqrt{-1}$. Then $S$ can be shown to be a unitary matrix; it is essentially an $N_t$-point FFT matrix scaled by $1/\sqrt{N_t}$. After obtaining the $\{\alpha_{p_i,j}\}_{i,j=1}^{N_t}$, $S^{(P)}_{SFC}(m)$ can be easily constructed from Theorem 2.1 by mapping each scalar to a diagonal matrix whose diagonal elements are all equal to that scalar.

2.3.3 Performance Analysis

With the total power $E_P$ fixed, the pilot-tones designed in the previous section can be shown to be optimal in the sense that they achieve the minimum mean squared error of the channel estimation. This is shown in the following. From (2.24), the MSE of the channel estimates $\hat{h}_{C,LS}(m)$ is given by
\[
\begin{aligned}
\mathrm{MSE}_m &= \frac{1}{N_t N_r L}\, E\left\{\left\|\hat{h}_{C,LS}(m) - h_C(m)\right\|^2\right\} \\
&= \frac{1}{E_P^2 N_t N_r L}\, E\left\{\left\|(S^{(P)}(m))^H V^{(P)}(m)\right\|^2\right\} \\
&= \frac{1}{E_P^2 N_t N_r L}\, \mathrm{tr}\left\{(S^{(P)}(m))^H E\left[V^{(P)}(m) V^{(P)}(m)^H\right] S^{(P)}(m)\right\} \\
&= \frac{\sigma_n^2}{E_P^2 N_t N_r L}\, \mathrm{tr}\left\{(S^{(P)}(m))^H I_{N_t N_r L}\, S^{(P)}(m)\right\},
\end{aligned} \tag{2.26}
\]
where $\sigma_n^2$ denotes the variance of each element of $V^{(P)}(m)$. Since $S^{(P)}(m)(S^{(P)}(m))^H = (S^{(P)}(m))^H S^{(P)}(m) = E_P\, I_{N_t N_r L}$, the MSE achieves its minimum $\mathrm{MSE}_{min} = \frac{\sigma_n^2}{E_P}$. At this point, we can find that the unitary matrix design
  • 67. 54 not only reduces the complexity of the channel estimator, but also ensures that it has the least estimation error, if the pilot-tones have fixed transmit power.
2.4 An Illustrative Example and Concluding Remarks
2.4.1 Comparison With Known Result
In this section, we demonstrate the performance of the proposed channel estimation based on our optimal pilot-tone design through computer simulations. To show the performance improvement clearly, the channel estimation technique of [50] is also simulated. We consider a typical MIMO-OFDM system with 2 transmit antennas and 2 receive antennas. The OFDM block size is chosen as N = 128 and a CP of length 16 is prepended to each OFDM symbol. The four sub-channel paths denoted by {h11, h12, h21, h22} are assumed to be independent of each other, and each has a CIR of length L = 16. The CIR coefficients in each sub-channel are simulated by the Jakes' model [51]. Our simulation is conducted in two ways:
• Method I: Place two sets of L = 16 pilot-tones into each OFDM block, with the pilot-tones equally spaced and equally powered as shown in Figure 2.3;
• Method II: Set the first two OFDM blocks of each data frame, which includes ten OFDM blocks, as preamble. Put L = 16 equally-spaced and equally-powered
  • 68. 55 pilot-tones into each of the first two preamble blocks and set all the other tones to zero (see [50] for details).
To illustrate the mobile environments, different Doppler shifts are simulated as fd = 5, 20, 40, 100 and 200 Hz. The performance of the system is measured in terms of the MSE of the two channel estimation schemes mentioned above and the symbol error rate (SER) versus SNR. For a reliable simulation, a total of 10,000 frames are transmitted for each test. The average values of MSE and SER are then taken as the measurements. In Figure 2.4, the Doppler shift is 5 Hz and the two curves marked with "known channel" serve as the performance bound, since the channel state information is known exactly. This is totally unrealistic and is included only for the purpose of comparison. The two curves corresponding to RX antenna 1 and RX antenna 2 nearly coincide. This matches our expectation, since there is no statistical difference between the two receive antennas. The two curves generated by channel estimation based on our optimal pilot-tone design are close to the performance bound, with only a narrow gap due to the ever-existing channel estimation error. In contrast, the two curves generated by channel estimation based on the technique in [50] are far from the performance limit, even at a large SNR. This justifies our point that the method based on a preamble at the beginning of a frame is not applicable to a fast-varying wireless channel. From Figure 2.4 to Figure 2.6, the performance of the system based on the proposed pilot-tone design does not change much since it keeps tracking
  • 69. 56 the channel by the pilot-tones in each OFDM block. The difference between the two estimation schemes is illustrated in the MSE plots. In Figure 2.7, for a fixed SNR value, the curves for different Doppler spreads do not change much, which implies that the proposed method is able to track the fast time-varying channel. For a specific SNR value, the curves in Figure 2.8 do change with the Doppler shift. It can be seen that the estimation error when fd = 200 Hz is much larger than the one when fd = 5 Hz in Figure 2.8. It indicates that the method based on preambles works poorly when the Doppler spread is small, and does not work when the channel is changing quickly.
[Figure 2.4: Symbol error rate versus SNR with Doppler shift = 5 Hz. Axes: symbol error rate (log scale) versus SNR (in dB), 5 to 30 dB. Curves: Rx1 and Rx2 for known channel, pilot-tone-based and preamble-based estimation.]
  • 70. 57 [Figure 2.5: Symbol error rate versus SNR with Doppler shift = 40 Hz. Axes: symbol error rate (log scale) versus SNR (in dB), 5 to 30 dB. Curves: Rx1 and Rx2 for known channel, pilot-tone-based and preamble-based estimation.]
[Figure 2.6: Symbol error rate versus SNR with Doppler shift = 200 Hz. Same axes and curves as Figure 2.5.]
  • 71. 58 [Figure 2.7: Normalized MSE of channel estimation based on optimal pilot-tone design. Surface plot of normalized MSE (scale ×10^-3) versus Doppler shift (in Hz) and SNR (in dB).]
[Figure 2.8: Normalized MSE of channel estimation based on preamble design. Surface plot of normalized MSE (scale ×10^-3) versus Doppler shift (in Hz) and SNR (in dB).]
  • 72. 59 2.4.2 Chapter Summary
We presented a new optimal pilot-tone design for MIMO-OFDM channel estimation. $N_t$ sets of $L$ pilot-tones coded in $S^{(P)}_{SFC}(m)$ are transmitted at each antenna simultaneously and the channel can be estimated optimally. The main advantages are its ability to handle fast time-varying systems, since the channel can be estimated at each OFDM block, and its simplicity, since the orthogonal design allows the MIMO system to be processed easily in parallel.
For an $N_t \times N_r$ MIMO system, the complexity of any kind of signal processing algorithm at the physical layer is usually increased by a factor of $N_t N_r$. To name a few, channel estimation, carrier frequency offset estimation and correction, and IQ imbalance compensation all become very challenging in the MIMO case. In this chapter, we provide solutions to the following "how" questions. How many pilot tones are needed? How are they placed in one OFDM block? Most importantly, how fast can channel estimation be accomplished? We propose a pilot tone design for MIMO-OFDM channel estimation in which $N_t$ disjoint sets of pilot tones are placed on one OFDM block at each transmit antenna. Each pilot tone set has $L$ pilot tones which are equally spaced and equally powered. The pilot tones from different transmit antennas form a unitary matrix, so a simple least-squares estimate of the MIMO channel is easily obtained by taking advantage of the unitarity of the pilot tone matrix. There is no need to compute the inverse of a large matrix, which is usually required by the LS algorithm.
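The role of unitarity summarized above can be checked numerically. The following is a minimal sketch under stated assumptions, not the simulation code of this chapter: it uses a simplified $N_t \times N_t$ DFT-like pilot matrix scaled so that $S S^H = E_P I$ (the factor $L$ and the block-diagonal expansion of Theorem 2.1 are omitted), a generic model $y = S h + v$ with white noise of variance $\sigma_n^2$, and the hypothetical helper names `pilot_matrix` and `ls_estimate`.

```python
import numpy as np

def pilot_matrix(n_tx, e_p):
    """DFT-like pilot matrix S with S @ S^H = e_p * I (illustrative scaling)."""
    idx = np.arange(1, n_tx + 1)
    phase = np.exp(-2j * np.pi * np.outer(idx, idx) / n_tx)
    return np.sqrt(e_p / n_tx) * phase

def ls_estimate(s, y, e_p):
    """LS estimate without matrix inversion: h_hat = S^H y / e_p."""
    return s.conj().T @ y / e_p

n_tx, e_p, sigma2 = 4, 8.0, 0.01   # assumed toy parameters
s = pilot_matrix(n_tx, e_p)

# Largest deviation of S S^H from e_p * I (should be at rounding level).
gram_error = np.abs(s @ s.conj().T - e_p * np.eye(n_tx)).max()

# Monte Carlo estimate of the per-coefficient MSE.
rng = np.random.default_rng(0)
trials = 20000
shape = (n_tx, trials)
h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
v = np.sqrt(sigma2 / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
y = s @ h + v
h_hat = ls_estimate(s, y, e_p)
mse = np.mean(np.abs(h_hat - h) ** 2)

print("max |S S^H - E_P I| :", gram_error)
print("empirical MSE       :", mse)
print("sigma_n^2 / E_P     :", sigma2 / e_p)
```

Under these assumptions the empirical MSE should settle near $\sigma_n^2 / E_P$, the minimum derived in Section 2.3.3, and the estimator needs only a Hermitian transpose and a scaling.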
  • 73. 60 In a highly mobile environment, such as a mobile user in a vehicle traveling at more than 100 km/h, the wireless channel may change within one or a small number of symbols. For example, as in [30], in the IEEE 802.16-2004 Standard with N = 256, G = 44 (N: FFT size; G: guard interval) and 3.5 MHz full bandwidth, the symbol duration is about 73 microseconds. For a user in a vehicle traveling at 100 km/h, the channel coherence time is about 1100 microseconds. That means the wireless channel varies after around 15 symbols. In a real-time communication scenario, the information packet could contain hundreds of data symbols or even more. The scheme proposed in this chapter distributes the pilot symbols of the preamble over every OFDM block for channel estimation. Since the pilot tones are placed on each OFDM block, the channel state information can be estimated accurately and quickly, no matter how fast the channel condition is varying. It is fair to point out that we may have a higher overhead rate compared to the methods in the literature. For a slow time-varying channel, our pilot design can therefore be applied by placing pilot tones only on every few OFDM blocks, which reduces the throughput loss.
The orthogonal pilot tone matrix is indeed a space-frequency code. The row direction of the matrix stands for the different pilot tone sets in the frequency domain, and the column direction represents the individual transmit antennas in the spatial domain. It can be readily extended to an $N_t \times N_r$ MIMO system by constructing an $N_t \times N_t$ orthogonal matrix. With this explicit relation to space-frequency code, the
  • 74. 61 design of the pilot tone matrix for MIMO-OFDM channel estimation can be conducted from a broader perspective.
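The timing figures quoted earlier in this summary for IEEE 802.16-2004 can be reproduced with a short calculation. The sketch below assumes that the quoted 73 microseconds is the useful symbol time N/B with the guard interval excluded; this reading is an assumption, since the text does not state how the figure was obtained. The coherence time of 1100 microseconds is taken directly from the text.

```python
# Worked arithmetic for the quoted 802.16-2004 example (assumed reading: T = N / B).
n_fft = 256                    # FFT size N
bandwidth_hz = 3.5e6           # full bandwidth B
coherence_time_us = 1100.0     # coherence time quoted in the text

symbol_duration_us = n_fft / bandwidth_hz * 1e6
symbols_per_coherence = coherence_time_us / symbol_duration_us

print(f"symbol duration      : {symbol_duration_us:.1f} us")
print(f"symbols per coherence: {symbols_per_coherence:.1f}")
```

With these inputs the channel stays roughly constant over about 15 symbols, which is why a preamble placed only at the start of a long frame becomes stale well before the frame ends.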
  • 75. Chapter 3
Wireless Location for OFDM-based Systems
3.1 Introduction
Wireless networks are primarily designed and deployed for voice and data communications. The widespread availability of wireless nodes, however, makes it feasible to utilize these networks for wireless location purposes as an alternative to the GPS (global positioning system) location service. It is expected that location-based applications will play an important role in future wireless markets. The commercially available location technology is implemented on cellular networks and WLAN, such as E911 (Enhanced 911) and indoor positioning with WiFi (wireless fidelity). In this dissertation, we investigate wireless location technology aimed at a different network, i.e., the WiMax system.
3.1.1 Overview of WiMax
WiMax is an acronym for Worldwide Interoperability for Microwave Access. It is not only a technical term indicating a new wireless broadband technology, but also is
  • 76. 63 referred to as a series of new products working on this network. WiMax-based wireless gear has not come to the market yet. But people are already very familiar with WiFi-based products such as notebook wireless cards and wireless routers from Linksys, D-Link and Belkin, which they use to check email or surf the Internet wirelessly on campus or at airports, hotels, bookstores and coffee shops. WiFi stands for Wireless Fidelity and it is the first available technology for WLAN and wireless home networking. However, it is constrained by its limited coverage of about 50-100 meters and its relatively low data rate. Unlike WiFi, WiMax is a broadband wireless access technology that provides very high data throughput over long distances in point-to-multipoint, line of sight (LOS) or non-line of sight (NLOS) environments. In terms of coverage, WiMax can provide seamless wireless services up to 20 or 30 miles away from the base station. It also has an IEEE name, 802.16-2004, and it is this IEEE standard that defines the specifics of the WiMax air interface.
WiMax Standards
Actually microwave access is not a new technology for broadband systems. Proprietary point-to-multipoint broadband access products from companies like Alcatel and Siemens have existed for decades. They did not become popular because they are extremely proprietary. Today's WiMax attempts to standardize the technology to reduce the cost and to increase the range of applications. The current standard for WiMax is IEEE Std 802.16-2004. It can be easily downloaded from the IEEE website. With
  • 77. 64 its approval in June 2004, it renders the previous standard IEEE Std 802.16-2001 and its two amendments 802.16a and 802.16c obsolete. At present, IEEE 802.16-2004 addresses only fixed broadband systems. IEEE 802.16 Task Group e is working on an amendment to add a mobility component to the standard. The new standard may be named IEEE 802.16e.
WiMax Applications
We have seen a lot of marketing effort on WiMax applications at conferences, exhibitions and in other media. People are wondering whether it is a must-have technology for the near future. Consider what kinds of broadband services are available today. We usually resort to a landline connection with T1, DSL or cable modems. WiMax, or 802.16, is proposed to address the first mile/last mile wireless connection in a metropolitan area network. It can change the last-mile connection as much as 802.11 changed the last hundred feet of the connection. This applies not only to rural areas, but also to any place where the cost of laying or upgrading landlines to broadband capability is prohibitively expensive. WiMax's primary use will most likely come in the form of metropolitan area networks. In terms of services and applications, it is different from the traditional WiFi standards, which include 802.11a, 802.11b and 802.11g. WiFi technology, with a maximum range of 800 feet outdoors, is mainly intended for local area networks that serve residential homes, public hot spots like airports, hotels and coffee shops, and small business buildings. With its much longer range, in theory WiMax can reach a
  • 78. 65 maximum of 31 miles, and WiMax can provide broadband services to thousands of homes in a metropolitan area. Imagine that a broadband service provider can serve thousands of residential homes and small and large business buildings without the cost of laying physical lines and dispatching technicians to install and maintain them. The savings will push providers to choose WiMax and to reduce the fees charged to their customers. Another driving force for WiMax is its speed. It can transfer data at a rate of up to 70 Mbps, which is equivalent to almost 60 T1 lines. The combination of long range and high speed is why the applications of WiMax appear endless. Although all of this sounds promising, real WiMax products are not commercially available yet. There are only some pre-WiMax products based on the standard coming up, but true WiMax products are expected soon. For example, Intel's PRO/Wireless 5116 is a highly integrated IEEE Std 802.16-2004 compliant system on chip for both licensed and license-exempt radio frequencies.
3.1.2 Overview to Wireless Location System
Wireless location refers to the determination of the geographic coordinates, or in a more general sense even the velocity and the heading, of a mobile user/device in a cellular, WLAN or GPS environment. Usually wireless location technologies fall into two main categories: handset-based and network-based. In handset-based location systems [55], the mobile station, equipped with extra electronics, determines its location from signals received from the base stations or from the GPS satellites. In GPS-based estimation, the MS (mobile station) receives and measures the signal
  • 79. 66 parameters from at least four satellites of the existing constellation of 24 GPS satellites. The parameter that the MS measures is the time for each satellite signal to reach the MS. GPS systems have a relatively high degree of accuracy and they also provide global location information. However, embedding a GPS receiver into mobile devices leads to increased cost, size and battery consumption. It also requires the replacement of millions of mobile handsets that are already in the market with new GPS-featured handsets. In addition, the accuracy of GPS measurements degrades in urban and indoor environments. For these reasons, some wireless carriers may be unwilling to embrace GPS fully as the only location technology.
On the other hand, network-based location technology relies on the existing network infrastructure to determine the position of a mobile user by measuring its signal parameters when received at the network BSs (base stations). This may require some hardware upgrade or installation at the BSs, but the cost can be shared by a huge number of mobile subscribers and it does not affect how users operate their mobile devices. In this technology, the BSs measure the signal transmitted from an MS and relay the measurements to a central site for further processing and data fusion to provide an estimate of the MS location. Network-based technologies have the significant advantage that the MS is not involved in the location-finding process; thus the technology does not require modifications to existing handsets. However, unlike GPS location systems, many aspects of network-based location are not yet fully studied. In Figure 3.1, network-based wireless location technology is illustrated.
  • 80. 67 [Figure 3.1: Network-based wireless location technology (outdoor environments). The diagram shows an MS at distances r1, r2 and r3 from BS1 (x1, y1), BS2 (x2, y2) and BS3 (x3, y3); each BS has a T(D)OA/AOA estimator that feeds a data fusion center.]
Network-based wireless location technology gains more recognition with the increasing number of wireless subscribers and the demand for location-oriented services such as E911. It is estimated [56, 57] that location-based services will generate annual revenues on the order of $15B worldwide. In the U.S. alone, about 170 million mobile subscribers are expected to become covered by the FCC-mandated location accuracy for emergency services. The following is a partial list of applications that will be enhanced by using wireless location information [58].
• E911. Nowadays a high percentage of E911 calls are generated from mobile phones; the percentage is estimated [59, 60] to be one third of all 911 calls (170,000 per day). These wireless 911 calls do not receive the same quality of emergency assistance as fixed-network 911 calls. This is due to
  • 81. 68 the unknown position of the wireless 911 caller. To fix this problem, the FCC issued an order on July 12, 1996 [59], requiring all wireless service providers to report accurate MS location to the E911 operator at the PSAP (public safety answering point). The FCC order mandated that within five years from the effective date of the order, October 1, 1996, wireless service providers must convey to the PSAP the location of the MS within 100 meters of its actual position for at least 67 percent of all wireless E911 calls. This FCC order has motivated considerable research effort towards developing accurate wireless location algorithms for cellular networks and has led to significant enhancement of wireless location technology. • Mobile advertising. Location-specific advertising and marketing will benefit once the location information is available. For example, stores would be able to track customer locations and to attract customers by flashing customized coupons on their wireless devices [61]. In addition, a cellular phone or a PDA (personal digital assistant) could act as a handy mobile yellow pages on demand. • Asset tracking (indoor/outdoor). Wireless location technology can also assist in advanced public safety applications such as locating and retrieving lost children, patients, or even pets. In addition, it can be used to track personnel/assets in a hospital or a manufacturing site to provide more efficient management of assets and personnel. One could also consider applications such as smart and interactive tour guides, smart shopping guides that lead shoppers
  • 82. 69 based on their location in a store, smart traffic control in parking structures that guides cars to free parking slots. Department stores, enterprises, hospitals, manufacturing sites, malls, museums, and campuses are some of the potential end-users to benefit from the technology. • Fleet management. Many fleet operators, such as police force, emergency vehicles, and other services including shuttle and taxi companies, can make use of the wireless location technology to track and operate their vehicles in an efficient way in order to minimize the response time. In addition, a large number of drivers on roads and highways carry cellular phones while driving. The wireless location technology can help track these phones, thus transforming them into sources of real-time traffic information that can be used to enhance transportation safety. • Location-based wireless security. New location-based wireless security schemes can be developed to add a level of security to wireless networks against being intercepted or hacked into. By using location information, only people at certain specific areas could access certain files or databases through a WLAN. • Location sensitive billing. Using the location information of wireless users, wireless service providers can offer variable-rate call plans or services that are based on the caller location.
  • 83. 70 3.1.3 Review of Data Fusion Methods
We assume that the location is specified by (x, y) for simplicity. As shown in Figure 3.1, the data fusion center determines the mobile user location by exploiting all the estimated signal parameters from the BSs. The most common signal parameters are the time, angle and amplitude of arrival of the MS signal, and different data fusion algorithms have been proposed accordingly. The material in this section is mainly based on the survey paper [53].
• Time. By combining the estimates of the TOA (time of arrival) of the MS signal when received at the BSs, the MS location can be determined in a wireless network with three or more BSs. It is illustrated in Figure 3.2.
[Figure 3.2: TOA/TDOA data fusion using three BSs. The diagram shows BS1 at (0, 0), BS2 at (x2, y2) and BS3 at (x3, y3), with the MS at (xT, yT) at distances r1, r2 and r3 from the three BSs.]
Without loss of generality, the geometric coordinate of BS1 is assumed to be (0, 0). The location
  • 84. 71 of other BSs are denoted by $(x_k, y_k)$, $k = 2, 3$. Obviously $x_1 = y_1 = 0$. Since the radio signal travels at the speed of light ($c = 3 \times 10^8$ m/s), the distance between the MS and BS$_k$ is given by
\[ R_{k,T} = (t_k - t_o)c, \tag{3.1} \]
where $t_o$ is the time instant when the MS starts transmitting signal and $t_k$ is the time of arrival of the MS signal at BS$_k$. The distances $\{R_{k,T}\}_{k=1}^{3}$ can be used to estimate the MS location $(x_T, y_T)$ by solving the following set of equations
\[ \begin{aligned} R_{1,T}^2 &= x_T^2 + y_T^2 \\ R_{2,T}^2 &= (x_2 - x_T)^2 + (y_2 - y_T)^2 \\ R_{3,T}^2 &= (x_3 - x_T)^2 + (y_3 - y_T)^2. \end{aligned} \tag{3.2} \]
To solve the above overdetermined nonlinear system of equations, we can reformulate (3.2) into an LS-type presentation by subtracting the first equation from the second and the third equations respectively. Hence the following equation is obtained
\[ \begin{aligned} R_{2,T}^2 - R_{1,T}^2 &= x_2^2 + y_2^2 - 2(x_2 x_T + y_2 y_T) \\ R_{3,T}^2 - R_{1,T}^2 &= x_3^2 + y_3^2 - 2(x_3 x_T + y_3 y_T). \end{aligned} \tag{3.3} \]
In a matrix form, it can be rewritten as
\[ \begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} = \frac{1}{2} \begin{bmatrix} R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\ R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \end{bmatrix}, \tag{3.4} \]
where $R_k = \sqrt{x_k^2 + y_k^2}$ is the distance of the base station BS$_k$ to the origin point in the coordinate, and clearly $R_1 = 0$. If we have more than three BSs, a compact form can be obtained in a similar way as
\[ b = A\theta, \tag{3.5} \]
  • 85. 72 where
\[ b = \frac{1}{2} \begin{bmatrix} R_2^2 - (R_{2,T}^2 - R_{1,T}^2) \\ R_3^2 - (R_{3,T}^2 - R_{1,T}^2) \\ R_4^2 - (R_{4,T}^2 - R_{1,T}^2) \\ \vdots \end{bmatrix}; \quad A = \begin{bmatrix} x_2 & y_2 \\ x_3 & y_3 \\ x_4 & y_4 \\ \vdots & \vdots \end{bmatrix}; \quad \theta = \begin{bmatrix} x_T \\ y_T \end{bmatrix}. \]
A standard LS estimation of $\theta$ is given by
\[ \hat{\theta} = (A^T A)^{-1} A^T b. \tag{3.6} \]
Note that $R_{1,T}^2$ is a function of $x_T$ and $y_T$ as defined in (3.2). Hence (3.6) only provides an intermediate solution, and the estimates $\hat{x}_T$ and $\hat{y}_T$ can be obtained by solving the resultant quadratic equation. Clearly the TOA data fusion method requires perfect timing between the MS and the BSs, since a small offset of a few microseconds between the MS clock and the BS clock translates into hundreds of meters of error in the location estimate. But the current wireless network standards only mandate tight timing synchronization among BSs [62]. The accuracy of the TOA method is heavily dependent on the timing between BS and MS. An alternative is to use the TDOA (time difference of arrival), which helps avoid the MS clock synchronization problem. Define the TDOA associated with the base station BS$_k$ as $\Delta t_{k,1} = t_k - t_1$, i.e., the difference between the TOA of the MS signal at BS$_k$ and at BS$_1$. Then the difference between $R_{k,T}$ and $R_{1,T}$ can be related to $\Delta t_{k,1}$ as
\[ \Delta R_{k,1} = R_{k,T} - R_{1,T} = (t_k - t_o)c - (t_1 - t_o)c = \Delta t_{k,1} c. \tag{3.7} \]
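The least-squares step (3.6) and the cancellation in (3.7) can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the base station layout, the MS position and the 1 microsecond clock offset are hypothetical values, the measurements are noise-free, and all measured ranges, including $R_{1,T}$, are treated as known when forming $b$. The helper name `toa_ls` is not from the text.

```python
import numpy as np

C = 3e8  # speed of light in m/s, as in (3.1)

# Hypothetical geometry: BS1 at the origin, three more BSs, and the true MS position.
bs = np.array([[0.0, 0.0], [2000.0, 0.0], [0.0, 2000.0], [2000.0, 2000.0]])
ms = np.array([700.0, 1200.0])

def toa_ls(bs, ranges):
    """LS position estimate of (3.6) from measured ranges R_{k,T}."""
    a = bs[1:]                                # rows [x_k, y_k] for k >= 2
    rk2 = np.sum(bs[1:] ** 2, axis=1)         # R_k^2 = x_k^2 + y_k^2
    b = 0.5 * (rk2 - (ranges[1:] ** 2 - ranges[0] ** 2))
    theta, *_ = np.linalg.lstsq(a, b, rcond=None)
    return theta

true_ranges = np.linalg.norm(bs - ms, axis=1)
est_sync = toa_ls(bs, true_ranges)            # perfectly known t_o

clock_offset = 1e-6                           # 1 microsecond error in t_o
biased_ranges = true_ranges + C * clock_offset
est_offset = toa_ls(bs, biased_ranges)

# Range differences (3.7) with and without the clock offset.
tdoa_true = true_ranges[1:] - true_ranges[0]
tdoa_biased = biased_ranges[1:] - biased_ranges[0]

print("TOA estimate, synchronized :", est_sync)
print("TOA error with offset (m)  :", np.linalg.norm(est_offset - ms))
print("max change in range diffs  :", np.abs(tdoa_true - tdoa_biased).max())
```

In this sketch the offset biases every TOA range by $c \times 1\,\mu s = 300$ m and shifts the TOA-based estimate, while the range differences used by TDOA are unchanged up to rounding.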
  • 86. 73 Clearly the possible timing error on the MS clock $t_o$ is canceled out. This insensitivity to $t_o$ gives the TDOA method its advantage over TOA. By substituting $R_{k,T}^2 = (R_{1,T} + \Delta R_{k,1})^2$ in (3.2) and rearranging some terms, we can obtain the following LS expression for any number of base stations as
\[ R_{1,T}\, c + d = A\theta, \tag{3.8} \]
where
\[ c = \begin{bmatrix} -\Delta R_{2,1} \\ -\Delta R_{3,1} \\ -\Delta R_{4,1} \\ \vdots \end{bmatrix}; \quad d = \frac{1}{2} \begin{bmatrix} R_2^2 - \Delta R_{2,1}^2 \\ R_3^2 - \Delta R_{3,1}^2 \\ R_4^2 - \Delta R_{4,1}^2 \\ \vdots \end{bmatrix}. \]
Notice that $R_{1,T} = \sqrt{x_T^2 + y_T^2}$ is not known, and hence only an intermediate solution can be obtained from the above LS formulation
\[ \hat{\theta} = (A^T A)^{-1} A^T (R_{1,T}\, c + d). \tag{3.9} \]
Since
\[ \|\hat{\theta}\|^2 = R_{1,T}^2, \tag{3.10} \]
we can substitute (3.9) into (3.10) and solve for $R_{1,T}$ from the resulting quadratic equation. A final solution for $\hat{\theta}$ can be subsequently obtained by substituting the positive root of the quadratic equation into (3.9).
• Angle. The AOA (angle of arrival) can be obtained at a BS by using an antenna array. The direction of arrival of the MS signal can be calculated by measuring the phase difference between the antenna array elements or by measuring the power spectral density across the antenna array in what is known
  • 87. 74 as beamforming [64]. Intuitively, the MS location can be estimated by combining the AOA estimates from two BSs as shown in Figure 3.3.
[Figure 3.3: AOA data fusion with two BSs. The diagram shows BS1 at (0, 0), BS2 at (x2, y2) at distance R2 from BS1, and the MS at (xT, yT) at distances R1,T and R2,T from the two BSs, together with the angles used in (3.11).]
Compared to TOA/TDOA methods, the number of BSs needed for location is relatively smaller and there is no need for timing synchronization between the base stations and the MS clock. However, one disadvantage is that it requires an antenna array at the BS, which is not available in 2G systems. It is planned for 3G cellular systems such as UMTS and CDMA2000 [65, 66]. As indicated in Figure 3.3, we have
\[ \begin{bmatrix} x_T \\ y_T \end{bmatrix} = \begin{bmatrix} R_{1,T} \cos(\beta_1) \\ R_{1,T} \sin(\beta_1) \end{bmatrix}; \quad \begin{bmatrix} x_T \\ y_T \end{bmatrix} = \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} + \begin{bmatrix} R_{2,T} \cos(\beta_2) \\ R_{2,T} \sin(\beta_2) \end{bmatrix}, \tag{3.11} \]
where
\[ R_{2,T} = \sqrt{R_{1,T}^2 + R_2^2 - 2 R_{1,T} R_2 \cos(\alpha_1 - \beta_1)} = f(\alpha_1, \beta_1, R_{1,T}, R_2). \]
  • 88. 75 Since $\alpha_1$, $\beta_1$ and $R_2$ are known, we simply denote $R_{2,T}$ as a function of $R_{1,T}$ as $R_{2,T} = f_2(R_{1,T})$. If there are more than two BSs, an LS formulation can be obtained by collecting the relations in (3.11) into a single equation as
\[ b = A\theta, \tag{3.12} \]
where
\[ b = \begin{bmatrix} R_{1,T} \cos(\beta_1) \\ R_{1,T} \sin(\beta_1) \\ x_2 + f_2(R_{1,T}) \cos(\beta_2) \\ y_2 + f_2(R_{1,T}) \sin(\beta_2) \\ \vdots \end{bmatrix}; \quad A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \end{bmatrix}. \]
The LS solution for $\theta$ is then
\[ \hat{\theta} = (A^T A)^{-1} A^T b. \tag{3.13} \]
Since this intermediate solution involves the unknown $R_{1,T}$, we have to utilize the relation in (3.10) to get the positive root of the quadratic equation and then substitute $R_{1,T}$ back into (3.13) for a final solution of $\hat{\theta}$.
• Amplitude. Amplitude-based wireless location technology is mainly used in indoor environments where WLAN standards such as 802.11a and 802.11g have been widely adopted. WLAN connectivity has also become a standard feature of laptop computers and PDAs. As such, there is an increasing interest in utilizing these networks for location purposes to help provide good coverage for indoor scenarios. In the 802.11b and 802.11g MAC layer, the information about
  • 89. 76 the signal strength and the signal-to-noise ratio is provided. Hence, a software-level location technique could be developed for WLAN networks based on the amplitude of arrival of the MS signal at different access points [67, 68, 69]. Specifically, when an IEEE 802.11 network operates in the infrastructure mode, there are several APs (access points) and many end users within the network. RF-based systems that use the signal strength for location purposes can monitor the received signal strength from different APs and use the obtained statistics to build a conditional probability distribution network in order to estimate the location of the mobile client. These schemes usually work in two stages. The first stage is the offline training and data gathering phase, and the second stage is the location determination phase using the online signal strength measurements. In the training phase, signal strength measurements are used to build an a priori probability distribution of the received signal strength at the mobile user from all APs. Assume there are $N_a$ APs in the system and the radio map is created based on measurements from $N_u$ user locations. It is illustrated in Figure 3.4. The radio map model is described in [67]. Define $p(A_i \mid x_j, y_j)$ as the probability density function of the received signal strength from the $i$-th AP at the $j$-th measurement point $(x_j, y_j)$. After constructing a Bayesian network, the online determination phase uses maximum likelihood estimation to locate the mobile user. Thus assume that the mobile user measures the received signal strength from all APs, say $A_i$, $i = 1, 2, \ldots, N_a$. Then by Bayes' rule, the probability of
  • 90. 77 having the mobile user at location $(x_j, y_j)$ given the received signal strengths from all APs, $A = [A_1, \ldots, A_{N_a}]^T$, is given by
\[ p(x_j, y_j \mid A) = \frac{p(A \mid x_j, y_j)\, p(x_j, y_j)}{p(A)} = \frac{p(x_j, y_j) \prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)}{p(A)}, \tag{3.14} \]
where $\prod_{i=1}^{N_a} p(A_i \mid x_j, y_j)$, $1 \le j \le N_u$, is the approximation for the conditional probability density function of the received signal strength when the location of the mobile is given. Thus the location of the mobile user can be estimated as
\[ (\hat{x}_T, \hat{y}_T) = \arg\max_{x_j, y_j} p(x_j, y_j \mid A), \quad 1 \le j \le N_u. \tag{3.15} \]
[Figure 3.4: Magnitude-based data fusion in WLAN networks. The diagram shows access points AP1 to AP7 and ten measurement points, each labeled with its density $p(A_i \mid x_j, y_j)$, $j = 1, \ldots, 10$.]
We note that the location problem has been tackled by the LS approach as above. See also [53] for more details. However several problems exist. The first one is that
  • 91. 78 it is unclear what the physical meaning of these LS solutions is, because of the lack of statistical information on the measurements of the TOAs, TDOAs, AOAs and amplitudes, and because of the impact of transforming the nonlinear estimation for wireless location into quasi-linear estimation. This problem will be investigated in this thesis for location based on TDOAs and AOAs. The second one is the nuisance variable $R_{k,T}$, the distance from the $k$-th BS to the MS, which is unknown. Although a root-solving method can be used, it works only when the measurement data are noise-free; with noisy data, often no positive real root exists. We will convert it into a constrained LS problem and provide a solution algorithm in this thesis. The final problem is location using more than one type of measurement. Because of the timing difficulty and the lack of training, we will consider only measurements of TDOAs and AOAs for wireless location.
3.2 Least-square Location based on TDOA/AOA Estimates
3.2.1 Mathematical Preparations
The estimation problem, simply speaking, is to guess what you do not know based on what is given to you. In terms of its mathematical fundamentals, it is to estimate the unknown parameters from some observation data by using a criterion which leads to an optimal estimator. The observation data usually is a function of the unknown parameters, either a linear function or a nonlinear one. For simplicity, let's
  • 92. 79 begin with a generic linear model as follows:
\[ Z = H\theta + V. \tag{3.16} \]
In this model, $Z$, of size $N \times 1$, is called the measurement vector; $\theta$, of size $n \times 1$, is called the parameter vector; $H$, of size $N \times n$, is called the observation matrix; and $V$, of the same size as $Z$, is called the measurement noise vector. Because $V$ is random, $Z$ is random too. Both $H$ and $\theta$ can be either deterministic or random, depending on the specific application. Because of their simplicity, linear models are widely used in practice. Even in the case of nonlinear models, quasi-linear models that are close to the nonlinear models are often pursued, as in this thesis.
A question follows from the linear model above: "How can we have the best estimate of $\theta$ if we only know $Z$?" This can be viewed as having performed $N$ independent experiments in order to estimate $\theta$, which is composed of $n$ unknown elements $\{\theta_1, \theta_2, \ldots, \theta_n\}$, where $n < N$. Inevitably, the experiment data is corrupted by some noise, which is usually assumed to be additive Gaussian. To answer the question, there are generally three types of criteria for seeking the best estimate of $\theta$ in the field of statistical signal processing. They are weighted least-square estimation (WLSE), minimum mean square estimation (MMSE) and maximum-likelihood estimation (MLE).
1. WLSE: It is the simplest method with the oldest history. The best estimate $\hat{\theta}$ can be obtained by minimizing the cost function
\[ J[\hat{\theta}] = [Z - H\hat{\theta}]^T W [Z - H\hat{\theta}], \tag{3.17} \]
  • 93. 80 where $W = W^T > 0$ is the weighting matrix.

2. MMSE: The optimal estimate minimizes the error variance. Given the measurements $\{z(i)\}_{i=1}^{N}$, we shall determine an estimate of $\theta$

$$\hat{\theta} = f[z(1), z(2), \ldots, z(N)] \qquad (3.18)$$

such that the mean squared error

$$J[\hat{\theta}] = E\{[\theta - \hat{\theta}]^T [\theta - \hat{\theta}]\} \qquad (3.19)$$

is minimized.

3. MLE: It aims to maximize the likelihood function. Suppose that the measurement data $\{z(i)\}_{i=1}^{N}$ are jointly distributed with a density function $p(Z; \hat{\theta})$. The optimal estimate is given by

$$\hat{\theta}_{\rm opt} = \arg\max_{\hat{\theta}} \, p(Z; \hat{\theta}). \qquad (3.20)$$

It is usually a nonlinear estimator since the likelihood function $p(Z; \hat{\theta})$ is nonlinear with respect to $\theta$. Hence the computational load could be high.

Then, how do we know whether the result obtained from one particular method is good, or why it is better than that of other methods? To answer this question, we must make use of the fact that all estimators are transformations of random data, so the estimate itself is random and its properties must be studied from a statistical viewpoint. In this section, we introduce some fundamental
  • 94. 81 concepts such as unbiased estimator and efficient estimator, Cramer-Rao bound and Fisher information matrix [72].

Definition 3.1 (Unbiasedness [72]) Suppose that the parameter vector $\theta$ is deterministic. An estimator $\hat{\theta}$ is unbiased if $E\{\hat{\theta}\} = \theta$.

An unbiased estimate has a mean value equal to the true parameter vector, so that its errors average out over repeated independent experiments. However, unbiasedness by itself is not adequate. We must also study the dispersion about the mean, i.e., the variance of the estimator. Ideally, we would like our estimator to be unbiased and to have the smallest possible error variance.

Definition 3.2 (Efficiency [72]) An unbiased estimate $\hat{\theta}$ of the vector $\theta$ is said to be more efficient than any other unbiased estimator $\tilde{\theta}$ of $\theta$, if

$$E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} \le E\{[\theta - \tilde{\theta}][\theta - \tilde{\theta}]^T\}. \qquad (3.21)$$

A more efficient estimator has the smallest error covariance among all the unbiased estimators of $\theta$, "smallest" in the sense that $E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} - E\{[\theta - \tilde{\theta}][\theta - \tilde{\theta}]^T\}$ is negative semidefinite.

Normally it does not make much sense to compare each pair of unbiased estimators. Instead, a lower bound on the error covariance achievable by any unbiased estimate, called the Cramer-Rao bound (CRB), exists, and the efficiency of an unbiased estimator can be measured by how close its error covariance is to the CRB. The following theorem presents the CRB.
  • 95. 82 Theorem 3.1 (Cramer-Rao Bound [72]) Let $Z$ denote a set of $N$ observation data, i.e., $Z = [z(1), z(2), \ldots, z(N)]^T$, which is characterized by the probability density function $p(Z; \theta) = p(Z)$. If $\hat{\theta}$ is an unbiased estimate of the deterministic $\theta$, then the error covariance matrix, $E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\}$, is bounded from below by

$$E\{[\theta - \hat{\theta}][\theta - \hat{\theta}]^T\} \ge J^{-1}, \qquad (3.22)$$

where $J$ is the Fisher information matrix, defined by

$$J = E\left\{ \left[\frac{\partial}{\partial\theta} \ln p(Z)\right] \left[\frac{\partial}{\partial\theta} \ln p(Z)\right]^T \right\}, \qquad (3.23)$$

which can equivalently be expressed as

$$J = -E\left\{ \frac{\partial^2}{\partial\theta^2} \ln p(Z) \right\}. \qquad (3.24)$$

Note that, for the theorem to be applicable, the vector derivatives in (3.23) must exist and the norm of $\partial p(Z)/\partial\theta$ must be absolutely integrable.

Clearly, to compute the Cramer-Rao lower bound, we need to know the probability density function $p(Z)$. Often the exact form of $p(Z)$ is not available, in which case we cannot evaluate this bound. However, in the case of a normal distribution, i.e.,

$$p(Z; \theta) = \frac{1}{(2\pi)^{N/2} |C|^{1/2}} \, e^{-\frac{1}{2}[Z - \mu]^T C^{-1} [Z - \mu]}, \qquad (3.25)$$

where $\mu$ and $C$ are, respectively, the mean and the covariance matrix of $Z$, we can compute the Fisher information matrix, and hence the Cramer-Rao bound for Gaussian data, by the Slepian-Bangs formula [74]

$$[J]_{ij} = \frac{1}{2} {\rm tr}\left[ C^{-1} \frac{\partial C}{\partial\theta_i} C^{-1} \frac{\partial C}{\partial\theta_j} \right] + \left[\frac{\partial\mu}{\partial\theta_i}\right]^T C^{-1} \frac{\partial\mu}{\partial\theta_j}. \qquad (3.26)$$
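As a hedged illustration of (3.26), the sketch below evaluates the Fisher information matrix for the linear model (3.16) when the noise covariance is $C = \sigma^2 I$ and does not depend on $\theta$. In that case the trace term vanishes and $J = H^T H/\sigma^2$. The function names, the two-parameter straight-line example and the value of $\sigma^2$ are assumptions made for illustration only; they do not come from the dissertation.

```python
def fisher_information_linear(H, sigma2):
    """FIM of Z = H*theta + V with V ~ N(0, sigma2*I), from the mean term of (3.26).

    H is a list of N rows with two entries each (n = 2 parameters).
    The covariance does not depend on theta, so the trace term is zero.
    """
    J = [[0.0, 0.0], [0.0, 0.0]]
    for row in H:
        for i in range(2):
            for j in range(2):
                J[i][j] += row[i] * row[j] / sigma2
    return J


def invert_2x2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0.0:
        raise ValueError("singular matrix: the parameters are not identifiable")
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


# Hypothetical example: straight-line fit z(i) = theta1 + theta2 * t_i + noise.
H = [[1.0, float(t)] for t in range(4)]   # t = 0, 1, 2, 3
J = fisher_information_linear(H, sigma2=4.0)
crb = invert_2x2(J)                       # lower bound on the error covariance
```

The diagonal of `crb` bounds the variances of any unbiased estimates of the intercept and the slope; for this linear Gaussian model the ordinary LS estimator attains the bound.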
  • 96. 83 Because of the central limit theorem, the Gaussian distribution holds approximately in applications such as location estimation.

3.2.2 Location based on TDOA

In this section, we investigate location estimation algorithms based on the measurements of TDOA and AOA. For simplicity, we assume that the mobile users travel at a low speed and can be treated approximately as stationary targets. Hence we do not consider the estimation of the velocity of mobile users. We use all the available measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$ (TDOA data) and $\{\beta_k\}_{k=1}^{N_b}$ (AOA data), where $N_b$ is the total number of base stations, to determine the location of the mobile user or the target, i.e., $(x_T, y_T)$. We consider only two-dimensional localization, which is adequate if the terrain elevation is known a priori or can be neglected compared to the heights of the antenna towers.

We start with stationary target estimation based on the measurements of TDOA. As defined in Section 3.1.3,

$$R_{k,T} = \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2},$$

$$\Delta t_{k,1} = (R_{k,T} - R_{1,T})/c = \left( \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} - \sqrt{x_T^2 + y_T^2} \right)/c. \qquad (3.27)$$

Besides the measurements $\{\Delta t_{k,1}\}_{k=2}^{N_b}$, the locations of all the base stations $\{(x_k, y_k)\}_{k=1}^{N_b}$ are also assumed to be known. Clearly $\Delta t_{k,1}$ is a nonlinear function of the unknown $(x_T, y_T)$, i.e., $\Delta t_{k,1}(x_T, y_T)$. Here, for brevity of notation, $(x_T, y_T)$ is omitted in TDOAs unless it is needed for clarification.

For all the TDOA measurements $\{\Delta\hat{t}_{2,1}, \Delta\hat{t}_{3,1}, \ldots, \Delta\hat{t}_{N_b,1}\}$, it is unavoidable that
  • 97. 84 there are measurement noises embedded within the data. Therefore, the measurement data are described by

$$\Delta\hat{t}_{k,1} = \Delta t_{k,1} + \delta t_k, \qquad (3.28)$$

where $\{\delta t_k\}_{k=2}^{N_b}$ are assumed to be i.i.d. (independent and identically distributed) Gaussian random variables with zero mean and variance $\sigma_t^2$. This is an important but fair assumption, given that all the BSs are well synchronized and a large deviation from the mean is much less likely to occur. Since $\delta t_k$ is a Gaussian random variable, so is $\Delta\hat{t}_{k,1}$. Based on the above assumption, we can define the $(N_b - 1) \times 1$ multivariate Gaussian random vector $\Delta\hat{t}$ and the associated mean $m_{\Delta\hat{t}}$ and covariance matrix $M_{\Delta\hat{t}}$ respectively as

$$\Delta\hat{t} = \begin{bmatrix} \Delta\hat{t}_{2,1} \\ \vdots \\ \Delta\hat{t}_{N_b,1} \end{bmatrix}; \qquad m_{\Delta\hat{t}} = \begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix}; \qquad M_{\Delta\hat{t}} = \sigma_t^2 I_{(N_b - 1)}. \qquad (3.29)$$

As shown in [71], the joint PDF for $\Delta\hat{t}$ is given by

$$p(\Delta\hat{t}) = \frac{1}{(\sqrt{2\pi})^{N_b - 1} \sqrt{\det M_{\Delta\hat{t}}}} \exp\left[ -\frac{1}{2} (\Delta\hat{t} - m_{\Delta\hat{t}})^T M_{\Delta\hat{t}}^{-1} (\Delta\hat{t} - m_{\Delta\hat{t}}) \right]$$

$$= \frac{1}{(\sqrt{2\pi})^{N_b - 1} \sigma_t^{N_b - 1}} \exp\left[ -\sum_{k=2}^{N_b} \frac{(\Delta\hat{t}_{k,1} - \Delta t_{k,1}(x_T, y_T))^2}{2\sigma_t^2} \right]. \qquad (3.30)$$

This joint Gaussian PDF completely describes the statistical characteristics of the measurement data and depends on the two unknowns $x_T$ and $y_T$. With a fixed data set of measurements, we look for the pair $(x_T, y_T)$ under which the data set is the most likely to occur. In light of estimation theory, the maximum-likelihood (ML) method can be used to estimate the target location $(x_T, y_T)$. Before providing the ML estimator, as shown in Theorem 3.1, we would like to
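  • 98. 85 compute the Fisher information matrix and the Cramer-Rao bound so that we know how accurate the estimation can be. The Cramer-Rao bound is a benchmark for evaluating different types of unbiased estimators. Let $P$ and $J_{\rm FIM}$ denote the estimation error covariance matrix and the Fisher information matrix. It holds for any type of unbiased estimator [72] that

$$P \ge J_{\rm FIM}^{-1}. \qquad (3.31)$$

According to the Slepian-Bangs formula, the Fisher information matrix based on (3.30) can be calculated by

$$J_{\rm tdoa} = \left[ \frac{1}{2} {\rm tr}\left\{ M_{\Delta\hat{t}}^{-1} \frac{\partial M_{\Delta\hat{t}}}{\partial\chi_i} M_{\Delta\hat{t}}^{-1} \frac{\partial M_{\Delta\hat{t}}}{\partial\chi_j} \right\} + \left( \frac{\partial m_{\Delta\hat{t}}}{\partial\chi_i} \right)^T M_{\Delta\hat{t}}^{-1} \left( \frac{\partial m_{\Delta\hat{t}}}{\partial\chi_j} \right) \right]_{i,j=1,1}^{2,2}, \qquad (3.32)$$

where $\chi_1 = x_T$ and $\chi_2 = y_T$. Since $M_{\Delta\hat{t}} = \sigma_t^2 I_{N_b - 1}$ depends only on $\sigma_t^2$, the first term in (3.32) is zero. By direct calculations, we have

$$\frac{\partial m_{\Delta\hat{t}}}{\partial\chi_1} = \frac{1}{c} \begin{bmatrix} \frac{x_T - x_2}{R_{2,T}} - \frac{x_T}{R_{1,T}} \\ \frac{x_T - x_3}{R_{3,T}} - \frac{x_T}{R_{1,T}} \\ \vdots \\ \frac{x_T - x_{N_b}}{R_{N_b,T}} - \frac{x_T}{R_{1,T}} \end{bmatrix}; \qquad \frac{\partial m_{\Delta\hat{t}}}{\partial\chi_2} = \frac{1}{c} \begin{bmatrix} \frac{y_T - y_2}{R_{2,T}} - \frac{y_T}{R_{1,T}} \\ \frac{y_T - y_3}{R_{3,T}} - \frac{y_T}{R_{1,T}} \\ \vdots \\ \frac{y_T - y_{N_b}}{R_{N_b,T}} - \frac{y_T}{R_{1,T}} \end{bmatrix}; \qquad M_{\Delta\hat{t}}^{-1} = \frac{1}{\sigma_t^2} I_{N_b - 1}. \qquad (3.33)$$

Then it is easy to obtain $J_{\rm tdoa}$ as follows by substituting (3.33) into (3.32):

$$J_{\rm tdoa} = \left[ \left( \frac{\partial m_{\Delta\hat{t}}}{\partial\chi_i} \right)^T M_{\Delta\hat{t}}^{-1} \left( \frac{\partial m_{\Delta\hat{t}}}{\partial\chi_j} \right) \right]_{i,j=1,1}^{2,2}$$

$$= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \begin{bmatrix} \frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}} \\ \frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}} \end{bmatrix} \begin{bmatrix} \frac{x_T - x_k}{R_{k,T}} - \frac{x_T}{R_{1,T}}, & \frac{y_T - y_k}{R_{k,T}} - \frac{y_T}{R_{1,T}} \end{bmatrix}$$

$$= \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \begin{bmatrix} \cos(\beta_k) - \cos(\beta_1) \\ \sin(\beta_k) - \sin(\beta_1) \end{bmatrix} \begin{bmatrix} \cos(\beta_k) - \cos(\beta_1), & \sin(\beta_k) - \sin(\beta_1) \end{bmatrix}, \qquad (3.34)$$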
  • 99. 86 where $\{\beta_k\}_{k=1}^{N_b}$ are shown in Figure 3.3 with $\tan(\beta_k) = (y_T - y_k)/(x_T - x_k)$. The inverse of the Fisher information matrix $J_{\rm tdoa}$ is a lower bound on the estimation error covariance of all unbiased estimators.

In terms of the large-sample property, the ML estimate approaches the Cramer-Rao bound asymptotically, i.e., with an infinite number of data measurements. From (3.30), the ML location estimator seeks the $(x_T, y_T)$ that minimizes the negative log-likelihood function, which up to constants has the form

$$L_{\Delta t}(x_T, y_T) = \sum_{k=2}^{N_b} \left[ c\Delta\hat{t}_{k,1} - \sqrt{(x_T - x_k)^2 + (y_T - y_k)^2} + \sqrt{x_T^2 + y_T^2} \right]^2. \qquad (3.35)$$

This is obtained by using the fact that $e^{-x}$ is a monotonically decreasing function and that scaling by the constant $c^2\sigma_t^2$ does not change the minimizer. There are two unknowns in $L_{\Delta t}(x_T, y_T)$, namely $x_T$ and $y_T$. Differentiating $L_{\Delta t}(x_T, y_T)$ with respect to each and equating the resulting partial derivatives to zero gives the following necessary condition for the optimal solution $(x_T^*, y_T^*)$:

$$\sum_{k=2}^{N_b} \begin{bmatrix} \frac{x_k - x_T^*}{R_{k,T}} + \frac{x_T^*}{R_{1,T}} \\ \frac{y_k - y_T^*}{R_{k,T}} + \frac{y_T^*}{R_{1,T}} \end{bmatrix} \left( c\Delta\hat{t}_{k,1} - R_{k,T} + R_{1,T} \right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \qquad (3.36)$$

The ML estimator is well studied and widely used in practice, especially in applications that require high estimation accuracy and can afford the computational complexity on commercially available hardware and software. It has a variety of statistical properties that are preferred in applications and that hold asymptotically, for a large number of measurements:
      • It is unbiased: the expectation of the estimate is equal to the real value;
      • It is the most efficient: it achieves the minimum error variance;
  • 100. 87 • It is consistent: it converges to the real value in probability.

Hence it is plausible to apply ML to our estimation problem for the highest possible localization accuracy. However, solving for the optimal solution $(x_T^*, y_T^*)$ from (3.35) and (3.36) is not easy and involves nonlinear procedures such as Newton-type algorithms, which are not discussed in this dissertation. The likelihood function can be maximized by hand only for some PDFs, and even commercial software is not guaranteed to reach the ML solution because of the possible existence of local minima. In this thesis we take a quasi-linear approach as in [54] to convert the nonlinear optimization problem into a linear one that leads to an LS-type problem, in order to simplify the solution algorithm. Alternatively, the LS-type solution can serve as an initial candidate in Newton-type iterative algorithms to speed up convergence to the true ML solution $(x_T^*, y_T^*)$. To bypass the difficulty and complexity of the original ML estimator, we notice that the second equation in (3.27) leads to

$$(x_T - x_k)^2 + (y_T - y_k)^2 = \left( \sqrt{x_T^2 + y_T^2} + c\Delta t_{k,1} \right)^2. \qquad (3.37)$$

By expanding and rearranging the terms, with $R_k^2 = x_k^2 + y_k^2$, the above can be written as

$$\frac{1}{c^2} R_k^2 = \frac{2}{c^2} \begin{bmatrix} x_k & y_k \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \Delta t_{k,1}^2 + \frac{2}{c} R_{1,T} \Delta t_{k,1}. \qquad (3.38)$$

Packing all the equations in (3.38) for $k = 2, 3, \ldots, N_b$ yields

$$\frac{1}{c^2} \begin{bmatrix} R_2^2 \\ \vdots \\ R_{N_b}^2 \end{bmatrix} = \frac{2}{c^2} \begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \begin{bmatrix} \Delta t_{2,1}^2 \\ \vdots \\ \Delta t_{N_b,1}^2 \end{bmatrix} + \frac{2}{c} \begin{bmatrix} \Delta t_{2,1} \\ \vdots \\ \Delta t_{N_b,1} \end{bmatrix} R_{1,T}. \qquad (3.39)$$
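The rearrangement from (3.37) to (3.38) can be checked numerically on noise-free data. The sketch below is only an illustration: the base-station layout, the target position and the helper name `tdoa_linear_sides` are hypothetical, and BS 1 is assumed to sit at the origin, as implied by the term $\sqrt{x_T^2 + y_T^2}$ in (3.27).

```python
import math


def tdoa_linear_sides(bs, target, c):
    """Return (lhs, rhs) of (3.38) for k = 2..Nb using exact (noise-free) TDOAs.

    bs[0] is BS 1 and is assumed to be located at the origin.
    """
    xT, yT = target
    R1T = math.hypot(xT, yT)
    sides = []
    for xk, yk in bs[1:]:
        RkT = math.hypot(xT - xk, yT - yk)
        dt = (RkT - R1T) / c                          # exact TDOA from (3.27)
        lhs = (xk ** 2 + yk ** 2) / c ** 2            # R_k^2 / c^2
        rhs = (2.0 / c ** 2) * (xk * xT + yk * yT) + dt ** 2 + (2.0 / c) * R1T * dt
        sides.append((lhs, rhs))
    return sides


# Hypothetical geometry in metres; c is the speed of light in m/s.
bs = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
sides = tdoa_linear_sides(bs, target=(300.0, 400.0), c=299792458.0)
```

Each pair in `sides` agrees to floating-point precision, which confirms that (3.38) is an exact identity before measurement noise is introduced.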
  • 101. 88 If we have perfect TDOA information, the target $(x_T, y_T)$ can be located with any 2 out of the $N_b - 1$ sets of data, since the problem is over-determined. To estimate the target location $(x_T, y_T)$ in (3.39), however, we have to replace the perfect time differences $\Delta t_{k,1}$ with the available TDOA measurements $\Delta\hat{t}_{k,1}$. Since $\Delta\hat{t}_{k,1} = \Delta t_{k,1} + \delta t_k$, this introduces a noise vector as follows:

$$\begin{bmatrix} \eta_2 \\ \vdots \\ \eta_{N_b} \end{bmatrix} = -\frac{2}{c} \begin{bmatrix} \delta t_2 \\ \vdots \\ \delta t_{N_b} \end{bmatrix} R_{1,T} - 2 \begin{bmatrix} \Delta\hat{t}_{2,1}\,\delta t_2 \\ \vdots \\ \Delta\hat{t}_{N_b,1}\,\delta t_{N_b} \end{bmatrix} + \begin{bmatrix} \delta t_2^2 \\ \vdots \\ \delta t_{N_b}^2 \end{bmatrix}. \qquad (3.40)$$

Each element of the noise vector is composed of terms linear in the TDOA measurement noise $\delta t_k$ and the corresponding squared term. Taking expectations on both sides of (3.40), with the recorded $\Delta\hat{t}_{k,1}$ treated as given, we find that each element of the noise vector has mean $\sigma_t^2$. In an effort to obtain an LS-type formulation, we define

$$a_k = R_k^2/c^2 - \Delta\hat{t}_{k,1}^2 - \sigma_t^2, \qquad b_k = 2\Delta\hat{t}_{k,1}/c. \qquad (3.41)$$

We can regard $\{a_k\}$ and $\{b_k\}$ as pseudo-measurements that lead to a constrained linear model:

$$\begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix} - \begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix} R_{1,T} = \frac{2}{c^2} \begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix}, \qquad (3.42)$$

where the constraint is $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. It is worth noting that the composite noises $\{\eta_k\}_{k=2}^{N_b}$ are not Gaussian random variables, i.e., not normally distributed. But if $\{\eta_k\}_{k=2}^{N_b}$ were Gaussian, then the ML algorithm for location estimation would be equivalent to a weighted LS problem involving a constraint. As stated in Corollary 11-1 of [72],
  • 102. 89 ML, LS and BLUE (Best Linear Unbiased Estimator) algorithms are all equivalent for a generic linear model with an additive Gaussian noise term. By defining

$$a = \begin{bmatrix} a_2 \\ \vdots \\ a_{N_b} \end{bmatrix}; \qquad b = \begin{bmatrix} b_2 \\ \vdots \\ b_{N_b} \end{bmatrix}; \qquad H_1 = \frac{2}{c^2} \begin{bmatrix} x_2 & y_2 \\ \vdots & \vdots \\ x_{N_b} & y_{N_b} \end{bmatrix}; \qquad \eta_1 = \begin{bmatrix} \eta_2 - \sigma_t^2 \\ \vdots \\ \eta_{N_b} - \sigma_t^2 \end{bmatrix}, \qquad (3.43)$$

we can rewrite (3.42) in a more compact quasi-linear form with $\theta = [x_T, y_T]^T$:

$$a - bR_{1,T} = H_1\theta + \eta_1. \qquad (3.44)$$

The above expression is very similar to the generic linear model of the standard LS algorithm, except that the pseudo-measurement vector $a - bR_{1,T}$ involves the unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$. Fortunately, this relation is an extra condition that helps to solve for $R_{1,T}$. $H_1$ is deterministic, and $\eta_1$ is a non-Gaussian vector whose elements all have zero mean. Let $W_1(R_{1,T})$ be a diagonal matrix with elements $E\{|\eta_k - \sigma_t^2|^2\}$. Set

$$J_1 = \frac{1}{2} \left[ a - bR_{1,T} - H_1\theta \right]^T W_1^{-1}(R_{1,T}) \left[ a - bR_{1,T} - H_1\theta \right] \qquad (3.45)$$

as the objective function to be minimized. It is well known that the minimizer is the ML solution provided that the noise vector is Gaussian with $W_1(R_{1,T})$ as the covariance matrix. The weighted LS solution is easily obtained as

$$\hat{\theta} = \left( H_1^T W_1^{-1}(R_{1,T}) H_1 \right)^{-1} H_1^T W_1^{-1}(R_{1,T}) \left[ a - bR_{1,T} \right] = \Phi_1(R_{1,T}) \left[ a - bR_{1,T} \right], \qquad (3.46)$$

where $\Phi_1(R_{1,T}) = \left( H_1^T W_1^{-1}(R_{1,T}) H_1 \right)^{-1} H_1^T W_1^{-1}(R_{1,T})$. Here $\hat{\theta}$ is an intermediate solution since $R_{1,T}$ is unknown. Taking the squared norm of both sides yields

$$R_{1,T}^2 = \left\| \Phi_1(R_{1,T}) \left[ a - bR_{1,T} \right] \right\|^2. \qquad (3.47)$$
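The steps (3.41) to (3.47) can be sketched in a few lines under one simplifying assumption that is not made in the text: the weighting matrix is replaced by the identity, $W_1 = I$. Then $\Phi_1$ no longer depends on $R_{1,T}$ and (3.47) reduces to a quadratic equation in $R_{1,T}$. The function name, the geometry and the default `sigma_t2=0.0` are illustrative choices, and BS 1 is again assumed to be at the origin.

```python
import math


def solve_tdoa_constrained_ls(bs, dt_hat, c, sigma_t2=0.0):
    """Constrained LS location from TDOAs with identity weighting (W1 = I assumed).

    bs     : [(x1, y1), ..., (xNb, yNb)], BS 1 at the origin
    dt_hat : measured TDOAs [dt_21, ..., dt_Nb1] in seconds
    Returns (xT, yT, R1T) for the positive real root with the smallest residual,
    or None when no such root exists.
    """
    # Pseudo-measurements (3.41) and observation matrix H1 of (3.43).
    a = [(xk * xk + yk * yk) / c ** 2 - dt * dt - sigma_t2
         for (xk, yk), dt in zip(bs[1:], dt_hat)]
    b = [2.0 * dt / c for dt in dt_hat]
    H = [(2.0 * xk / c ** 2, 2.0 * yk / c ** 2) for xk, yk in bs[1:]]

    # Phi = (H^T H)^-1 H^T, applied to a vector v.
    s11 = sum(h[0] * h[0] for h in H)
    s12 = sum(h[0] * h[1] for h in H)
    s22 = sum(h[1] * h[1] for h in H)
    det = s11 * s22 - s12 * s12

    def phi_times(v):
        g0 = sum(h[0] * vi for h, vi in zip(H, v))
        g1 = sum(h[1] * vi for h, vi in zip(H, v))
        return ((s22 * g0 - s12 * g1) / det, (s11 * g1 - s12 * g0) / det)

    p, q = phi_times(a), phi_times(b)       # theta = p - q * R, from (3.46)

    # Constraint (3.47): R^2 = |p - q R|^2  ->  A R^2 + B R + C = 0.
    A = q[0] ** 2 + q[1] ** 2 - 1.0
    B = -2.0 * (p[0] * q[0] + p[1] * q[1])
    C = p[0] ** 2 + p[1] ** 2
    if abs(A) < 1e-12:
        roots = [-C / B] if B != 0.0 else []
    else:
        disc = B * B - 4.0 * A * C
        if disc < 0.0:
            return None
        roots = [(-B + math.sqrt(disc)) / (2.0 * A),
                 (-B - math.sqrt(disc)) / (2.0 * A)]

    best = None
    for R in roots:
        if R <= 0.0:
            continue
        x, y = p[0] - q[0] * R, p[1] - q[1] * R
        # Squared residual of (3.44), i.e. 2*J1 with W1 = I.
        cost = sum((ak - bk * R - h[0] * x - h[1] * y) ** 2
                   for ak, bk, h in zip(a, b, H))
        if best is None or cost < best[0]:
            best = (cost, x, y, R)
    return None if best is None else best[1:]


# Hypothetical noise-free check: the estimate should reproduce the target.
c = 299792458.0
bs = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
target = (300.0, 400.0)
R1 = math.hypot(*target)
dt = [(math.hypot(target[0] - xk, target[1] - yk) - R1) / c for xk, yk in bs[1:]]
est = solve_tdoa_constrained_ls(bs, dt, c)
```

With the $R_{1,T}$-dependent weighting of the text, the same structure applies, but the root of (3.47) would have to be found iteratively, for instance by re-evaluating $W_1$ at the current root.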
  • 103. 90 Among the real and positive roots of this nonlinear equation, the one yielding the smallest $J_1$ is the optimal solution to the constrained LS problem. We note that the ML estimation problem has been converted into an LS-type estimation problem by replacing the perfect TDOA information with measurement data, and that the equivalence between the LS-type solution and the ML estimator can be established under the assumption that the composite noise vector is Gaussian. If the noise vector in (3.40) is not exactly Gaussian, the constrained LS solution is not the ML solution either. It may seem that we have overemphasized the simplicity of the LS-type algorithm and sacrificed estimation accuracy. However, the LS-type solution is not far from the true ML solution under some mild conditions, as shown below.

Let $X$ be a Gaussian random variable with zero mean and variance $\sigma^2$. Then the high-order moments of $X$ are given by [73]

$$E\{X^{2n}\} = 1 \times 3 \times 5 \times \cdots \times (2n - 1)\,\sigma^{2n}; \qquad E\{X^{2n-1}\} = 0,$$

where $n > 0$ is an integer. Let $Y = \alpha X + (X^2 - \sigma^2)$. Then $E\{Y\} = 0$ and

$$\sigma_Y^2 = E\{Y^2\} = \alpha^2\sigma^2 - \sigma^4 + E\{X^4\} = \alpha^2\sigma^2 + 2\sigma^4 = \sigma^2(\alpha^2 + 2\sigma^2). \qquad (3.48)$$

Gaussian random variables (GRVs) admit the nice property that the sum of any two jointly Gaussian random variables is still a GRV [73]. A nonlinear function of a GRV, however, is in general not Gaussian, so we cannot conclude that $Y$ is a GRV since it includes the $X^2$ term. We are interested in the conditions under which $Y$ is close to a GRV. By noting that $Y =$
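  • 104. 91 $(X + \alpha/2)^2 - (\sigma^2 + \alpha^2/4)$, we have

$$X = -\alpha/2 \pm \sqrt{Y + (\sigma^2 + \alpha^2/4)}, \qquad Y \ge -(\sigma^2 + \alpha^2/4). \qquad (3.49)$$

Since $Y$ is a function of the GRV $X$, its PDF is given by

$$p_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \left\{ \frac{e^{-\frac{1}{2\sigma^2}\left(\frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)}\right)^2}}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}} + \frac{e^{-\frac{1}{2\sigma^2}\left(\frac{\alpha}{2} + \sqrt{y + (\sigma^2 + \alpha^2/4)}\right)^2}}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}} \right\}, \qquad y \ge -(\sigma^2 + \alpha^2/4). \qquad (3.50)$$

From the normalization property of a PDF, there holds $\int_{-(\sigma^2 + \alpha^2/4)}^{\infty} p_Y(y)\,dy = 1$. Interestingly, the integral of the first term in $p_Y(y)$ is

$$I_Y = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2 + \alpha^2/4)}^{\infty} \frac{e^{-\frac{1}{2\sigma^2}\left(\frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)}\right)^2}}{2\sqrt{y + (\sigma^2 + \alpha^2/4)}}\, dy$$

$$= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{-(\sigma^2 + \alpha^2/4)}^{\infty} e^{-\frac{1}{2\sigma^2}\left(\frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)}\right)^2} d\left[ \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right]$$

$$= \frac{-1}{\sqrt{2\pi\sigma^2}} \int_{\alpha/2}^{-\infty} e^{-\frac{z^2}{2\sigma^2}}\, dz \qquad \left( \text{let } z = \frac{\alpha}{2} - \sqrt{y + (\sigma^2 + \alpha^2/4)} \right)$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\alpha/(2\sigma)}^{\infty} e^{-\frac{\tilde{z}^2}{2}}\, d\tilde{z} \qquad \left( \text{let } \tilde{z} = -\frac{z}{\sigma} \right)$$

$$= 1 - Q\left( \frac{\alpha}{2\sigma} \right),$$

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ is the Gaussian tail function. Hence, if $\alpha/\sigma$ is sufficiently large, then $I_Y \approx 1$ and thus $p_Y(y)$ is dominated by the first term. Intuitively, the second term $(X^2 - \sigma^2)$ in $Y$ fades out when $\alpha/\sigma$ is sufficiently large, since its mean is zero and its variance $E\{(X^2 - \sigma^2)^2\} = 2\sigma^4$ is small compared with $\alpha^2\sigma^2$. It is also easy to see that $\sigma_Y^2$ is dominated by $\alpha^2\sigma^2$ under the same assumption. Therefore, the random variable $Y = \alpha X + (X^2 - \sigma^2)$ behaves like a normally distributed one, provided that $\alpha/\sigma$ is sufficiently large. Translating this result to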
  • 105. 92 the random variables as in (3.40) with $\delta\tilde{t}_k = -\delta t_k$ leads to

$$Y_k = \alpha_k\,\delta\tilde{t}_k + (\delta\tilde{t}_k^2 - \sigma_t^2), \qquad \alpha_k = 2(R_{1,T} + c\Delta\hat{t}_{k,1})/c. \qquad (3.51)$$

Then $\eta_1 = [Y_2, Y_3, \ldots, Y_{N_b}]^T$, where $\delta t_k$ is Gaussian with mean zero and variance $\sigma_t^2$. Thus $Y_k$ is close to Gaussian provided that $\alpha_k/\sigma_t = 2(R_{1,T} + c\Delta\hat{t}_{k,1})/(c\sigma_t)$ is sufficiently large for all $k \ge 2$. It is worth noting that

$$\left( \frac{\alpha_k}{2\sigma_t} \right)^2 = (R_{1,T}/c + \Delta\hat{t}_{k,1})^2/\sigma_t^2. \qquad (3.52)$$

The right-hand side of the above equation is an approximation to the SNR, since its numerator represents the recorded travel time of the signal from the target to the $k$-th BS and its denominator, $\sigma_t^2$, is the noise variance. By (3.48), the variance of $Y_k$ is

$$\sigma_{Y_k}^2 = E\{Y_k^2\} = \alpha_k^2\sigma_t^2 + 2\sigma_t^4 = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2), \qquad (3.53)$$

which is dominated by $\alpha_k^2\sigma_t^2$ if $\alpha_k/\sigma_t$ is sufficiently large. It is emphasized that $\alpha_k = \alpha_k(R_{1,T})$ is a function of $R_{1,T}$.

Recall the question raised earlier: how far is the LS-type solution obtained in (3.46) and (3.47) from the true ML solution? A clear answer is that the LS-type algorithm approximates the ML solution well as long as $\alpha_k/\sigma_t$ is very large for $2 \le k \le N_b$. Therefore the properties of the ML algorithm hold approximately.

Before ending this subsection, we would like to compute the Cramer-Rao bound associated with the weighted LS solution by assuming that $\{Y_k\}$ are normally distributed,
  • 106. 93 which holds approximately under the condition discussed earlier. Recall that $W_1(R_{1,T})$ in the weighted LS problem is the associated covariance matrix. Thus $E\{Y_k^2\}$ is its $k$-th diagonal element, and the joint probability density function (PDF) for the pseudo-measurement data $\{a_k\}$ and $\{b_k\}$ in (3.42) is

$$\text{PDF} = \frac{1}{\sqrt{(2\pi)^{n-1} \det[W_1(R_{1,T})]}} \exp\left\{ -\frac{1}{2} \left[ a - bR_{1,T} - H_1\theta \right]^T W_1^{-1}(R_{1,T}) \left[ a - bR_{1,T} - H_1\theta \right] \right\}. \qquad (3.54)$$

Note that the exponent is exactly $-J_1$. The Fisher information matrix for the PDF in (3.54) can be computed by using the Slepian-Bangs formula as in (3.32). Here we take the pseudo-measurement vector $a$ as the data vector, whose mean vector and covariance matrix are $m_a = bR_{1,T} + H_1\theta$ and $M_a = E\{\eta_1\eta_1^T\}$, respectively. Hence both the mean and the covariance are functions of $(x_T, y_T)$. By some direct calculations, we have

$$\frac{\partial m_a}{\partial x_T} = \begin{bmatrix} \frac{\partial}{\partial x_T}\left( b_2\sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\ \vdots \\ \frac{\partial}{\partial x_T}\left( b_{N_b}\sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right) \end{bmatrix} = \begin{bmatrix} \frac{2}{c^2}x_2 + b_2\cos(\beta_1) \\ \vdots \\ \frac{2}{c^2}x_{N_b} + b_{N_b}\cos(\beta_1) \end{bmatrix};$$

$$\frac{\partial m_a}{\partial y_T} = \begin{bmatrix} \frac{\partial}{\partial y_T}\left( b_2\sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_2 x_T + y_2 y_T) \right) \\ \vdots \\ \frac{\partial}{\partial y_T}\left( b_{N_b}\sqrt{x_T^2 + y_T^2} + \frac{2}{c^2}(x_{N_b} x_T + y_{N_b} y_T) \right) \end{bmatrix} = \begin{bmatrix} \frac{2}{c^2}y_2 + b_2\sin(\beta_1) \\ \vdots \\ \frac{2}{c^2}y_{N_b} + b_{N_b}\sin(\beta_1) \end{bmatrix}. \qquad (3.55)$$

Since $M_a$ is a diagonal matrix whose $k$-th diagonal element is $E\{Y_k^2\} = \sigma_t^2(\alpha_k^2 + 2\sigma_t^2)$ with $\alpha_k = \frac{2}{c}\sqrt{x_T^2 + y_T^2} + 2\Delta\hat{t}_{k,1}$, taking the partial derivatives of $E\{Y_k^2\}$ with respect to $x_T$ and $y_T$ gives

$$\frac{\partial}{\partial x_T} E\{Y_k^2\} = \frac{4\sigma_t^2\alpha_k}{c}\cos(\beta_1); \qquad \frac{\partial}{\partial y_T} E\{Y_k^2\} = \frac{4\sigma_t^2\alpha_k}{c}\sin(\beta_1).$$
  • 107. 94 It is then straightforward to show that

$$\frac{\partial M_a}{\partial x_T} = {\rm diag}\left\{ \frac{4\sigma_t^2\cos(\beta_1)}{c} [\alpha_2, \alpha_3, \ldots, \alpha_{N_b}] \right\},$$

$$\frac{\partial M_a}{\partial y_T} = {\rm diag}\left\{ \frac{4\sigma_t^2\sin(\beta_1)}{c} [\alpha_2, \alpha_3, \ldots, \alpha_{N_b}] \right\},$$

$$M_a^{-1} = {\rm diag}\left\{ \frac{1}{\sigma_t^2} \left[ \frac{1}{\alpha_2^2 + 2\sigma_t^2}, \frac{1}{\alpha_3^2 + 2\sigma_t^2}, \ldots, \frac{1}{\alpha_{N_b}^2 + 2\sigma_t^2} \right] \right\}. \qquad (3.56)$$

Now we can calculate the Fisher information matrix via the Slepian-Bangs method in (3.32) as

$$J_{\rm tdoa,LS} = \left[ \frac{1}{2} {\rm tr}\left\{ M_a^{-1} \frac{\partial M_a}{\partial\chi_i} M_a^{-1} \frac{\partial M_a}{\partial\chi_j} \right\} + \left( \frac{\partial m_a}{\partial\chi_i} \right)^T M_a^{-1} \left( \frac{\partial m_a}{\partial\chi_j} \right) \right]_{i,j=1,1}^{2,2}, \qquad (3.57)$$

where $\chi_1 = x_T$ and $\chi_2 = y_T$. By substituting (3.55) and (3.56) into (3.57), the Fisher information matrix is given by

$$J_{\rm tdoa,LS} = \sum_{k=2}^{n} \frac{1}{\sigma_t^2(\alpha_k^2 + 2\sigma_t^2)} \begin{bmatrix} 2x_k/c^2 + b_k\cos(\beta_1) \\ 2y_k/c^2 + b_k\sin(\beta_1) \end{bmatrix} \begin{bmatrix} 2x_k/c^2 + b_k\cos(\beta_1) \\ 2y_k/c^2 + b_k\sin(\beta_1) \end{bmatrix}^T + \sum_{k=2}^{n} \frac{8\alpha_k^2}{c^2(\alpha_k^2 + 2\sigma_t^2)^2} \begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix} \begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix}. \qquad (3.58)$$

The above expression is different from $J_{\rm tdoa}$ in (3.34) no matter how large $\alpha_k/\sigma_t$ is and how small $\sigma_t$ is. Such a discrepancy is caused by the omission of the second term in $p_Y(y)$ in computing the Fisher information matrix. The omitted term in $p_Y(y)$ may contribute negligibly to the probability, but its derivative can be significant. Moreover, no matter how small $\sigma_t$ is, it cannot be zero, which also contributes to this discrepancy.

3.2.3 Location based on AOA

The angle of arrival (AOA) of MS signals at a BS can be obtained by antenna arrays. Unlike TOA/TDOA based location methods, we do not need to consider timing synchronization
  • 108. 95 problems for an AOA based location algorithm. What AOA has in common with TOA/TDOA is that we have to fuse the measurements, whether TOA/TDOA or AOA, into the triangular relations between the BSs and the mobile user, i.e., the target. Suppose that the AOA measurement data are of the form $\hat{\beta}_k = \beta_k + \delta\beta_k$. Recall that $\tan(\beta_k) = (y_T - y_k)/(x_T - x_k)$. That is, $\beta_k = \beta_k(x_T, y_T)$. We again assume that $\{\delta\beta_k\}$ are uncorrelated and Gaussian distributed with mean zero and variance $\sigma_\beta^2$. Their joint PDF is given by

$$p_{\Delta\beta}(\delta\beta) = \frac{1}{\sqrt{(2\pi)^{N_b}}\,\sigma_\beta^{N_b}} \exp\left\{ -\frac{1}{2\sigma_\beta^2} \sum_{k=1}^{N_b} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2 \right\}. \qquad (3.59)$$

Since the AOA measurements are associated with additive Gaussian noise, it is easy to compute the Fisher information matrix, whose inverse is the Cramer-Rao bound for the covariance matrix of the estimation error. Simply speaking, the larger the Fisher information matrix, the smaller the estimation error variance, which translates into a better estimator in terms of accuracy, provided that it is unbiased. The Fisher information matrix captures the relative rate (derivative) at which the probability density function changes with respect to the parameters. Note that the greater the expected change at a given value, say $(\hat{x}_T, \hat{y}_T)$, the easier it is to distinguish $(\hat{x}_T, \hat{y}_T)$ from neighboring values (locations), and hence the more precisely $(x_T, y_T)$ can be estimated at $(x_T, y_T) = (\hat{x}_T, \hat{y}_T)$. To calculate the Fisher information matrix, we again use the Slepian-Bangs formula as in (3.32). First, some primary
  • 109. 96 computations are carried out as

$$\frac{\partial\beta_k(x_T, y_T)}{\partial x_T} = \frac{\partial}{\partial x_T} \tan^{-1}\left( \frac{y_T - y_k}{x_T - x_k} \right) = -\frac{y_T - y_k}{R_{k,T}^2}; \qquad \frac{\partial\beta_k(x_T, y_T)}{\partial y_T} = \frac{\partial}{\partial y_T} \tan^{-1}\left( \frac{y_T - y_k}{x_T - x_k} \right) = \frac{x_T - x_k}{R_{k,T}^2}. \qquad (3.60)$$

We also know that the mean vector is $m_\beta = [\beta_1, \beta_2, \ldots, \beta_{N_b}]^T$ and the covariance matrix is $M_\beta = \sigma_\beta^2 I_{N_b}$. With these preliminary results, the Fisher information matrix of the AOA measurements is given by

$$J_{\rm aoa} = \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2} \begin{bmatrix} -\frac{y_T - y_k}{R_{k,T}(x_T, y_T)} \\ \frac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix} \begin{bmatrix} -\frac{y_T - y_k}{R_{k,T}(x_T, y_T)}, & \frac{x_T - x_k}{R_{k,T}(x_T, y_T)} \end{bmatrix}$$

$$= \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}(x_T, y_T)^2} \begin{bmatrix} -\sin(\beta_k) \\ \cos(\beta_k) \end{bmatrix} \begin{bmatrix} -\sin(\beta_k) & \cos(\beta_k) \end{bmatrix}. \qquad (3.61)$$

With the information matrix above, we can calculate the Cramer-Rao bound (CRB) easily. Among all unbiased estimators, the ML estimator comes closest to the CRB asymptotically. The ML algorithm minimizes the negative log-likelihood function, which up to constants has the form

$$L_{\Delta\beta}(x_T, y_T) = \sum_{k=1}^{N_b} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2. \qquad (3.62)$$

Then the necessary condition for $(x_T^*, y_T^*)$ to be the ML solution is

$$\sum_{k=1}^{N_b} \frac{1}{R_{k,T}} \begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix} \left( \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \qquad (3.63)$$

No matter how many minimum points the nonlinear likelihood function may have, the true ML solution $(x_T^*, y_T^*)$ must be one of them, so the partial derivatives of $L_{\Delta\beta}(x_T, y_T)$ with respect to $x_T$ and $y_T$ vanish at $(x_T^*, y_T^*)$. Again this is a difficult nonlinear optimization problem and multiple solutions may exist. Thus we turn our attention to the LS-type algorithm before solving for the ML solution.
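To make (3.61) concrete, the following sketch evaluates $J_{\rm aoa}$ and its inverse for a small, hypothetical two-station geometry. The station coordinates, the target position and the value $\sigma_\beta = 0.01$ rad are assumptions chosen so that the result is easy to verify by hand; `math.atan2` is used so that $\beta_k$ lands in the correct quadrant.

```python
import math


def aoa_fisher_information(bs, target, sigma_beta):
    """Fisher information matrix (3.61) for AOA measurements taken at every BS."""
    xT, yT = target
    J = [[0.0, 0.0], [0.0, 0.0]]
    for xk, yk in bs:
        beta = math.atan2(yT - yk, xT - xk)      # tan(beta_k) = (yT - yk)/(xT - xk)
        R2 = (xT - xk) ** 2 + (yT - yk) ** 2     # R_{k,T}^2
        u = (-math.sin(beta), math.cos(beta))
        w = 1.0 / (sigma_beta ** 2 * R2)
        for i in range(2):
            for j in range(2):
                J[i][j] += w * u[i] * u[j]
    return J


def invert_2x2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


# Two BSs at right angles, each 100 m from a target at the origin.
bs = [(-100.0, 0.0), (0.0, -100.0)]
J_aoa = aoa_fisher_information(bs, target=(0.0, 0.0), sigma_beta=0.01)
crb = invert_2x2(J_aoa)
rmse_bound = math.sqrt(crb[0][0] + crb[1][1])    # bound on the RMS position error
```

In this layout each station constrains the coordinate perpendicular to its line of sight, so the bound is $\sigma_\beta^2 R_{k,T}^2 = 1\ {\rm m}^2$ per coordinate. The $1/R_{k,T}^2$ weight in (3.61) shows why AOA accuracy degrades with distance from the base stations.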
  • 110. 97 Recall that the AOA measurements are given by $\hat{\beta}_k = \beta_k + \delta\beta_k$, or $\delta\beta_k = \hat{\beta}_k - \beta_k$. Hence $R_{k,T}\sin(\delta\beta_k) = R_{k,T}\sin(\hat{\beta}_k - \beta_k)$ and thus

$$R_{k,T}\sin(\delta\beta_k) = R_{k,T}\sin(\hat{\beta}_k)\cos(\beta_k) - R_{k,T}\cos(\hat{\beta}_k)\sin(\beta_k) = \Delta x_k\sin(\hat{\beta}_k) - \Delta y_k\cos(\hat{\beta}_k), \qquad (3.64)$$

where $\Delta x_k = x_T - x_k$, $\Delta y_k = y_T - y_k$, and $R_{k,T} = \sqrt{\Delta x_k^2 + \Delta y_k^2}$. It follows that

$$\varphi_k = -x_k\sin(\hat{\beta}_k) + y_k\cos(\hat{\beta}_k) = -x_T\sin(\hat{\beta}_k) + y_T\cos(\hat{\beta}_k) + R_{k,T}\sin(\delta\beta_k). \qquad (3.65)$$

We can regard $\varphi_k$ as a pseudo-measurement constructed from the real measurement data $\hat{\beta}_k$ and the known BS location $(x_k, y_k)$. For the term $R_{k,T}\sin(\delta\beta_k)$ on the right side of (3.65), we argue that even though $\{\sin(\delta\beta_k)\}$ are not Gaussian, they are close to Gaussian distributed provided that the variance $\sigma_\beta^2$ is adequately small. Indeed, with $z = \sin(\delta\beta)$ [73],

$$p_Z(z) = \sum_{k=-\infty}^{\infty} \frac{\exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) + 2k\pi \right]^2 \right\} + \exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) + (2k + 1)\pi \right]^2 \right\}}{\sqrt{2\pi\sigma_\beta^2 (1 - z^2)}} \qquad (3.66)$$

for $|z| \le 1$ and $p_Z(z) = 0$ for $|z| > 1$. When $\sigma_\beta^2$ is sufficiently small, there holds

$$p_Z(z) \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}} \exp\left\{ -\frac{1}{2\sigma_\beta^2}\left[ \sin^{-1}(z) \right]^2 \right\} \approx \frac{1}{\sqrt{2\pi\sigma_\beta^2}}\, e^{-\frac{z^2}{2\sigma_\beta^2}} \qquad (3.67)$$

for $z \approx 0$. The above implies that $R_{k,T}\sin(\delta\beta_k)$ behaves like a GRV under the condition that $\delta\beta_k$ is very small. This can also be seen from the approximation $\sin(\delta\beta_k) \approx \delta\beta_k$ for very small $\delta\beta_k$, so that $\sin(\delta\beta_k)$ and $\delta\beta_k$ have almost the same PDF. We also argue that the probability of $|\delta\beta| \ge \pi/2$ is zero in practice; otherwise the measured angle of arrival would point in a completely wrong direction. Hence
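  • 111. 98 the PDF of $\delta\beta$ has a shape similar to the normal density but tends to zero at $|\delta\beta| = \pi/2$ and beyond, which implies that $p_Z(z)$ behaves closely to a Gaussian density. If $\delta\beta_k$ is taken to be exactly normal, the variance of $\sin(\delta\beta_k)$ can be computed exactly as

$$E\{\sin^2(\delta\beta_k)\} = \frac{1}{2} E\{1 - \cos(2\delta\beta_k)\} = \frac{1}{2} - \frac{1}{4} E\{e^{j2\delta\beta_k} + e^{-j2\delta\beta_k}\} = \frac{1}{2}\left( 1 - e^{-2\sigma_\beta^2} \right) \approx \sigma_\beta^2 \qquad (3.68)$$

for the case when $\sigma_\beta^2$ is sufficiently small. Now the linear equations in (3.65) are of the form

$$\begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \vdots \\ \varphi_{N_b} \end{bmatrix} = \begin{bmatrix} -\sin(\hat{\beta}_1) & \cos(\hat{\beta}_1) \\ -\sin(\hat{\beta}_2) & \cos(\hat{\beta}_2) \\ \vdots & \vdots \\ -\sin(\hat{\beta}_{N_b}) & \cos(\hat{\beta}_{N_b}) \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix} + \begin{bmatrix} R_{1,T}\sin(\delta\beta_1) \\ R_{2,T}\sin(\delta\beta_2) \\ \vdots \\ R_{N_b,T}\sin(\delta\beta_{N_b}) \end{bmatrix}. \qquad (3.69)$$

The noise vector on the right-hand side is denoted by $\eta_2 = \left[ R_{1,T}\sin(\delta\beta_1), R_{2,T}\sin(\delta\beta_2), \ldots, R_{N_b,T}\sin(\delta\beta_{N_b}) \right]^T$. It has mean zero and a covariance matrix $W_2(R_{1,T})$ that is diagonal with the $k$-th element

$$R_{k,T}^2 \left( 1 - e^{-2\sigma_\beta^2} \right)/2 \approx R_{k,T}^2\sigma_\beta^2 = \left[ (x_T - x_k)^2 + (y_T - y_k)^2 \right]\sigma_\beta^2. \qquad (3.70)$$

With the Gaussian assumption on the noise vector and $\{\varphi_k\}$ as pseudo-measurements, (3.69) has the form

$$\varphi = H_2\theta + \eta_2 \;\Longrightarrow\; J_2 = \frac{1}{2} \left[ \varphi - H_2\theta \right]^T W_2^{-1}(R_{1,T}) \left[ \varphi - H_2\theta \right], \qquad (3.71)$$

where $J_2$ is the objective function. Minimization of $J_2$ corresponds to the ML algorithm. The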
  • 112. 99 ML solution is given by

$$\hat{\theta} = \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T})\,\varphi. \qquad (3.72)$$

However, since $W_2^{-1}(R_{1,T})$ involves the unknown $(x_T, y_T)$ and $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, the above does not give the ML solution explicitly. It is interesting to notice that the weighted LS problem in this subsection is again a constrained LS-type problem. Indeed, by noting that

$$R_{k,T}^2 = x_k^2 + y_k^2 + x_T^2 + y_T^2 - 2(x_k x_T + y_k y_T) = R_k^2 + R_T^2 - 2(x_k x_T + y_k y_T), \qquad (3.73)$$

where $R_T^2 = x_T^2 + y_T^2 = R_{1,T}^2$, we can multiply (3.72) from the left by $[x_k \;\; y_k]$ for $k = 2, \cdots, N_b$ to arrive at

$$\rho_{k,T} := x_k x_T + y_k y_T = \begin{bmatrix} x_k & y_k \end{bmatrix} \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T})\,\varphi. \qquad (3.74)$$

In addition, $R_{k,T}^2 = R_k^2 + R_T^2 - 2\rho_{k,T}$. Taking the squared norm of both sides of (3.72) yields

$$R_{1,T}^2 = \left\| \Phi_2(R_{1,T})\,\varphi \right\|^2, \qquad \Phi_2(R_{1,T}) = \left[ H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1} H_2^T W_2^{-1}(R_{1,T}). \qquad (3.75)$$

Consequently we have $N_b$ equations with $N_b$ unknowns $\{R_{k,T}\}_{k=2}^{N_b}$ and $R_{1,T}$. Although these are nonlinear equations, they can be manipulated to solve for at least one set of solutions for these $N_b$ unknowns. These solutions can be substituted back into (3.72) to yield the ML solution $(x_T, y_T)$. We note that for large $N_b$, the complexity of quasi-linear localization based on AOAs is much higher than that of the corresponding localization based on TDOAs. But if we have additional information on TDOAs,
  • 113. 100 then the complexity can be reduced tremendously, as will be studied in the next subsection.

Before ending this subsection we present the Fisher information matrix associated with the LS-type problem as posed in (3.69). With the Gaussian assumption on the noise vector $\eta_2$, the joint PDF has the expression

$$\text{PDF} = \frac{1}{\sqrt{(2\pi)^n \det[W_2(R_{1,T})]}} \exp\left\{ -\frac{1}{2} \left[ \varphi - H_2\theta \right]^T W_2^{-1}(R_{1,T}) \left[ \varphi - H_2\theta \right] \right\}, \qquad (3.76)$$

where $\varphi$ can be regarded as the pseudo-measurement vector. Hence $H_2\theta$ is the mean vector and $W_2(R_{1,T})$ is the covariance matrix. An application of the Slepian-Bangs formula gives the corresponding Fisher information matrix:

$$J_{\rm aoa,LS} = \sum_{k=1}^{n} \frac{2}{R_{k,T}^2} \begin{bmatrix} \cos(\beta_k) \\ \sin(\beta_k) \end{bmatrix} \begin{bmatrix} \cos(\beta_k) & \sin(\beta_k) \end{bmatrix} + \sum_{k=1}^{n} \frac{1}{R_{k,T}^2\sigma_\beta^2} \begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix} \begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix} \qquad (3.77)$$

$$\approx \sum_{k=1}^{n} \frac{1}{R_{k,T}^2\sigma_\beta^2} \begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix} \begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}, \qquad (3.78)$$

where a sufficiently small $\sigma_\beta^2$ is assumed. It is interesting to observe that the above approximate expression is the same as $J_{\rm aoa}$ in (3.61) except that $\{\beta_k\}$ are replaced by $\{\hat{\beta}_k\}$.

3.2.4 Location based on both TDOA and AOA

After discussing the location techniques based on either TDOA or AOA measurements in the previous two subsections, we now assume that both AOAs and TDOAs are available
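  • 114. 101 for target localization. Although this requires more information and data, and consequently increases the cost of a location system, the improved accuracy may justify the expense. Hence it is meaningful to study the location method based on a combination of TDOA and AOA in the case where redundant information is available and high location resolution is mandated. Assuming independence of the noises ($\delta t_k$ and $\delta\beta_k$) in measuring the TDOAs and AOAs, the joint PDF is

$$p_\Delta(\delta t, \delta\beta) = \frac{\exp\left\{ -\sum_{k=2}^{N_b} \frac{\left[ \Delta\hat{t}_{k,1} - \Delta t_{k,1}(x_T, y_T) \right]^2}{2\sigma_t^2} - \sum_{k=1}^{N_b} \frac{\left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2}{2\sigma_\beta^2} \right\}}{\sqrt{(2\pi)^{N_b - 1}}\,\sigma_t^{N_b - 1} \sqrt{(2\pi)^{N_b}}\,\sigma_\beta^{N_b}} = p_{\Delta t}(\delta t)\, p_{\Delta\beta}(\delta\beta). \qquad (3.79)$$

Because of the independence between $\{\delta t_k\}_{k=2}^{N_b}$ and $\{\delta\beta_k\}_{k=1}^{N_b}$, the Fisher information matrix has the expression

$$J_{\rm tdoa/aoa} = J_{\rm tdoa} + J_{\rm aoa}, \qquad (3.80)$$

where $J_{\rm tdoa}$ and $J_{\rm aoa}$ are the same as in (3.34) and (3.61), respectively. This can be easily shown [74], with $x = [x_T, y_T]^T$, by

$$J_{\rm tdoa/aoa} = E\left\{ \frac{\partial[\ln p_\Delta(\delta t, \delta\beta)]}{\partial x} \left( \frac{\partial[\ln p_\Delta(\delta t, \delta\beta)]}{\partial x} \right)^T \right\}$$

$$= E\left\{ \left[ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right] \left[ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} + \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right]^T \right\}$$

$$= E\left\{ \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} \left( \frac{\partial[\ln p_{\Delta t}(\delta t)]}{\partial x} \right)^T \right\} + E\left\{ \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \left( \frac{\partial[\ln p_{\Delta\beta}(\delta\beta)]}{\partial x} \right)^T \right\}$$

$$= J_{\rm tdoa} + J_{\rm aoa}, \qquad (3.81)$$

where the cross terms vanish because the two score vectors are independent and each has zero mean.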
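The additivity in (3.80) implies that the combined Cramer-Rao bound can never be worse than either individual bound. The sketch below illustrates this numerically by evaluating (3.34), (3.61) and their sum. The four-station layout and the noise levels (50 ns timing error, 0.02 rad angular error) are hypothetical values chosen only for illustration, and `bs[0]` is taken as the reference BS 1.

```python
import math


def fim_tdoa_aoa(bs, target, c, sigma_t, sigma_beta):
    """Return (J_tdoa, J_aoa, J_sum) from (3.34), (3.61) and (3.80)."""
    xT, yT = target
    geo = []
    for xk, yk in bs:
        R = math.hypot(xT - xk, yT - yk)
        geo.append(((xT - xk) / R, (yT - yk) / R, R))   # cos(beta_k), sin(beta_k), R_kT
    c1, s1, _ = geo[0]
    Jt = [[0.0, 0.0], [0.0, 0.0]]
    Ja = [[0.0, 0.0], [0.0, 0.0]]
    wt = 1.0 / (c ** 2 * sigma_t ** 2)
    for k, (ck, sk, R) in enumerate(geo):
        ua = (-sk, ck)                      # AOA direction vector in (3.61)
        wa = 1.0 / (sigma_beta ** 2 * R ** 2)
        ut = (ck - c1, sk - s1)             # TDOA direction vector in (3.34)
        for i in range(2):
            for j in range(2):
                Ja[i][j] += wa * ua[i] * ua[j]
                if k >= 1:                  # TDOAs exist only for k = 2..Nb
                    Jt[i][j] += wt * ut[i] * ut[j]
    Js = [[Jt[i][j] + Ja[i][j] for j in range(2)] for i in range(2)]
    return Jt, Ja, Js


def crb_trace(J):
    """Trace of J^{-1}: the CRB on the mean squared position error."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return (J[0][0] + J[1][1]) / det


bs = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
Jt, Ja, Js = fim_tdoa_aoa(bs, (300.0, 400.0), c=299792458.0,
                          sigma_t=50e-9, sigma_beta=0.02)
bounds = {name: crb_trace(J) for name, J in (("tdoa", Jt), ("aoa", Ja), ("both", Js))}
```

Comparing `bounds["both"]` with the two single-measurement bounds shows how much accuracy the extra measurement type can buy in a given geometry, which is the trade-off against system cost mentioned at the start of this subsection.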
  • 115. 102 With respect to the joint PDF in (3.79), the corresponding likelihood-type cost function in this case has the form

$$L(x_T, y_T) = \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \left[ c\Delta\hat{t}_{k,1} - R_{k,T}(x_T, y_T) + R_{1,T} \right]^2 + \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2} \left[ \hat{\beta}_k - \beta_k(x_T, y_T) \right]^2. \qquad (3.82)$$

The ML algorithm seeks the minimum of the above cost function, which corresponds to the maximum of the likelihood in (3.79). The necessary condition for it to achieve the minimum at $(x_T^*, y_T^*)$ is:

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \sum_{k=1}^{N_b} \frac{1}{\sigma_\beta^2 R_{k,T}} \begin{bmatrix} \sin(\beta_k) \\ -\cos(\beta_k) \end{bmatrix} \left( \hat{\beta}_k - \beta_k(x_T^*, y_T^*) \right) + \sum_{k=2}^{N_b} \frac{1}{c^2\sigma_t^2} \begin{bmatrix} \frac{x_k - x_T^*}{R_{k,T}} + \frac{x_T^*}{R_{1,T}} \\ \frac{y_k - y_T^*}{R_{k,T}} + \frac{y_T^*}{R_{1,T}} \end{bmatrix} \left( c\Delta\hat{t}_{k,1} - R_{k,T} + R_{1,T} \right). \qquad (3.83)$$

Newton-type algorithms can be applied to solve for the ML solution. Clearly, a solution to the above nonlinear equations is hard to compute and may not be the global optimum of $L(x_T, y_T)$. An alternative is the use of an LS-type algorithm as in the previous two subsections. One possible way is to compute the constrained LS solutions $(\hat{x}_T^{\rm (TDOA)}, \hat{y}_T^{\rm (TDOA)})$ and $(\hat{x}_T^{\rm (AOA)}, \hat{y}_T^{\rm (AOA)})$ based on TDOAs and AOAs separately as in the previous subsections and then combine the two as [53]

$$\hat{x}_T = \gamma\,\hat{x}_T^{\rm (AOA)} + (1 - \gamma)\,\hat{x}_T^{\rm (TDOA)}, \qquad \hat{y}_T = \gamma\,\hat{y}_T^{\rm (AOA)} + (1 - \gamma)\,\hat{y}_T^{\rm (TDOA)}, \qquad (3.84)$$

where $0 < \gamma < 1$. Note that $R_{k,T} = R_{1,T} + c\Delta t_{k,1}$ can be used in (3.69) to avoid computing $N_b$ unknowns with $N_b$ equations. Indeed the noise terms in (3.69) have
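zero mean and variance

$$
E\{\hat{R}_{k,T}^2 \sin^2(\hat{\beta}_k)\} = E\{[R_{1,T} + c\,\Delta t_{k,1} - c\,\delta t_k]^2 \sin^2(\hat{\beta}_k)\}
\approx \left[ (R_{1,T} + c\,\Delta t_{k,1})^2 + c^2\sigma_t^2 \right] \sigma_\beta^2 \tag{3.85}
$$

if $\sigma_\beta^2$ is sufficiently small. Hence only one unknown $R_{1,T}$ is involved and the $\rho_{k,T}$ are all eliminated, which helps to simplify the computation of the LS-type solution to the target localization problem based on measurements of AOAs. However, the determination of the optimal value of $\gamma$ is not easy. Hence we opt to compute the LS-type solution directly. Since both AOAs and TDOAs are available, we have the following linear equations:

$$
\begin{bmatrix} a - bR_{1,T} \\ \phi \end{bmatrix}
= \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \begin{bmatrix} x_T \\ y_T \end{bmatrix}
+ \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}. \tag{3.86}
$$

Under the independence assumption for the noises $\eta_1$ and $\eta_2$, we have

$$
E\left\{ \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \right\} = 0, \qquad
W = E\left\{ \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \begin{bmatrix} \eta_1^T & \eta_2^T \end{bmatrix} \right\}
= \begin{bmatrix} W_1(R_{1,T}) & 0 \\ 0 & W_2(R_{1,T}) \end{bmatrix}, \tag{3.87}
$$

where the $k$th diagonal element of $W_2(R_{1,T})$ is the same as in (3.85). By assuming uncorrelated Gaussian noise for $\eta_1$ and $\eta_2$, the ML solution to the estimation of $(x_T, y_T)$ can be computed by minimizing the following objective function, with $\theta = [\,x_T \ \ y_T\,]^T$:

$$
\begin{aligned}
J_{1,2} &= \tfrac{1}{2} \left( a - bR_{1,T} - H_1\theta \right)^T W_1^{-1}(R_{1,T}) \left( a - bR_{1,T} - H_1\theta \right) \\
&\quad + \tfrac{1}{2} \left( \phi - H_2\theta \right)^T W_2^{-1}(R_{1,T}) \left( \phi - H_2\theta \right) = J_1 + J_2.
\end{aligned} \tag{3.88}
$$

Taking the derivative of the cost function $J_{1,2}$ with respect to $\theta$, we have

$$
\frac{\partial J_{1,2}}{\partial \theta} = -(a - bR_{1,T} - H_1\theta)^T W_1^{-1} H_1 - (\phi - H_2\theta)^T W_2^{-1} H_2. \tag{3.89}
$$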
By letting $\partial J_{1,2} / \partial\theta = 0$, it can be easily shown that the minimizer of the cost function $J_{1,2}$ is given by

$$
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= \left[ H_1^T W_1^{-1}(R_{1,T}) H_1 + H_2^T W_2^{-1}(R_{1,T}) H_2 \right]^{-1}
\left[ H_1^T W_1^{-1}(R_{1,T}) \left( a - bR_{1,T} \right) + H_2^T W_2^{-1}(R_{1,T}) \phi \right]. \tag{3.90}
$$

Because the above solution involves the unknown $R_{1,T} = \sqrt{x_T^2 + y_T^2}$, we can again take the squared norm of both sides to obtain an equation for $R_{1,T}$ first. After computing its solution, the value of $R_{1,T}$ can be substituted into (3.90) to obtain the solution to the weighted LS problem. Note that $R_{1,T}$ is a positive real root of some nonlinear equation. One of the positive real roots corresponds to the constrained LS solution, which provides an initial guess for the true (nonlinear) ML solution.

Note that the optimal solution in (3.90) is not in the form of the convex combination of the two separate LS-type solutions as in (3.84). Rather it is in the form

$$
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= \Gamma \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix}
+ (I - \Gamma) \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix}, \tag{3.91}
$$

where $\Gamma$ is a matrix. Specifically the solution in (3.90) can be written as

$$
\begin{bmatrix} \hat{x}_T \\ \hat{y}_T \end{bmatrix}
= [A_1 + A_2]^{-1} [B_1 + B_2]
= \left[ I + A_1^{-1} A_2 \right]^{-1} \begin{bmatrix} \hat{x}_T^{(TDOA)} \\ \hat{y}_T^{(TDOA)} \end{bmatrix}
+ \left[ I + A_2^{-1} A_1 \right]^{-1} \begin{bmatrix} \hat{x}_T^{(AOA)} \\ \hat{y}_T^{(AOA)} \end{bmatrix}, \tag{3.92}
$$

where

$$
A_1 = H_1^T W_1^{-1}(R_{1,T}) H_1; \quad
A_2 = H_2^T W_2^{-1}(R_{1,T}) H_2; \quad
B_1 = H_1^T W_1^{-1}(R_{1,T}) \left( a - bR_{1,T} \right); \quad
B_2 = H_2^T W_2^{-1}(R_{1,T}) \phi.
$$
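The matrix-weighted combination in (3.91) and (3.92) can be checked numerically. The sketch below uses small hypothetical 2×2 matrices `A1`, `A2` and vectors `B1`, `B2` (illustrative values, not derived from any measurement model) and compares the direct solution $[A_1 + A_2]^{-1}[B_1 + B_2]$ with the form $\Gamma\,\hat{p}^{(TDOA)} + (I - \Gamma)\,\hat{p}^{(AOA)}$, taking $\Gamma = [A_1 + A_2]^{-1} A_1$.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mat_add(m, n):
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(m, n))


def mat_sub(m, n):
    return tuple(tuple(x - y for x, y in zip(r, s)) for r, s in zip(m, n))


def mat_mul(m, n):
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_vec(m, x):
    return tuple(sum(m[i][k] * x[k] for k in range(2)) for i in range(2))


IDENTITY = ((1.0, 0.0), (0.0, 1.0))

# Hypothetical normal-equation blocks (illustrative values only).
A1, B1 = ((4.0, 1.0), (1.0, 2.0)), (9.0, 4.0)   # TDOA part
A2, B2 = ((1.0, 0.0), (0.0, 3.0)), (1.0, 6.0)   # AOA part

p_tdoa = mat_vec(inv2(A1), B1)                  # separate TDOA LS solution
p_aoa = mat_vec(inv2(A2), B2)                   # separate AOA LS solution

# Direct fused solution, as in (3.90).
fused_direct = mat_vec(inv2(mat_add(A1, A2)), tuple(u + w for u, w in zip(B1, B2)))

# Matrix-weighted form, as in (3.91).
gamma = mat_mul(inv2(mat_add(A1, A2)), A1)
term_tdoa = mat_vec(gamma, p_tdoa)
term_aoa = mat_vec(mat_sub(IDENTITY, gamma), p_aoa)
fused_gamma = tuple(u + w for u, w in zip(term_tdoa, term_aoa))
```

In this example `gamma` is a full matrix with nonzero off-diagonal entries, so the fused estimate is not a scalar convex combination of the two separate estimates.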
Hence $A_1^{-1} B_1$ and $A_2^{-1} B_2$ are the LS-type solutions based on TDOAs and AOAs, respectively. Now it is straightforward to show that

$$
\left[ I + A_1^{-1} A_2 \right]^{-1} + \left[ I + A_2^{-1} A_1 \right]^{-1}
= [A_1 + A_2]^{-1} A_1 + [A_1 + A_2]^{-1} A_2 = \Gamma + [I - \Gamma] = I. \tag{3.93}
$$

Even though the LS solution in (3.90) is some kind of combination of the two separate LS solutions in (3.46) and (3.72), the unknown $R_{1,T}$ has to be computed based on (3.90). Finally the Fisher information matrix associated with the linear model in (3.86) is

$$
\begin{aligned}
P_{tdoa/aoa-fim,LS} &= \sum_{k=2}^{N_b} \frac{1}{\sigma_t^2 (\alpha_k^2 + 2\sigma_t^2)}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} 2x_k/c^2 + b_k \cos(\beta_1) \\ 2y_k/c^2 + b_k \sin(\beta_1) \end{bmatrix}^T \\
&\quad + \sum_{k=2}^{N_b} \frac{8\alpha_k^2}{c^2 (\alpha_k^2 + 2\sigma_t^2)^2}
\begin{bmatrix} \cos(\beta_1) \\ \sin(\beta_1) \end{bmatrix}
\begin{bmatrix} \cos(\beta_1) & \sin(\beta_1) \end{bmatrix} \\
&\quad + \sum_{k=1}^{N_b} \frac{1}{R_{k,T}^2 \sigma_\beta^2}
\begin{bmatrix} -\sin(\hat{\beta}_k) \\ \cos(\hat{\beta}_k) \end{bmatrix}
\begin{bmatrix} -\sin(\hat{\beta}_k) & \cos(\hat{\beta}_k) \end{bmatrix}
\end{aligned} \tag{3.94}
$$

under the uncorrelated Gaussian assumption and sufficiently small $\sigma_t^2$ and $\sigma_\beta^2$.

3.3 Constrained Least-square Optimization

As shown in (3.46), the weighted LS solution $\hat{\theta}$ is constrained by

$$
R_{1,T}^2 = \left\| \Phi_1 \left[ a - bR_{1,T} \right] \right\|_2^2, \tag{3.95}
$$

from which some solutions $R_{1,T}$ can be solved. If there exist real solutions $R_{1,T}$, they can be substituted back into $J_1$ in (3.45) to obtain the optimal solution $R_{1,T}$, based on
which the optimal solution $\hat{\theta}$ can be obtained. However, (3.95) may not admit a real solution $R_{1,T}$ due to the existence of noise in the observations. More specifically, writing $\Phi$ for $\Phi_1$, (3.95) is equivalent to the quadratic equation

$$
(b^T \Phi^T \Phi b - 1) R_{1,T}^2 - 2 a^T \Phi^T \Phi b\, R_{1,T} + a^T \Phi^T \Phi a = 0, \tag{3.96}
$$

which admits a real solution if and only if

$$
(a^T \Phi^T \Phi b)^2 + a^T \Phi^T \Phi a - (a^T \Phi^T \Phi a)(b^T \Phi^T \Phi b) \geq 0. \tag{3.97}
$$

That is, (3.95) admits a real solution $R_{1,T}$ if and only if (3.97) holds. Simulation in [54] shows that the location estimate in (3.46) is very accurate if the condition (3.97) holds; otherwise the location estimate is far away from the true location. The question is what to do if (3.97) does not hold, which can happen generically due to the existence of noise in the TDOA and AOA measurements. Let us examine (3.45) again by rewriting $J_1$ as

$$
J_1 = \frac{1}{2} \left( a - \begin{bmatrix} H_1 & b \end{bmatrix} \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix} \right)^T W_1^{-1}
\left( a - \begin{bmatrix} H_1 & b \end{bmatrix} \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix} \right), \tag{3.98}
$$

where $p_T = [\,x_T \ \ y_T\,]^T$. The nonlinear estimation problem aims to search for $\hat{p}_T$ and $\hat{R}_{1,T}$ such that $J_1$ is minimized, subject to the constraint $R_{1,T} = \|p_T\|$. Denote $\Sigma = W_1$ and

$$
A = \begin{bmatrix} H_1 & b \end{bmatrix}, \quad \phi = a, \quad
\theta = \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix}, \quad
Q = \begin{bmatrix} -I_2 & 0 \\ 0 & 1 \end{bmatrix}.
$$

Then we have the following more general constrained LS optimization problem:

$$
\min_{\theta^T Q \theta = 0} J_1, \qquad J_1 = \tfrac{1}{2} (A\theta - \phi)^T \Sigma^{-1} (A\theta - \phi). \tag{3.99}
$$
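The real-root test in (3.96) and (3.97) only needs three scalars. A minimal sketch follows, assuming these scalars have already been computed; the argument names `aa`, `ab` and `bb` are hypothetical shorthand for $a^T\Phi^T\Phi a$, $a^T\Phi^T\Phi b$ and $b^T\Phi^T\Phi b$.

```python
import math


def range_roots(aa, ab, bb):
    """Real roots R of (bb - 1) R^2 - 2 ab R + aa = 0, cf. (3.96).

    Returns an empty list when the condition (3.97) fails.
    """
    disc = ab * ab + aa - aa * bb      # left-hand side of (3.97)
    lead = bb - 1.0
    if lead == 0.0:
        # Degenerate case: the equation is linear in R.
        return [aa / (2.0 * ab)] if ab != 0.0 else []
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return sorted([(ab - s) / lead, (ab + s) / lead])


def positive_range_roots(aa, ab, bb):
    """Keep only the positive roots, since R_{1,T} is a distance."""
    return [r for r in range_roots(aa, ab, bb) if r > 0.0]


# Illustrative numbers: R^2 - 6R + 5 = 0 has roots 1 and 5.
roots_ok = positive_range_roots(5.0, 3.0, 2.0)
# Illustrative numbers violating (3.97): no real root exists.
roots_none = positive_range_roots(5.0, 1.0, 2.0)
```

An empty result corresponds to the case discussed above in which noise makes (3.97) fail, which is what motivates the Lagrange multiplier formulation.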
We will develop a solution algorithm for such a constrained LS optimization problem in the following. Assume that $\Sigma$ is positive definite, $A$ has full column rank and $Q$ is nonsingular with both positive and negative eigenvalues, i.e., $Q$ is indefinite. We employ a Lagrange multiplier to develop the solution algorithm. Let $\lambda$ be real and consider

$$
J = \frac{1}{2} \left[ (A\theta - \phi)^T \Sigma^{-1} (A\theta - \phi) + \lambda\, \theta^T Q \theta \right]. \tag{3.100}
$$

Then the necessary condition for optimality yields

$$
A^T \Sigma^{-1} [A\theta - \phi] + \lambda Q \theta = 0 \iff \theta = [A^T \Sigma^{-1} A + \lambda Q]^{-1} A^T \Sigma^{-1} \phi. \tag{3.101}
$$

An optimal solution needs to satisfy the constraint $\theta^T Q \theta = 0$, leading to

$$
\phi^T \Sigma^{-1} A [A^T \Sigma^{-1} A + \lambda Q]^{-1} Q [A^T \Sigma^{-1} A + \lambda Q]^{-1} A^T \Sigma^{-1} \phi = 0. \tag{3.102}
$$

The solution algorithm hinges on the computation of the real roots $\lambda$ of the above equation, and there can be more than one such real root. We employ the result of simultaneous diagonalization. Because $\Sigma = \Sigma^T > 0$ (so that $A^T \Sigma^{-1} A > 0$ when $A$ has full column rank) and $Q = Q^T$, there exists a nonsingular matrix $S$ such that $A^T \Sigma^{-1} A = S D_\Sigma S^T$ and $Q = S D_Q S^T$, where $D_\Sigma$ and $D_Q$ are both diagonal. It is noted that $D_\Sigma$ and $D_Q$ have the same inertia as $A^T \Sigma^{-1} A$ and $Q$, respectively. It follows that (3.102) is equivalent to

$$
(S^{-1} A^T \Sigma^{-1} \phi)^T (\lambda I + D_\Sigma D_Q^{-1})^{-1} D_Q^{-1} (\lambda I + D_\Sigma D_Q^{-1})^{-1} (S^{-1} A^T \Sigma^{-1} \phi) = 0. \tag{3.103}
$$

Let $D_Q^{-1} = \mathrm{diag}(q_1, q_2, \ldots, q_l)$ with $l \times l$ the size of $Q$. Then it has the same number of negative and positive elements as $D = D_\Sigma D_Q^{-1} = \mathrm{diag}(d_1, d_2, \ldots, d_l)$ by the positivity
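of $\Sigma$ and $D_\Sigma$. In fact, $q_i d_i > 0$. The matrices $S$ and $D$ can be obtained by the eigenvalue decomposition $A^T \Sigma^{-1} A Q^{-1} = S D S^{-1}$. Let $v_i$ be the $i$-th element of $S^{-1} A^T \Sigma^{-1} \phi$. Then (3.103) is converted into the following:

$$
(S^{-1} A^T \Sigma^{-1} \phi)^T (\lambda I + D_\Sigma D_Q^{-1})^{-1} D_Q^{-1} (\lambda I + D_\Sigma D_Q^{-1})^{-1} (S^{-1} A^T \Sigma^{-1} \phi)
= \sum_{i=1}^{l} \frac{q_i v_i^2}{(\lambda + d_i)^2} = 0. \tag{3.104}
$$

We comment that the above has real roots, which can be seen by examining the summation at $\lambda \approx -d_i$ and by the fact that the $\{q_i\}$ take both positive and negative values but not zero (recall the assumption on $Q$). However, there are only finitely many real $\lambda$ values satisfying (3.104), which are denoted by $\{\lambda_k\}$. Now by (3.101),

$$
\begin{aligned}
A\theta - \phi &= \left[ A (A^T \Sigma^{-1} A + \lambda_k Q)^{-1} A^T \Sigma^{-1} - I \right] \phi \\
&= \left[ A Q^{-1} (\lambda_k I + A^T \Sigma^{-1} A Q^{-1})^{-1} A^T \Sigma^{-1} - I \right] \phi \\
&= \left[ (\lambda_k I + A Q^{-1} A^T \Sigma^{-1})^{-1} A Q^{-1} A^T \Sigma^{-1} - I \right] \phi \\
&= -\lambda_k (\lambda_k I + A Q^{-1} A^T \Sigma^{-1})^{-1} \phi
 = -\lambda_k \Sigma (\lambda_k \Sigma + A Q^{-1} A^T)^{-1} \phi.
\end{aligned} \tag{3.105}
$$

Substituting the above into the performance index $J$ in (3.100) leads to

$$
2J = \lambda_k^2\, \phi^T (\lambda_k \Sigma + A Q^{-1} A^T)^{-1} \Sigma (\lambda_k \Sigma + A Q^{-1} A^T)^{-1} \phi. \tag{3.106}
$$

Let $\lambda_k^{opt}$ be the value that minimizes $J$ over $\{\lambda_k\}$. Then in light of (3.101), the optimal $\theta$ is obtained as

$$
\theta_{opt} = [A^T \Sigma^{-1} A + \lambda_k^{opt} Q]^{-1} A^T \Sigma^{-1} \phi. \tag{3.107}
$$

To facilitate the MATLAB programming in simulation for the computation of the roots, we can convert (3.104) to

$$
\sum_{i=1}^{l} q_i v_i^2 \prod_{k \neq i} (\lambda + d_k)^2 = 0. \tag{3.108}
$$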
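The polynomial form (3.108) is convenient because its coefficients can be assembled once and handed to any polynomial root finder. The sketch below is written in Python rather than MATLAB and only builds and evaluates the polynomial; the function names and the small test values of `q`, `v` and `d` are illustrative assumptions, chosen so that $q_i d_i > 0$ holds.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def multiplier_polynomial(q, v, d):
    """Coefficients of sum_i q_i v_i^2 prod_{k != i} (lam + d_k)^2, cf. (3.108)."""
    l = len(q)
    total = [0.0] * (2 * (l - 1) + 1)
    for i in range(l):
        term = [q[i] * v[i] ** 2]
        for k in range(l):
            if k != i:
                # (lam + d_k)^2 = lam^2 + 2 d_k lam + d_k^2
                term = poly_mul(term, [1.0, 2.0 * d[k], d[k] ** 2])
        total = [s + t for s, t in zip(total, term)]
    return total


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest power first) with Horner's rule."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


# Illustrative data with l = 2: the polynomial is 3 lam^2 - 10 lam + 3.
coeffs = multiplier_polynomial([1.0, -1.0], [2.0, 1.0], [1.0, -1.0])
```

For these values the real roots are $\lambda = 3$ and $\lambda = 1/3$, and both also satisfy the rational form (3.104); each real root would then be substituted into (3.106) to pick the one with the smallest cost.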
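The solution algorithm above is developed for location estimation when only TDOA measurements are available. If both TDOA and AOA measurements are collected, as discussed in the previous section, the extra redundancy indicates an improved accuracy. According to (3.86), we can formulate the problem as a similar constrained LS optimization problem. Denote

$$
\Sigma = \begin{bmatrix} W_1 & 0 \\ 0 & W_2 \end{bmatrix}; \quad
A = \begin{bmatrix} H_1 & b \\ H_2 & 0 \end{bmatrix}; \quad
\theta = \begin{bmatrix} p_T \\ R_{1,T} \end{bmatrix}; \quad
\phi = \begin{bmatrix} a \\ \varphi \end{bmatrix}; \quad
Q = \begin{bmatrix} -I_2 & 0 \\ 0 & 1 \end{bmatrix}, \tag{3.109}
$$

where $\varphi$ is the AOA data vector that appears in the lower block of the left-hand side of (3.86). Then we can use the same Lagrange multiplier method to give a solution.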
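As a quick sanity check on the constraint matrix in (3.109), the quadratic form $\theta^T Q \theta$ reduces to $R_{1,T}^2 - x_T^2 - y_T^2$, so it vanishes exactly when $R_{1,T}$ equals the distance of the target from the origin. A minimal sketch, with a hypothetical helper name and made-up coordinates:

```python
def constraint_value(theta):
    """theta^T Q theta for Q = diag(-1, -1, 1) and theta = (x_T, y_T, R_1T)."""
    x, y, r = theta
    return -x * x - y * y + r * r


feasible = constraint_value((3.0, 4.0, 5.0))     # R equals the distance to the origin
infeasible = constraint_value((3.0, 4.0, 6.0))   # R is too large by one unit
```

Note that the quadratic form does not distinguish the sign of $R_{1,T}$, so a candidate with negative $R_{1,T}$ would still have to be discarded separately.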
3.4 Simulations

In this section, we present a set of simulation results that demonstrate the performance of our proposed estimation algorithm.

In the simulation, there are nine base stations which are equally spaced around a circle. In a real WiMax system, the base stations may not be located exactly on a circle. The circular layout is chosen simply for ease of presentation and is not required by our algorithm, which is applicable to any geographical distribution of any number of base stations. To test the accuracy of our location method, ten positions for the mobile user are chosen and they are distributed around a smaller circle. This assumption about the MS route is likewise made for ease of demonstration. The configuration is shown in Figure 3.5.

Figure 3.5: Base stations and mobile user locations (BS1 to BS9 and MS1 to MS10; axes: x and y in meters).
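The base-station geometry can be generated programmatically. The sketch below assumes one station at the origin plus eight stations equally spaced on a circle of radius 32000 m, which is the layout suggested by the coordinates listed for this simulation; the function name and the rounding to whole meters are illustrative choices.

```python
import math


def base_station_layout(radius_m, n_ring):
    """One station at the origin plus n_ring stations equally spaced on a circle."""
    stations = [(0, 0)]
    for k in range(n_ring):
        angle = 2.0 * math.pi * k / n_ring
        x = round(radius_m * math.cos(angle))   # rounded to whole meters
        y = round(radius_m * math.sin(angle))
        stations.append((x, y))
    return stations


BS = base_station_layout(32000.0, 8)
```

Changing `radius_m` or `n_ring` gives other test layouts, which is consistent with the remark that the algorithm does not depend on the circular arrangement.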
The base stations are at BS1 = [0, 0]^T, BS2 = [32000, 0]^T, BS3 = [22627, 22627]^T, BS4 = [0, 32000]^T, BS5 = [−22627, 22627]^T, BS6 = [−32000, 0]^T, BS7 = [−22627, −22627]^T, BS8 = [0, −32000]^T, BS9 = [22627, −22627]^T. The unit is meters. For each MS position, a total of 2000 different data sets are run and the MS location is obtained by averaging over all 2000 estimates.

In the experiments, our location algorithm is simulated for TDOA data only and for a combination of AOA and TDOA data, respectively. In Figure 3.6, the green line is the result from a combination of AOA and TDOA data when the SNRs are SNRtdoa = 20 dB and SNRaoa = 20 dB, respectively. It almost merges with the blue line, which represents the real MS positions and is therefore invisible in the figure. This shows the high accuracy of the estimation algorithm we propose in this thesis. With the same SNRtdoa = 20 dB, the cyan line is the estimation result from the TDOA data only. It can be seen that there is a small deviation from the real position. Intuitively, with the extra information from the AOA measurements, the result in the green line is expected to be closer to the real positions. From the Fisher information matrices calculated in the previous sections, the Cramer-Rao bound for the combined TDOA and AOA data should be smaller than that for TDOA data only.

To have a closer look at the performance of the proposed algorithm, we calculate the approximate mean and standard deviation of the estimation error, i.e., the distance between the real MS position and the estimated position. It is obtained from a sample space of 2000 data points. In Figure 3.7, the average estimation error is less
than 4 meters for all ten MS locations when the TDOA data is of high SNR.

Figure 3.6: Location estimation with TDOA-only and AOA+TDOA data (legend: BS, Known, TDOA, AOA+TDOA; axes: x and y in meters).

Figure 3.7: Location estimation performance (TDOA; mean and standard deviation of the estimation error in meters versus mobile station position).

To study the effect of SNR on the performance of the proposed location algorithm, the MS position MS2 is randomly selected, and the mean and the standard deviation of the estimation error vary with SNR as shown in Figure 3.8. It is easily seen that at a low SNR the estimation is not accurate enough, because our assumption about the measurement noise variance is not valid.

Figure 3.8: Effect of SNR on estimation accuracy (TDOA; mean and standard deviation of the estimation error in meters versus SNR in dB).

FCC regulations require that, for 67% of E911 calls, the wireless service providers must provide an estimated location with a location error below 100 m. As shown in Figure 3.9, the location error is below 100 m for 98% of the time with SNRtdoa = 40 dB, which is well above the FCC requirement.

Figure 3.9: Outage curve for location accuracy (AOA+TDOA; 1 − outage in % versus location error in meters, with the FCC requirement marked).

From the above figures, it is demonstrated that the proposed algorithm can provide accurate estimation of the MS location. It also meets the FCC requirement for outdoor network-based wireless location.

3.5 Chapter Summary

In this chapter, an introduction to WiMax networks, the evolution of their IEEE standards and their applications is given, and the outdoor/indoor wireless location technologies based on measurements of TOAs, TDOAs, AOAs and amplitudes are reviewed.

With measurements of TDOA and AOA available, we present a constrained LS-type algorithm to estimate the target location. The proposed method is different from the commonly used ML algorithm, though the latter is heavily preferred in
some applications for its superior performance. Because of the large number of observation data and the additive measurement noise, maximizing the likelihood function involves a great amount of computational load. It does not even guarantee that the optimal estimate can be obtained, due to the existence of local extrema. Under the assumption of zero-mean additive Gaussian noise with a very small variance, the location estimation problem is formulated into a quasi-linear form, which is solvable by the LS algorithm. The assumption is usually valid, as in [54]. Therefore, our method retains the preferable properties of the ML algorithm in the sense that it approaches the Cramer-Rao bound with a large sample of observation data. More importantly, the computational complexity is reduced by the LS algorithm. As shown in this chapter, the LS algorithm also involves the constraint $\|\hat{\theta}\| = R_{1,T}$. The target location can only be obtained by substituting the intermediate LS solution into the constraint and solving the resultant quadratic equation, which brings complexity back to the solution. Hence the Lagrange multiplier is explored to solve the above constrained LS optimization problem. The simulation results show that our scheme is effective in location estimation.
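The FCC check used in Section 3.4 amounts to computing the percentage of location errors that stay below a threshold. A minimal sketch follows; the function names and the sample error values are hypothetical, while the 100 m threshold and the 67% requirement are the figures quoted in Section 3.4.

```python
def percent_within(errors_m, threshold_m):
    """Percentage of location errors (in meters) that do not exceed threshold_m."""
    if not errors_m:
        raise ValueError("no error samples given")
    hits = sum(1 for e in errors_m if e <= threshold_m)
    return 100.0 * hits / len(errors_m)


def meets_requirement(errors_m, threshold_m=100.0, required_pct=67.0):
    """True if enough of the errors fall within the threshold."""
    return percent_within(errors_m, threshold_m) >= required_pct


# Hypothetical error samples in meters (illustrative values only).
sample_errors = [12.0, 35.5, 80.0, 99.0, 150.0]
coverage = percent_within(sample_errors, 100.0)
passes = meets_requirement(sample_errors)
```

Evaluating `percent_within` over a range of thresholds would trace out a curve of the kind shown in Figure 3.9.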
Chapter 4

Conclusions

This dissertation, in the first part, addresses the problem of channel estimation for MIMO-OFDM systems. It starts from the matrix representation of the signal model of MIMO-OFDM systems, which clearly describes the relation between signals in the frequency domain and the time domain and expresses operations such as adding and removing the CP as matrix products. From the resulting MIMO-OFDM signal model, a pilot tone based channel estimation is proposed to estimate the fast time-varying and frequency-selective fading channel via the least-squares method. The least-squares method is selected for its low complexity, though some other methods such as MMSE and ML may produce better estimation performance. To further reduce the computational complexity, the pilot tone matrix is designed as a unitary matrix to save the computation of the matrix inversion in the standard LS solution.

The pilot tone matrix is designed in a simple way: $N_t$ disjoint pilot tone sets are placed in one OFDM block on each transmit antenna. Each pilot tone set has $L$ pilot tones which are equally spaced and equally powered. By choosing the pilot tones based on our design, those pilot tones comprise a unitary matrix. For a simple 2 × 2 case, Alamouti's orthogonal structure is exploited. The design can be readily extended to a configurable MIMO-OFDM system with any number of transmit and receive antennas. For a fixed power of pilot tones, our design can also be proved to be optimal in the sense of achieving the minimum MSE of channel estimation. Compared with some related pilot tone designs in the literature, our channel estimation method differs in its ability to estimate fast time-varying wireless channels, since pilot tones are inserted into each OFDM block, and in its explicit relation with space-frequency code design, which can benefit the channel estimation in return.

Seeking a robust channel estimator with lower complexity for MIMO-OFDM systems, we are looking at the following aspects in the future.

• Less overhead loss: It is worth noting that the use of pilot symbols for channel estimation decreases the spectrum efficiency of wireless communication systems. There is a trade-off between data throughput and estimation accuracy. It is of interest to investigate a scheme with even fewer pilot tones in each OFDM block by exploiting some statistical properties of the wireless channel itself. Intuitively, the best balance between overhead loss and estimation reliability is achieved if we can adaptively change the number of pilot tones depending on the channel condition through some feedback information.

• Joint channel estimation and CFO correction: Usually when we design the channel estimator, we assume that the OFDM system is perfectly synchronized and
there is no carrier frequency offset at all. Likewise, some CFO compensation algorithms are based on the assumption that the channel is known at the receiver. It would be beneficial to combine channel estimation and CFO compensation into an integrated algorithm, since the performance of either of the two individual algorithms can be degraded when its assumption does not hold in real-world OFDM systems. There is already some research work in this area [34, 35], but more intensive investigation is still needed.

We still have to consider the data rate loss caused by the pilot-tone overhead within each OFDM block. We are currently working on this issue with the goal of using a sequence of pilot tones with length less than the channel length by exploiting its diversity in the time domain.

In the second part of this dissertation, wireless location on a WiMax network is studied. Similar to the location technology applied to cellular networks, the application scenario of locating the mobile user by using some signal parameters received at the antenna towers is considered. Location estimation methods based on TDOA, AOA and a combination of TDOA and AOA are presented, respectively. With the assumption that the measurement noise is zero-mean additive Gaussian noise with very small variance, the location estimation problem is formulated into a quasi-linear form. Then the simple LS algorithm can be used to solve the estimation problem, provided that the noise term in the quasi-linear form is Gaussian. In theory, the ML algorithm can be directly utilized to estimate the target location since the probability
density function of the observation data is known under our assumption. However, direct use of the ML algorithm proves infeasible because of the difficulty of finding the real roots of a quadratic equation. An alternative to the ML algorithm is required, which should drastically reduce the complexity of the ML algorithm and provide a close performance. Our proposed method is such an alternative: it is essentially a constrained LS-type optimization technique. The approximation of the noise term in the quasi-linear form by a Gaussian random variable is also proved in this thesis under the assumption above. Hence it is concluded that the proposed method can estimate the target location very accurately, provided that the size of the observation data is large enough and the equivalent SNR is high. To solve the constrained LS-type optimization problem, the Lagrange multiplier method is used. This is because the direct use of the constraint condition may lead to the same level of complexity for the algorithm, and positive real roots may not even exist for the quadratic equation obtained by substituting the intermediate LS solution into the constraint. Finally, the extensive simulation studies have demonstrated the effectiveness of our proposed algorithm.

For future work on the wireless location problem, the following aspects are open for research.

• Large variance: The approximation of the ML algorithm by the constrained LS-type optimization depends on the assumption that the measurement noise variance is very small, which is usually true. Further research on the case of
measurement noise with relatively large variance will improve the robustness of the proposed algorithm.

• Velocity estimation: In the thesis, the target is considered stationary by assuming that it is moving at a low speed. If the FDOA (frequency difference of arrival) of the received signal is available, then the velocity of the target can be estimated too. This will extend the range of applications of the proposed algorithm.
Bibliography

[1] Richard Van Nee and Ramjee Prasad, OFDM for Wireless Multimedia Communications, Artech House Publishers, Norwood, MA, 2000.
[2] R. W. Chang, “Synthesis of band-limited orthogonal signals for multichannel data,” BSTJ, pp. 1775-1797, Dec. 1966.
[3] B. R. Saltzberg, “Performance of an efficient parallel data transmission system,” IEEE Trans. on Comm. Tech., pp. 805-811, Dec. 1967.
[4] S. B. Weinstein and P. M. Ebert, “Data transmission by frequency-division multiplexing using the discrete Fourier transform,” IEEE Trans. on Commun., COM-19(5), pp. 628-634, Oct. 1971.
[5] L. J. Cimini, Jr., “Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing,” IEEE Trans. on Communications, vol. 33, pp. 665-675, July 1985.
[6] A. Peled and A. Ruiz, “Frequency domain data transmission using reduced computational complexity algorithms,” in Proc. IEEE ICASSP, pp. 964-967, Denver, CO, 1980.
[7] A. Vahlin and N. Holte, “Optimal finite duration pulses for OFDM,” IEEE Trans. Commun., 44(1), pp. 10-14, Jan. 1996.
[8] B. Le Floch, M. Alard and C. Berrou, “Coded orthogonal frequency-division multiplexing,” Proc. IEEE, 83(6), pp. 982-996, Jun. 1995.
[9] T. Pollet, M. Van Bladel and M. Moeneclaey, “BER sensitivity of OFDM systems to carrier frequency offset and Wiener phase noise,” IEEE Trans. on Comm., vol. 43, no. 2/3/4, pp. 191-193, Feb.-Apr. 1995.
[10] P. H. Moose, “A technique for orthogonal frequency division multiplexing frequency offset correction,” IEEE Trans. on Comm., vol. 42, no. 10, pp. 2908-2914, Oct. 1994.
[11] T. M. Schmidl and D. C. Cox, “Robust frequency and timing synchronization for OFDM,” IEEE Trans. on Comm., vol. 45, no. 12, pp. 1613-1621, Dec. 1997.
[12] R. D. J. Van Nee, “OFDM codes for peak-to-average power reduction and error correction,” IEEE Global Telecommunications Conference, London, pp. 740-744, Nov. 1996.
[13] J. A. Davis and J. Jedwab, “Peak-to-average power control in OFDM, Golay complementary sequences and Reed-Muller codes,” HP Laboratories Technical Report, HPL-97-158, Dec. 1997.
[14] A. Tarighat and A. H. Sayed, “MIMO OFDM receivers for systems with IQ imbalances,” IEEE Transactions on Signal Processing, vol. 53, no. 9, pp. 3583-3596, Sep. 2005.
[15] A. Tarighat, R. Bagheri, and A. H. Sayed, “Compensation schemes and performance analysis of IQ imbalances in OFDM receivers,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 3257-3268, Aug. 2005.
[16] S. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE J. Select. Areas Communication, vol. 16, pp. 1451-1458, Oct. 1998.
[17] G. J. Foschini, “Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas,” Bell Labs. Tech. J., pp. 41-59, Autumn 1996.
[18] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45, pp. 1456-1467, July 1999.
[19] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communications: Performance criterion and code construction,” IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, March 1998.
[20] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol. 45, pp. 139-157, Jan. 1999.
[21] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Pers. Commun., vol. 6, no. 3, pp. 311-335, Mar. 1998.
[22] E. Telatar, “Capacity of multi-antenna Gaussian channels,” Euro. Trans. Telecommun., vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.
[23] A. Wittneben, “A new bandwidth efficient transmit antenna modulation diversity scheme for linear digital modulation,” Proc. ICC, pp. 1630-1634, 1993.
[24] Jan Mietzner and Peter A. Hoeher, “Boosting the performance of wireless communication systems: theory and practice of multiple-antenna techniques,” IEEE Communications Magazine, no. 10, pp. 40-47, Oct. 2004.
[25] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading,” IEEE Trans. Inform. Theory, vol. 45, no. 1, pp. 139-157, 1999.
[26] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold: A geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans. Inform. Theory, vol. 48, no. 2, pp. 359-383, Feb. 2002.
[27] I. Barhumi, G. Leus and M. Moonen, “Optimal training design for MIMO OFDM systems in mobile wireless channels,” IEEE Trans. Signal Processing, vol. 51, no. 6, pp. 1615-1624, Jun. 2003.
[28] Allert van Zelst and Tim C. W. Schenk, “Implementation of a MIMO OFDM-based Wireless LAN system,” IEEE Trans. Signal Processing, vol. 52, no. 2, pp. 483-494, Feb. 2004.
[29] X. Li, H. Huang, G. J. Foschini and R. A. Valenzuela, “Effects of iterative detection and decoding on the performance of BLAST,” IEEE Proc. Global Telecommun. Conf., vol. 2, no. 2, pp. 1061-1066, 2000.
[30] A. Salvekar, S. Sandhu, Q. Li, M. Vuong and X. Qian, “Multiple-Antenna Technology in WiMax Systems,” Intel Technology Journal, vol. 8, no. 3, [online]: http://www.intel.com/technology/itj/2004/volume08issue03, Aug. 2004.
[31] Hongwei Yang, “A road to future broadband wireless access: MIMO-OFDM-Based air interface,” IEEE Communications Magazine, vol. 43, no. 1, pp. 53-60, Jan. 2005.
[32] H. Bölcskei, M. Borgmann and A. J. Paulraj, “Impact of the propagation environments on the performance of space-frequency coded MIMO-OFDM,” IEEE J. Select. Areas Commun., vol. 21, no. 3, pp. 427-439, Apr. 2003.
[33] H. Bölcskei and A. J. Paulraj, “Space-frequency coded broadband OFDM systems,” Proc. IEEE WCNC, pp. 1-6, Chicago, IL, Sep. 2000.
[34] X. Ma, H. Kobayashi and S. C. Schwartz, “Joint frequency offset and channel estimation for OFDM,” Proc. of Global Telecommun. Conf., pp. 15-19, Dec. 2003.
[35] P. Stoica and O. Besson, “Training sequence design for frequency offset and frequency-selective channel estimation,” IEEE Trans. on Commun., vol. 51, no. 11, pp. 1910-1917, Nov. 2003.
[36] Nima Khajehnouri and Ali H. Sayed, “Adaptive angle of arrival estimation for multiuser wireless location systems,” Fifth IEEE Workshop on Signal Processing Advances in Wireless Communications, Lisboa, Portugal, July 11-14, 2004.
[37] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 1: High-speed Physical Layer in the 5 GHz Band, IEEE Standard 802.11a-1999.
[38] M. Brookes, “Matrix Reference Manual [online],” available: http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/.
[39] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 1: High-speed Physical Layer in the 5 GHz Band, IEEE Standard 802.11a-1999.
[40] Part 16: Air Interface for Fixed Broadband Wireless Access Systems, Amendment 2: Medium Access Control Modifications and Additional Physical Layer Specifications for 2-11 GHz, IEEE Standard 802.16a-2003.
[41] Digital broadcasting systems for television, sound and data services. European Telecommunications Standard, prETS 300 744 (Draft, version 0.0.3), Apr. 1996.
[42] H. Sampath, S. Talwar, J. Tellado, V. Erceg and A. Paulraj, “A fourth-generation MIMO-OFDM broadband wireless system: design, performance and field trial results,” IEEE Communications Magazine, no. 9, pp. 143-149, Sep. 2002.
[43] Justin Chuang and Nelson Sollenberger, “Beyond 3G: Wideband wireless data access based on OFDM and dynamic packet assignment,” IEEE Communications Magazine, no. 7, pp. 78-87, Jul. 2000.
[44] Z. Liu, G. Giannakis, S. Barbarosa, and A. Scaglione, “Transmit-antennae space-time block coding for generalized OFDM in the presence of unknown multipath,” IEEE J. Select. Areas Communication, vol. 19, no. 7, pp. 1352-1364, Jul. 2001.
[45] S. Yatawatta and A. P. Petropulu, “Blind channel estimation in MIMO OFDM systems,” IEEE Trans. Signal Processing, submitted, http://www.ece.drexel.edu/CSPL/publications/ssp03sa-rod.pdf
[46] H. Bölcskei, R. W. Heath Jr. and A. Paulraj, “Blind channel identification and equalization in OFDM-based multiantenna systems,” IEEE Trans. Signal Processing, vol. 50, no. 1, pp. 96-109, Jan. 2002.
[47] Y. Li, N. Seshadri and S. Ariyavisitakul, “Channel estimation for OFDM systems with transmitter diversity in mobile wireless channels,” IEEE J. Select. Areas Communication, vol. 17, pp. 461-471, March 1999.
[48] Y. Li, “Simplified channel estimation for OFDM systems with multiple transmit antennas,” IEEE Trans. Wireless Communications, vol. 1, no. 1, pp. 67-75, Jan. 2002.
[49] R. Negi and J. Cioffi, “Pilot tone selection for channel estimation in a mobile OFDM system,” IEEE Trans. Consumer Electronics, vol. 44, no. 3, pp. 1122-1128, August 1998.
[50] G. L. Stüber, J. R. Barry, S. W. McLaughlin, Y. Li, M. A. Ingram and T. G. Pratt, “Broadband MIMO-OFDM wireless communications,” Proceedings of the IEEE, vol. 92, no. 2, pp. 271-294, Feb. 2004.
[51] W. C. Jakes, Microwave Mobile Communications, John Wiley and Sons, New York, 1974.
[52] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” in Proc. RADC Spectral Estimation Workshop, Rome, NY, pp. 243-258.
[53] A. H. Sayed, A. Tarighat, and N. Khajehnouri, “Network-based wireless location,” IEEE Signal Processing Magazine, vol. 22, no. 4, pp. 24-40, July 2005.
[54] K. C. Ho and Wenwei Xu, “An accurate algebraic solution for moving source location using TDOA and FDOA measurements,” IEEE Trans. Signal Processing, vol. 52, no. 9, pp. 2453-2463, Sep. 2004.
[55] “Wireless location technologies and service [online],” available: http://www.3gamericas.org/English/
[56] PELORUS Group. Report on wireless location-based markets. Technical Report, 2001.
[57] In-Stat/MDR. Location-based services: Finding their place in the market. Technical Report, Feb. 2003.
[58] A. H. Sayed and N. R. Yousef, Wireless location. Wiley Encyclopedia of Telecommunications, J. Proakis, editor, John Wiley & Sons, NY, 2003.
[59] FCC Docket No. 94-102. Revision of the commission's rules to ensure compatibility with enhanced 911 emergency calling systems. Technical Report RM-8143, July 1996.
[60] State of New Jersey. Report on the New Jersey wireless enhanced 911 terms: The first 100 days. Technical Report, Jun. 1997.
[61] M. Yunos, J. Zeyu Gao and S. Shim, Wireless advertising's challenges and opportunities. IEEE Computer Magazine, vol. 36, no. 5, pp. 30-37, May 2003.
  • 139. 126 [62] Telecommunications Industry Association. The CDMA2000 ITU-R RTT Candi- date Submission V0.18, Jul. 1998. [63] J. J. Caffery and G. L. Stuber, “Overview of radiolocation in CDMA cellular systems,” IEEE Communications Magazine, vol. 36, No. 4, pp. 38-45, Apr. 98. [64] H. Krim and M. Viberg, “Two decades of array signal processing research: Te parametric approach,” IEEE Signal Processing Magazine, vol. 13, No. 4, pp. 67-94, Jul. 1996. [65] T. Ojanpera and R. Rrasad, Wideband CDMA for third generation mobile com- munications. Arech House, Boston, MA 1998. [66] R. Rrasad, W. Mohr and W. Konhauser, Third generation mobile communica- tions. Arech House, Boston, MA 2000. [67] P. Bahl and V. N. Padmanabhan, “Radar: an in-building RF-based user location and tracking system,” Proc. IEEE Conference INFOCOMM, Vol. 2, pp. 775-784, Tel Aviv, March 2000. [68] T. Ross, P. Myllymaki and H. Tirri, “A statistical modeling approach to location estimation,” IEEE Trans. On Mobile Computing, Vol. 1, No. 1, pp. 59-69, Jan. 2002. [69] M. Youssef, A. Agrawala and A. U. Shankar, “WLAN location determination via clustering and probability distributions,” Proc. IEEE Conference PerCom, pp. 143-150, March 2003. [70] G. H. Golub and C. F. Van Loan, “Matrix Computations”, 2nd Edition, Balti- more: The Johns Hopkins University Press, 1989. [71] John G. Proakis, “Digital Communications”, 4th Edition, Prentice Hall, New Jersey, 2000 [72] Jerry M. Mendel, “Lessons in estimation theory for signal processing, commu- nications and control,” 2nd Edition, Prentice Hall PTR, Englewood Cliffs, New Jersey, March 1995. [73] Athanasios Papoulis and S. Unnikrishna Pillai, “Probability , Random Variables and Stochastic Processes,” 4h Edition, McGraw-Hill, Dec. 2001. [74] P. Stoica, and R. Moses, “Introduction to Spectral Analysis.” Upper Saddle River, NJ: Prentice Hall, 1997.
  • 140. Vita
    Zhongshan Wu was born in Anhui, China, on December 4, 1974. He received his bachelor of science degree in electrical engineering from Northeastern University, China, in July 1996. In spring 2000, he entered the graduate program in the Department of Electrical and Computer Engineering at Louisiana State University, where he received his master of science degree in electrical engineering in December 2001. He is currently a candidate for the degree of doctor of philosophy in electrical engineering.