Robust Sound Field Reproduction against
Listener’s Movement Utilizing Image Sensor
Toshihide Aketo, Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
Outline

Research background
Conventional method
Spectral Division Method
Local sound field synthesis

Proposed method
Equiangular filter
Sound field reproduction system utilizing image sensor

Simulation experiment
Subjective assessment
on directional perception
on sound quality
Research background (1/3)
Objective of a sound field reproduction (SFR) system
To reproduce the primary sound field in another space over a wide area and
with high accuracy.
However, such a system is difficult to realize because it becomes large
and its configuration becomes complex.
Therefore, recent research has focused on reproducing the sound field over a wide
area and with high accuracy using a small and simple system.
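[Figure: SFR systems arranged along an axis from "Complex" to "Simple".
Array configurations: surrounded (large and complex), circular or spherical (a little complex), linear or planar (simple).
Methods shown: boundary surface control (BoSC), Ambisonics, stereo or surround system, wave field synthesis (WFS).
One part of the diagram is labeled "Focused".]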
Research background (2/3)
Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
One of the SFR methods; it reproduces the sound field by synthesizing a
number of wavefronts.
This method can be realized with a simple system such as a linear loudspeaker
array.

However, SDM has two problems.
Problem 1: A sound pressure error occurs when the listener is off the
reference listening line.
Problem 2: The wavefront is disturbed by spatial aliasing.

[Figure: SDM has a wide reproduction region but low reproduction accuracy; the target is high reproduction accuracy.]

We aim to reproduce the sound field with high
accuracy by solving these problems in SDM.
Research background (3/3)
To cope with these problems, we propose a novel SFR system with a
linear loudspeaker array that combines listener position
estimation by Kinect with SDM using local sound field synthesis.

[Figure: adding an image sensor (Kinect) and local sound field synthesis changes the system from low reproduction accuracy over a wide reproduction region to high reproduction accuracy in a reproduction region localized around the listener.]
Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
[Figure: geometry showing the primary source, the nth secondary source, and the reference listening line. The driving function in the spatial domain and the driving function in the wavenumber domain are linked by the Fourier transform and the IDFT.]

ω: angular frequency
k_x: wavenumber in the x-direction
c: speed of sound
j: imaginary unit
y_ref: reference listening distance
K_0: zeroth-order modified Bessel function of the second kind
H_0^(2): zeroth-order Hankel function of the second kind
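
For reference, the following is a sketch of the usual SDM formulation, taken as an assumption from the cited SDM literature rather than transcribed from this slide. The names $\tilde{D}$ (driving function), $\tilde{P}$ (spectrum of the desired field), and $\tilde{G}$ (spectrum of one monopole secondary source), the placement of the array on the $x$-axis with $y > 0$ on the listener side, and the sign convention of the transform are all assumptions.

```latex
\begin{align}
\tilde{D}(k_x,\omega) &=
  \frac{\tilde{P}(k_x, y_{\mathrm{ref}}, \omega)}
       {\tilde{G}(k_x, y_{\mathrm{ref}}, \omega)}, \\
\tilde{G}(k_x, y, \omega) &=
\begin{cases}
  -\dfrac{j}{4}\, H_0^{(2)}\!\left(\sqrt{(\omega/c)^2 - k_x^2}\; y\right),
    & |k_x| < |\omega/c|, \\[2ex]
  \dfrac{1}{2\pi}\, K_0\!\left(\sqrt{k_x^2 - (\omega/c)^2}\; y\right),
    & |k_x| > |\omega/c|,
\end{cases} \\
D(x,\omega) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}
  \tilde{D}(k_x,\omega)\, e^{-j k_x x}\, \mathrm{d}k_x .
\end{align}
```

The division is carried out at a single distance $y_{\mathrm{ref}}$, which is why the reference listening distance appears in the driving function and why the reproduction is exact only on that line.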
Spectral Division Method (SDM) [J. Ahrens, S. Spors, 2008]
Problems in SDM
A sound pressure error occurs when the listener is off the reference listening line.
The wavefront is disturbed by spatial aliasing.
Problem 1: sound pressure error
Under the 2.5-dimensional synthesis condition, the sound pressure is
reproduced correctly only on the reference listening line.
[Figure: primary sound field and reproduced sound field. Sound pressure is correctly reproduced on the reference listening line; a sound pressure error occurs outside the reference listening line.]

Therefore, to reproduce the sound field correctly at the listener's position,
we must set the reference listening distance equal to the listener's distance.
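
The size of this error can be illustrated with a minimal sketch. It assumes the simpler case of a virtual plane wave (the slides use a monopole primary source, so the numbers there differ) and the known far-field behavior of 2.5-dimensional synthesis, in which the synthesized amplitude decays as sqrt(y_ref / y) while the desired plane wave keeps a constant amplitude. The function name and the 1.0 m reference distance are hypothetical.

```python
import math


def amplitude_error_db(y: float, y_ref: float) -> float:
    """Level error in dB of a 2.5D-synthesized plane wave at distance y.

    Assumption: the synthesized amplitude decays as sqrt(y_ref / y),
    and is exact on the reference listening line y = y_ref.
    """
    if y <= 0 or y_ref <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(math.sqrt(y_ref / y))


y_ref = 1.0  # hypothetical reference listening distance in metres
for y in (0.5, 1.0, 2.0):
    print(f"y = {y:.1f} m: error = {amplitude_error_db(y, y_ref):+.2f} dB")
```

Under this assumption the error is zero on the reference line and grows by about 3 dB for each halving or doubling of the listener's distance, which is why the reference distance should track the listener.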
Problem 2: spatial aliasing (1/2)

[Figure: magnitude (dB) of the driving-function spectrum before and after discretization of the secondary source. After discretization, spectral overlap occurs.]

In SDM, discretizing the secondary source causes spectral overlap of the
driving function, and the filter power at high frequencies becomes larger,
as in the right figure.
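
As a rough sketch of where the overlap starts, the commonly used worst-case estimate for a uniformly spaced linear array is f_al = c / (2 Δx). Sampling at spacing Δx repeats the spectrum every 2π / Δx in k_x. The spacing of 0.10 m and the speed of sound of 343 m/s below are hypothetical values chosen for illustration; the slides do not state the spacing.

```python
import math


def spatial_aliasing_frequency(spacing_m: float, speed_of_sound: float = 343.0) -> float:
    """Worst-case spatial aliasing frequency c / (2 * dx) in Hz."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return speed_of_sound / (2.0 * spacing_m)


def spectral_repetition_period(spacing_m: float) -> float:
    """Period (rad/m) with which the sampled spectrum repeats along k_x."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return 2.0 * math.pi / spacing_m


spacing = 0.10  # hypothetical loudspeaker spacing in metres
print(spatial_aliasing_frequency(spacing))
print(spectral_repetition_period(spacing))
```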
Problem 2: spatial aliasing (2/2)
The spectral overlap in the wavenumber domain appears as spatial aliasing
in the spatial domain.

[Figure: synthesized wavefront (amplitude) for a continuous array and for a discrete array. After discretization of the secondary source, the wavefront is disturbed.]
Local sound field synthesis (1/2) [J. Ahrens, S. Spors, 2011]
Local sound field synthesis: a method that suppresses spatial aliasing by
limiting the spatial bandwidth in the wavenumber domain.

Rectangular window for the spectrum of the driving function
By applying a rectangular window to the spectrum in the left figure, we
can suppress the spectral overlap, as in the right figure.

[Figure: magnitude (dB) of the driving-function spectrum. Left: spectral overlap occurs. Right: spectral overlap is suppressed.]
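
The band limitation can be sketched as a rectangular window along k_x. This is a minimal illustration, not the authors' implementation: the function names, the sample spectrum, and the 0.10 m spacing are hypothetical. It relies only on the fact that a pass band no wider than 2π / Δx keeps the spectral repetitions caused by sampling from overlapping each other.

```python
import math


def rectangular_window(k_x: float, k_center: float, bandwidth: float) -> float:
    """Return 1.0 inside the pass band centred on k_center, else 0.0."""
    return 1.0 if abs(k_x - k_center) <= bandwidth / 2.0 else 0.0


def band_limit(spectrum, k_axis, k_center, bandwidth):
    """Apply the rectangular window to a sampled driving-function spectrum."""
    return [value * rectangular_window(k, k_center, bandwidth)
            for value, k in zip(spectrum, k_axis)]


spacing_m = 0.10  # hypothetical loudspeaker spacing in metres
repetition_period = 2.0 * math.pi / spacing_m  # about 62.8 rad/m

k_axis = [-60.0, -30.0, 0.0, 30.0, 60.0]   # sample wavenumbers in rad/m
spectrum = [1.0 + 0.0j] * len(k_axis)      # placeholder spectrum values
limited = band_limit(spectrum, k_axis, k_center=0.0, bandwidth=repetition_period)
print(limited)
```

Moving `k_center` away from zero shifts the pass band and with it the region in which the field is reproduced, which is the degree of freedom the equiangular filter later exploits.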
Local sound field synthesis (2/2) [J. Ahrens, S. Spors, 2011]
By applying a rectangular window, we can suppress the disturbance of the
wavefront and raise the maximum frequency up to which the sound field
is reproduced correctly.
[Figure: synthesized wavefront (amplitude), unfiltered and filtered. Unfiltered: spatial aliasing occurs. Filtered: the disturbance of the wavefront is suppressed, but the reproduction area is localized.]

Therefore, to take advantage of this method, it is necessary to design a
filter that precisely controls the reproduced direction.
Equiangular filter
In order to design a filter that accurately controls the reproduced direction,
we derive the relation between the reproduced direction θ, the
wavenumber in the x-direction k_x, and the frequency f.

[Equation: relation between k_x, θ, and f. The factor that depends on θ and c is marked "constant", and k_x and f are marked "proportional".]

k_x: wavenumber in the x-direction
c: speed of sound
θ: reproduced direction
f: frequency

If the reproduced direction θ is constant, k_x is proportional to f,
so we design a new filter as follows.

[Equation: definition of the equiangular filter.]

Here ω is the angular frequency, k is the wavenumber, and Δθ is the
angular width of the equiangular filter.
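
The idea can be sketched in code under an assumed angle convention, since the slide's equations are not available as text. The sketch assumes k_x = (2πf / c) sin θ with θ measured from the array normal, and a pass band covering the directions θ0 ± Δθ/2. The function names, the default speed of sound, and the example angles are hypothetical.

```python
import math


def kx_for_direction(freq_hz: float, theta_rad: float, c: float = 343.0) -> float:
    """Wavenumber along the array for a wave travelling in direction theta.

    Assumption: k_x = (2*pi*f / c) * sin(theta), theta from the array normal.
    """
    return 2.0 * math.pi * freq_hz / c * math.sin(theta_rad)


def equiangular_filter(k_x: float, freq_hz: float, theta0_rad: float,
                       width_rad: float, c: float = 343.0) -> float:
    """Return 1.0 if k_x lies in the wedge theta0 +/- width/2, else 0.0."""
    half_pi = math.pi / 2.0
    # Clamp the wedge edges so that sin() stays monotonic.
    lo = max(theta0_rad - width_rad / 2.0, -half_pi)
    hi = min(theta0_rad + width_rad / 2.0, half_pi)
    k_lo = kx_for_direction(freq_hz, lo, c)
    k_hi = kx_for_direction(freq_hz, hi, c)
    return 1.0 if k_lo <= k_x <= k_hi else 0.0


theta0 = math.radians(0.0)   # hypothetical reproduced direction
width = math.radians(20.0)   # hypothetical angular width
for f in (1000.0, 2000.0):
    edge = kx_for_direction(f, theta0 + width / 2.0)
    print(f"f = {f:.0f} Hz: pass band edge at k_x = {edge:.2f} rad/m")
```

Because both edges scale linearly with f, the pass band widens with frequency and forms a wedge in the (k_x, f) plane, unlike a rectangular window of fixed width.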
Result of applying the equiangular filter (1/2)
An example of applying the designed filter to a spectrum.

[Figure: magnitude (dB) of the driving-function spectrum without the filter (spectral overlap occurs) and with the equiangular filter (spectral overlap is suppressed), for one setting of direction and angular width whose values are not preserved.]

Equiangular filter for the spectrum of the driving function
The equiangular filter used in this presentation is combined with a low-pass
filter: frequencies above the maximum frequency are cut, and the sound
field is not reproduced at those frequencies.
Result of applying the equiangular filter (2/2)
By applying the equiangular filter, we can suppress the disturbance
of the wavefront and reproduce the sound field in a specific
direction.
[Figure: synthesized wavefront (amplitude), unfiltered and filtered. Unfiltered: spatial aliasing occurs. Filtered: the disturbance of the wavefront is suppressed.]

However, it is impossible to match the sweet spot to the listener's
position if the listener's direction is not known in advance.
Summary of problems
Problems in SDM
A sound pressure error occurs when the reference listening distance
does not match the listener's distance.
Spatial aliasing is caused by discretization of the secondary sources.
The second problem can be solved by applying an equiangular filter.

Problems in the equiangular filter
It is impossible to match the sweet spot to the listener's position if
the listener's direction is not known in advance.

These problems can be solved if the listener's position is known;
therefore, introducing an image sensor makes it possible to solve them.
Condition of simulation experiment
[Figure: simulation setup with a primary source (monopole source), a 34 ch linear secondary source array (monopole sources), and the reference listening line.]

Parameter name: parameter value
measurement plane: W4.0 D4.0
aliasing frequency: approximately 2019 Hz
synthesis frequency: 3, 5 kHz
angular width, reproduced direction: values not preserved

Evaluation score
[Equation: evaluation score defined from the radiation characteristic of the primary sound field and the radiation characteristic of the secondary sound field.]

Assuming that the listener's position is obtained by the image sensor, we calculate
the reproduced direction from the sound source position and the listener's position.
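
A minimal sketch of this geometric step follows. The coordinate convention is an assumption: the array lies on the x-axis, the primary source is behind it (y < 0), the listener is in front of it (y > 0), and the direction is measured from the array normal. The function names and the example coordinates are hypothetical.

```python
import math


def reproduced_direction(source_xy, listener_xy) -> float:
    """Angle in radians of the source-to-listener line, from the array normal."""
    dx = listener_xy[0] - source_xy[0]
    dy = listener_xy[1] - source_xy[1]
    return math.atan2(dx, dy)


def reference_listening_distance(listener_xy) -> float:
    """Distance of the listener from the array line y = 0."""
    return abs(listener_xy[1])


source = (0.0, -1.0)    # hypothetical primary source position in metres
listener = (2.0, 1.0)   # hypothetical listener position from the image sensor
print(math.degrees(reproduced_direction(source, listener)))
print(reference_listening_distance(listener))
```

Both outputs feed the synthesis: the angle selects the pass band of the equiangular filter, and the distance sets the reference listening line.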
Results of simulation experiment
[Figure: synthesized wavefront (amplitude) and evaluated value (dB) at 3 kHz and at 5 kHz. Markers show the listener and the primary source.]
The sound field is correctly reproduced
in the listener's direction regardless of the frequency.
Condition of subjective assessment on directional perception
[Figure: experimental setup with the primary source, the 34 ch linear loudspeaker array, an acoustically transparent curtain, answer number cards, the loudspeaker distance, the reference listening line, and listening positions Pos 1, Pos 2, and Pos 3.]

parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds
aliasing frequency: approximately 2019 Hz
number of evaluators: 7
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source band-limited to frequencies under 2 kHz (Conventional2)
・sound source to which the equiangular filter is applied (Proposed)
angular width, sound source direction: values not preserved

Evaluation score
[Equation: evaluation score defined from the number of evaluators, the answered direction, and the true source direction.]

As the evaluation procedure, we asked evaluators to answer at which card
position they perceived the sound source.
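
The exact form of the evaluation score is not available as text. One plausible form, shown here purely as an assumption, is the mean absolute difference between the answered directions and the true source direction over all evaluators. The function name and the sample answers are hypothetical and are not data from the experiment.

```python
def mean_direction_error(answered_deg, true_deg: float) -> float:
    """Mean absolute directional error in degrees over all evaluators.

    Assumption: the score averages |answered - true|; the slide's actual
    definition may differ (for example, a squared error).
    """
    if not answered_deg:
        raise ValueError("at least one answer is required")
    return sum(abs(a - true_deg) for a in answered_deg) / len(answered_deg)


# Hypothetical answers from three evaluators for a source at 20 degrees.
print(mean_direction_error([10.0, 20.0, 30.0], 20.0))
```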
Results of subjective assessment on directional perception
Conventional1 (without bandwidth limitation)
Conventional2 (band-limited to frequencies under 2 kHz)
Proposed (with the equiangular filter applied)

[Figure: directional perception results on an axis from "Good" to "Bad": (a) in Pos1, (b) in Pos2, (c) in Pos3.]

Proposed is superior to Conventional1 and Conventional2 in Pos1 and Pos2.
However, Proposed is almost the same as Conventional2 in Pos3.
This is because, with the equiangular filter, the maximum frequency becomes lower
as the angle of the reproduced direction becomes larger.

As the listener moves to the right (from Pos1 to Pos3), the directional perception error of
Conventional1 becomes larger owing to spatial aliasing.
These results show the superiority of the proposed method in directional perception.
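
Why the maximum frequency drops with angle can be illustrated with a hedged sketch. It assumes k_x = (2πf / c) sin θ with θ measured from the array normal, and it assumes that the pass band θ0 ± Δθ/2 must stay inside the baseband |k_x| ≤ π / Δx. The slides do not state this criterion, so the formula f_max = c / (2 Δx sin(|θ0| + Δθ/2)) is an illustration rather than the authors' definition. The spacing and angles are hypothetical.

```python
import math


def max_frequency(theta0_rad: float, width_rad: float,
                  spacing_m: float, c: float = 343.0) -> float:
    """Highest frequency whose pass band stays inside |k_x| <= pi / spacing."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    # Outer edge of the angular wedge, clamped to grazing incidence.
    edge = min(abs(theta0_rad) + width_rad / 2.0, math.pi / 2.0)
    if edge == 0.0:
        return math.inf
    return c / (2.0 * spacing_m * math.sin(edge))


spacing = 0.10               # hypothetical loudspeaker spacing in metres
width = math.radians(20.0)   # hypothetical angular width
for deg in (0.0, 20.0, 40.0):
    f_max = max_frequency(math.radians(deg), width, spacing)
    print(f"direction {deg:.0f} deg: f_max = {f_max:.0f} Hz")
```

Under these assumptions the limit falls toward the ordinary aliasing frequency c / (2 Δx) as the direction approaches the array axis, which matches the trend reported for Pos3.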
Condition of subjective assessment on sound quality

[Figure: experimental setup with the primary source, the reference loudspeaker, the 34 ch linear loudspeaker array, an acoustically transparent curtain, the loudspeaker distance, the reference listening line, and listening positions Pos 1, Pos 2, and Pos 3.]

parameter name: parameter value
sampling frequency: 48 kHz
quantization bit rate: 16 bit
test sound: white Gaussian noise, 3 seconds
aliasing frequency: approximately 2019 Hz
number of evaluators: 7
type of sound source:
・sound source without bandwidth limitation (Conventional1)
・sound source band-limited to frequencies under 2 kHz (Conventional2)
・sound source to which the equiangular filter is applied (Proposed)
angular width, sound source direction: values not preserved

As the evaluation procedure, we played two synthesized sounds after a reference
sound radiated by the reference loudspeaker, and asked evaluators to answer which
synthesized sound they perceived as closer to the reference sound.
Results of subjective assessment on sound quality
Conventional1 (without bandwidth limitation)
Conventional2 (band-limited to frequencies under 2 kHz)
Proposed (with the equiangular filter applied)

[Figure: sound quality results on an axis from "Good" to "Bad": (a) in Pos1, (b) in Pos2, (c) in Pos3.]

In all results, evaluators chose Conventional1 or Proposed and did not
choose Conventional2.
At all listener positions, more evaluators chose Conventional1 than
Proposed.
This suggests that cutting the high-frequency region of the sound affects
sound quality more than spatial aliasing does.
Conclusion
The objective of an SFR system is to reproduce the primary sound field in
another space over as wide an area and with as high accuracy as possible.
Since it is difficult to reproduce the sound field with a complex system, an
SFR method using a simple system has been desired.

SDM can be realized with a simple system such as a linear loudspeaker array.
However, it cannot reproduce the sound field with high accuracy.

We proposed an SFR system that reproduces the sound field with high
accuracy at the listener's position by estimating the listener's direction.

The subjective assessment showed the superiority of the proposed
method in directional perception.
However, since no superiority was shown in sound quality, it is
necessary to improve the equiangular filter so that the low-pass filter is not needed.

Thank you for your attention!

Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor

  • 1. Robust Sound Field Reproduction against Listener’s Movement Utilizing Image Sensor Toshihide Aketo,Hiroshi Saruwatari,Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
  • 2. Outline Research background Conventional method Spectral Division Method Local sound field synthesis Proposed method Equiangular filter Sound field reproduction system utilizing image sensor Simulation experiment Subjective assessment on directional perception on sound quality
  • 3. Research background (1/3) Objective of sound field reproduction (SFR) system To reproduce the primary sound field to another space with wide range and high accuracy. However, it is difficult to realize such a system because the system size becomes larger and the system configuration becomes complex. Therefore, the recent research is focused on reproducing sound field with wide range and high accuracy using small and simple system. Surrounded (large and complex) Circular or spherical (a little complex) Linear or planer (simple) Boundary surface control (BoSC) Ambisonics Stereo or surround system Wave field synthesis (WFS) Focused Complex Simple
  • 4. Research background (2/3) Spectral Division Method (SDM) [J. Ahrens, S. Spors., 2008] One of the SFR methods that reproduces the sound field by synthesizing a number of wavefronts. This method can be realized with a simple system like linear loudspeaker array. However, SDM has two problems. Problem 1: A sound pressure error is occurred by mismatching the reference listening line. Problem 2: A disturbance of wavefront is occurred by a spatial aliasing. Reproduction accuracy: Low Reproduction region: Wide High We aim to reproduce the sound field with high accuracy by solving these problems in SDM.
  • 5. Research background (3/3) To cope with these problems, we propose the novel SFR system with linear loudspeaker array, which combines listener’s position estimation by Kinect and SDM with local sound field synthesis. Image sensor Kinect Local sound field synthesis Reproduction accuracy Low Reproduction region: Wide Reproduction accuracy: High Reproduction region: localized around listener
  • 6. Spectral Division Method (SDM) [J. Ahrens, S. Spors., 2008] Primary source Primary source nth secondary source nth secondary source Reference listening line Reference listening line Spatial domain IDFT Fourier transform Wavenumber domain The driving function in the wavenumber domain The driving function in the spatial domain : angular frequency : wavenumber in : speed of sound -direction : imaginary unit : reference listening distance : zero-th order modified Bessel function of the second kind : zero-th order Hankel function of the second kind
  • 7. Spectral Division Method (SDM) [J. Ahrens, S. Spors., 2008] Primary source Primary source nth secondary source nth secondary source Reference listening line Reference listening line Spatial domain IDFT Fourier transform Wavenumber domain The driving function in the wavenumber domain The driving function in the spatial domain :reference listening distance Problems in SDM A sound pressure error is occurred by mismatching the reference listening line. A disturbance of wavefront is occurred by a spatial aliasing.
  • 8. Problem 1 : sound pressure error A sound pressure is correctly reproduced only on the reference listening line under 2.5-dimensional synthesis condition. Sound pressure is correctly reproduced on the reference listening line. 2.0 2.0 1.0 1.0 0.0 0.0 -1.0 0.0 1.0 Primary sound field -1.0 0.0 Sound pressure error 1.0 occurs outside the reference listening line. Reproduced sound field Therefore, to correctly reproduce the sound field to listener's position, we must set the reference listening distance equal to listener's distance.
  • 9. Problem 2: spatial aliasing (1/2) 0 10 -24 0 -48 0 R参 加 -30 30 20 0 10 -24 0 -48 -30 0 30 Spectral overlap occurs Discretization of the secondary source Magnitude[dB] 20 Magnitude [dB] In SDM, a spectral overlap of the driving function is occurred by discretization of secondary source, and filter power at high frequency becomes larger like in the right figure.
  • 10. Problem 2: spatial aliasing (2/2) The effect of spectral overlap in the wavenumber domain appears as a spatial aliasing in the spatial domain. 1.5 0.00 0.0 -1.5 0.0 1.5 -0.10 3.0 Synthesized wavefront (discrete array) 0.10 1.5 0.00 0.0 -1.5 0.0 1.5 Amplitude 0.10 Amplitude 3.0 Synthesized wavefront (continuous array) -0.10 Disturbance of wavefront occurs Discretization of the secondary source
  • 11. Local sound field synthesis (1/2) [J. Ahrens, S. Spors., 2011] 0 10 -24 0 -48 -30 0 30 Spectral overlap occurs 20 0 10 -24 0 -48 -30 0 30 Spectral overlap is suppressed Rectangular window for the spectrum of the driving function By applying a rectangular window to a spectrum in the left figure, we enable to suppress a spectral overlap like in the right figure. Magnitude[dB] 20 Magnitude[dB] Local sound field synthesis: the method enables to suppress a spatial aliasing by limiting spatial bandwidth in the wavenumber domain.
  • 12. Local sound field synthesis (2/2) [J. Ahrens, S. Spors., 2011] By applying a rectangular window, we enable to suppresses a disturbance of wavefront and enable to increase the maximum frequency in which the sound field can be correctly reproduced. Synthesized wavefront (unfiltered) Synthesized wavefront (filtered) 0.0 -1.5 0.0 1.5 -0.10 Spatial aliasing occurs 1.5 0.00 0.0 -1.5 0.0 1.5 Amplitude 0.00 Amplitude 1.5 0.10 3.0 0.10 3.0 -0.10 Disturbance of wavefront is suppressed Reproduction area is localized Therefore, It is necessary to design a filter to precisely control the reproduced direction in order to take advantage of this method.
  • 13. Equiangular filter In order to design a filter to accurately control the reproduced direction, we derive the relation equation between reproduced direction , wavenumber in -direction and frequency . constant proportional : wavenumber in -direction : speed of sound :reproduced direction : frequency If reproduced direction is constant, since it is found that proportional to , we design a new filter as follows : angular frequency : angular width : wavenumber : equiangular filter is
  • 14. Result of applying the equiangular filter (1/2) An example when we applied a designed filter to a spectrum 0 10 -24 0 -48 -30 0 30 Spectral overlap occurs and the angular width is . 20 0 10 -24 0 -48 -30 0 30 Spectral overlap is suppressed Equiangular filter for the spectrum of the driving function Equiangular filter used in this presentation is cut by applying a low-pass filter with respect to the frequency that exceeds the maximum frequency , and we do not reproduce the sound field. Magnitude[dB] 20 is Magnitude[dB] This case that the angular
  • 15. Result of applying the equiangular filter (2/2) By applying the equiangular filter, we enable to suppress a disturbance of wavefront and enable to reproduce the sound field to the specific direction. Synthesized wavefront (unfiltered) Synthesized wavefront (filtered) 0.0 -1.5 0.0 1.5 -0.10 Spatial aliasing occurs 1.5 0.00 0.0 -1.5 0.0 1.5 Amplitude 0.00 Amplitude 1.5 0.10 3.0 0.10 3.0 -0.10 Disturbance of wavefront is suppressed However, there is a problem that it is impossible to match the sweet spot to the listener’s position if listener’s direction is unknown in advance.
  • 16. Summary of problems Problems in SDM A sound pressure error occurs in the case that the reference listening distance does not match listener's distance. A spatial aliasing is occurred by discretization of secondary sources. Second problem can be solved by applying an equiangular filter Problems in equiangular filter It is impossible to match the sweet spot to the listener’s position if listener’s direction is unknown in advance. These problems can be solved if we know the listener’s position, therefore, introduction of the image sensor enables to solve these problems.
  • 17. Condition of simulation experiment Primary source (monopole source) 34 ch linear secondary source array (monopole source) Parameter name measurement plane aliasing frequency Parameter value W4.0 D4.0 approximately 2019 Hz angular width reproduced direction Reference listening line synthesis frequency 3, 5 kHz Evaluation score : radiation characteristic of primary sound field : radiation characteristic of secondary sound field It is assumed that listener’s position is obtained by the image sensor, we calculate the reproduced direction from sound source position and listener's position.
  • 18. Results of simulation experiment [Figure: synthesized wavefronts (amplitude) and evaluated values (dB) at 3 kHz and 5 kHz, with the listener and primary source positions marked.]
  • 19. Results of simulation experiment [Figure: synthesized wavefronts (amplitude) and evaluated values (dB) at 3 kHz and 5 kHz, with the listener and primary source positions marked.] The sound field is correctly reproduced at listener’s direction regardless of the frequency.
  • 20. Conditions of the subjective assessment on directional perception [Figure: 34 ch linear loudspeaker array behind an acoustically transparent curtain, with the primary source, the answer number cards, and listener positions Pos 1, Pos 2, and Pos 3.] Parameters: sampling frequency: 48 kHz; quantization: 16 bit; test sound: white Gaussian noise, 3 seconds; aliasing frequency: approximately 2019 Hz; number of evaluators: 7; angular width, sound source direction, loudspeaker distance, and reference listening line: values not preserved in the transcript. Types of sound source: ・sound source without bandwidth limitation (Conventional1) ・sound source band-limited to frequencies below 2 kHz (Conventional2) ・sound source to which the equiangular filter is applied (Proposed). Evaluation score: defined in terms of the number of evaluators, the answered direction, and the true source direction (formula not preserved). As the evaluation procedure, we asked the evaluators to answer at which card position they perceived the sound source.
  • 21. Results of subjective assessment on directional perception [Figure: evaluation scores for Conventional1 (without bandwidth limitation), Conventional2 (band-limited to frequencies below 2 kHz), and Proposed (equiangular filter applied) in (a) Pos1, (b) Pos2, and (c) Pos3, on an axis from Bad to Good.] Proposed is superior to Conventional1 and Conventional2 in Pos1 and Pos2, but is almost the same as Conventional2 in Pos3. This is because, in the equiangular filter, the maximum frequency becomes lower as the angle of the reproduced direction becomes larger. As the listener moves to the right (from Pos1 to Pos3), the directional perception error of Conventional1 becomes larger owing to spatial aliasing. These results show the superiority of the proposed method in directional perception.
  • 22. Conditions of the subjective assessment on sound quality [Figure: 34 ch linear loudspeaker array behind an acoustically transparent curtain, with the primary source, the reference loudspeaker, and listener positions Pos 1, Pos 2, and Pos 3.] Parameters: sampling frequency: 48 kHz; quantization: 16 bit; test sound: white Gaussian noise, 3 seconds; aliasing frequency: approximately 2019 Hz; number of evaluators: 7; angular width, loudspeaker distance, sound source direction, and reference listening line: values not preserved in the transcript. Types of sound source: ・sound source without bandwidth limitation (Conventional1) ・sound source band-limited to frequencies below 2 kHz (Conventional2) ・sound source to which the equiangular filter is applied (Proposed). As the evaluation procedure, we played two synthesized sounds after the reference sound radiated by the reference loudspeaker, and asked the evaluators to answer which synthesized sound they perceived as closer to the reference sound.
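The paired-comparison procedure above yields one chosen method per trial. A minimal sketch of how such answers could be tallied into preference rates is given below. The slides do not state how the answers were aggregated, so the function name and the scoring are assumptions, and the answers in the usage example are placeholders, not results from the study.

```python
from collections import Counter


def preference_rates(choices, methods):
    """Fraction of trials in which each method was chosen as closer to the reference.

    Assumed scoring: a simple share of votes per method; methods never chosen get 0.0.
    """
    if not choices:
        raise ValueError("no answers to tally")
    counts = Counter(choices)
    unknown = set(counts) - set(methods)
    if unknown:
        raise ValueError(f"unknown method labels: {sorted(unknown)}")
    total = len(choices)
    return {method: counts[method] / total for method in methods}


if __name__ == "__main__":
    methods = ["method_a", "method_b", "method_c"]                # generic labels
    answers = ["method_a", "method_b", "method_a", "method_a"]    # placeholder answers
    print(preference_rates(answers, methods))
```

Tallying per listener position (Pos 1 to Pos 3) would simply mean calling the function once per position with that position's answers.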
  • 23. Results of subjective assessment on sound quality [Figure: results for Conventional1 (without bandwidth limitation), Conventional2 (band-limited to frequencies below 2 kHz), and Proposed (equiangular filter applied) in (a) Pos1, (b) Pos2, and (c) Pos3, on an axis from Good to Bad.] In all results, the evaluators chose Conventional1 or Proposed and did not choose Conventional2. At every listener position, more evaluators chose Conventional1 than Proposed. This suggests that cutting the high-frequency region of the sound affects sound quality more than spatial aliasing does.
  • 24. Conclusion The objective of an SFR system is to reproduce the primary sound field in another space over as wide a range and with as high an accuracy as possible. Because it is difficult to realize this with a complex system, an SFR method that uses a simple system has been desired. SDM can be realized with a simple system such as a linear loudspeaker array. However, it cannot reproduce the sound field with high accuracy. We proposed an SFR system that reproduces the sound field with high accuracy at the listener's position by estimating the listener's direction. The subjective assessment showed the superiority of the proposed method in directional perception. However, since its superiority in sound quality could not be shown, the equiangular filter needs to be improved so that the low-pass filter does not have to be applied. Thank you for your attention!