Think bayes

Frequentist VS Bayesian
Integration issue

ThinkBayes
Introduction to Bayesian inference

Jinseob Kim
Graduate School of Public Health, Seoul National University

January 14, 2014

Contents

1. Perspectives on probability
   Bayes' rule

2. Integration issue
   Why?
   Simulation

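Objective vs. subjective probability

The probability of rolling a 1 with a die
1. Objective: the probability exists as an exact number, and we estimate it.
2. Subjective: it cannot be known; all we can do is keep updating our belief.

Approaches to the probability of rolling a 1
1. Objective: after rolling the die many times and estimating, the probability seems to be 1/6.
2. Subjective: I think it is about 1/6, and as I keep rolling, 1/6 looks right.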
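
The two approaches can be sketched side by side on a simulated die. This is only an illustration: the Beta(1, 5) prior (mean 1/6), the helper name `update_belief`, and the batch size of 1000 rolls are assumptions made here, not part of the slides.

```python
import random

def update_belief(alpha, beta, rolls, face=1):
    """Update a Beta(alpha, beta) belief about P(face) with observed rolls."""
    hits = sum(1 for r in rolls if r == face)
    return alpha + hits, beta + len(rolls) - hits

rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(6000)]  # simulated fair die

# Objective view: estimate the fixed, unknown probability by relative frequency.
freq_estimate = rolls.count(1) / len(rolls)

# Subjective view: start from a belief centred on 1/6 and keep updating it.
alpha, beta = 1.0, 5.0  # assumed prior, mean = 1 / (1 + 5) = 1/6
for start in range(0, len(rolls), 1000):
    alpha, beta = update_belief(alpha, beta, rolls[start:start + 1000])
    print(f"after {start + 1000} rolls: belief mean = {alpha / (alpha + beta):.4f}")

posterior_mean = alpha / (alpha + beta)
print(f"relative frequency = {freq_estimate:.4f}")
```

With many rolls both numbers settle near 1/6; the difference lies in what they are taken to mean.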

Homo bayesianis

Figure: A fun example of a Bayesian

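How a frequentist argues

Opponent: The new drug and the existing drug don't seem to differ in their blood-pressure-lowering effect..
Me: What? The difference between the new drug and the existing drug is 0?? Fine, suppose the difference is 0. Then, blah blah.. data like this would almost never occur (probability under 5%). So you're wrong.
1. Nobody claimed the difference is exactly 0. This beats up a straw man.
2. It refutes the opponent by reading the claim as narrowly as possible.
3. It is sneaky.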
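
The argument above can be sketched as a two-sample z-test. Everything in this example is an assumption made for illustration: the data are synthetic (invented means, spread, and sample sizes), and the normal approximation stands in for whatever test a real analysis would use.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(1)
# Synthetic blood-pressure reductions in mmHg; made-up numbers, not trial data.
new_drug = [rng.gauss(13.0, 5.0) for _ in range(100)]
old_drug = [rng.gauss(8.0, 5.0) for _ in range(100)]

diff = mean(new_drug) - mean(old_drug)
se = (stdev(new_drug) ** 2 / len(new_drug)
      + stdev(old_drug) ** 2 / len(old_drug)) ** 0.5

# Suppose the difference is exactly 0, then ask how surprising the data are.
z = diff / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided, normal approximation

print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("reject 'difference = 0'")
else:
    print("cannot reject 'difference = 0'")
```

Note that the only claim being tested is "the difference is exactly 0", which is the narrow reading criticized above.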

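How a Bayesian argues

Opponent: The new drug and the existing drug don't seem to differ in their blood-pressure-lowering effect.. Wouldn't the difference follow N(0, 1)?
Me: Let's assume the difference follows N(0, 1). Under that assumption, I computed the conditional distribution of the difference given this data, and it follows N(5, 1.2).
1. Assume a distribution for the prior belief: Prior
2. Information provided by the data: Likelihood
3. Combine the belief with the information in the data: Posterior. This is what we interpret.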
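
The prior-to-posterior step can be sketched for the simplest case, a normal prior on the difference with normally distributed observations of known variance. The observed values, the noise variance of 4.0, and the function name `normal_posterior` are hypothetical; the example does not try to reproduce the N(5, 1.2) quoted above.

```python
def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Conjugate update for a normal mean when the noise variance is known."""
    n = len(data)
    precision = 1.0 / prior_var + n / noise_var   # precisions add up
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var

# Hypothetical observed differences in blood-pressure reduction (made-up values).
observed = [4.1, 5.3, 6.0, 4.8, 5.5]

# Prior: the difference follows N(0, 1), as in the opponent's claim.
post_mean, post_var = normal_posterior(0.0, 1.0, observed, noise_var=4.0)
print(f"posterior: N({post_mean:.3f}, {post_var:.3f})")
```

The posterior mean lands between the prior mean 0 and the sample mean, pulled toward the data as more observations arrive.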

Prior, likelihood, posterior

Figure: Prior, likelihood, posterior

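Why?

1. We need the posterior distribution itself before we can report its mean or a 95% C.I.
2. If the prior and the likelihood are suitably nice functions, the posterior may turn out to be a well-known distribution.
3. In most cases the posterior is not a known distribution, and the integral cannot be evaluated analytically.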
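
In symbols, the integrals in question look as follows. The notation is assumed here and does not come from the slides: θ is the unknown quantity and y is the data.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}
         {\int p(y \mid \theta')\, p(\theta')\, d\theta'},
\qquad
\mathbb{E}[\theta \mid y] = \int \theta\, p(\theta \mid y)\, d\theta
```

Both the normalizing constant in the denominator and summaries such as the posterior mean are integrals, and they have closed forms only for special prior and likelihood pairs.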

Monte Carlo integration

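1. Solve the integral by simulation.
2. Example: to integrate against N(0,1), draw N samples from N(0,1) and take their average. As N grows, the average approaches the true value of the integral.

That is, to compute an integral weighted by a density f(x), draw many samples from f(x) and average them.
When sampling from f(x) is hard, draw from a similar-looking g(x) that is easy to sample and reweight each draw by f(x)/g(x): Importance sampling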
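
A minimal sketch of both ideas, assuming the target f is N(0, 1), the quantity of interest is E[X^2] = 1, and the proposal g is N(0, 2^2). These choices and the function names are illustrative only.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def square(x):
    return x * x  # E[X^2] = 1 when X ~ N(0, 1)

def mc_expectation(h, n, rng):
    """Plain Monte Carlo: average h(x) over draws x ~ f = N(0, 1)."""
    return sum(h(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

def importance_expectation(h, n, rng, g_sigma=2.0):
    """Importance sampling: draw from g = N(0, g_sigma^2), reweight by f/g."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, g_sigma)
        weight = normal_pdf(x) / normal_pdf(x, 0.0, g_sigma)
        total += h(x) * weight
    return total / n

rng = random.Random(42)
for n in (100, 10_000, 100_000):
    print(n, round(mc_expectation(square, n, rng), 4))
print("importance sampling:", round(importance_expectation(square, 100_000, rng), 4))
```

The estimates should drift toward 1 as n grows; the importance-sampling version never samples from f directly.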

Monte Carlo example

Figure: Computing the area of a circle
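
The figure's idea can be sketched as follows, assuming the usual construction: throw uniform random points into the square that encloses the circle and count the fraction that land inside. The function name and sample sizes are illustrative.

```python
import random

def estimate_circle_area(n, rng, radius=1.0):
    """Estimate a circle's area from random points in its bounding square."""
    inside = 0
    for _ in range(n):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            inside += 1
    square_area = (2 * radius) ** 2
    # Fraction of points inside the circle times the area of the square.
    return square_area * inside / n

rng = random.Random(7)
for n in (100, 10_000, 200_000):
    print(n, estimate_circle_area(n, rng))
```

For radius 1 the estimates should approach pi as n grows.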

Monte Carlo example (2)

Figure: Integration of f(x)

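MCMC (Markov chain Monte Carlo)

1. Monte Carlo: purely random sampling. Inefficient.
2. Sampling is hard in multivariate analysis, especially for multilevel models.

MCMC
1. Markov chain MC: each new sample is drawn using the sample just before it. Efficient, and well suited to hierarchical models.
2. Metropolis-Hastings algorithm, Gibbs sampler (nearly the standard)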
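
As a rough illustration of "using the sample just before it", here is a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric proposal. The target (proportional to N(3, 1)), the step size, and the burn-in length are assumptions chosen for the example.

```python
import math
import random

def log_target(x):
    """Log of an unnormalized target density, here proportional to N(3, 1)."""
    return -0.5 * (x - 3.0) ** 2

def metropolis(log_density, n_samples, rng, start=0.0, step=1.0):
    """Random-walk Metropolis: every proposal is built from the previous sample."""
    samples = []
    x = start
    log_p = log_density(x)
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if log_p_new >= log_p or rng.random() < math.exp(log_p_new - log_p):
            x, log_p = proposal, log_p_new
        samples.append(x)  # on rejection the previous value is repeated
    return samples

rng = random.Random(0)
chain = metropolis(log_target, 20_000, rng)
kept = chain[2_000:]  # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

Only ratios of the target density are used, so the normalizing integral never has to be computed.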

Gibbs sampler

Figure: Example of Gibbs sampling with 2 variables
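
A two-variable Gibbs sampler can be sketched for a case where the conditionals are known exactly. The target assumed here is a standard bivariate normal with correlation 0.8, for which x given y follows N(rho*y, 1 - rho^2); this is a textbook choice, not necessarily the one in the figure.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, rng, start=(0.0, 0.0)):
    """Gibbs sampling for a standard bivariate normal with correlation rho."""
    x, y = start
    cond_sd = math.sqrt(1.0 - rho * rho)
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, cond_sd)  # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)  # draw y given the fresh x
        samples.append((x, y))
    return samples

rng = random.Random(3)
draws = gibbs_bivariate_normal(0.8, 20_000, rng)[1_000:]  # drop burn-in
n = len(draws)
mx = sum(x for x, _ in draws) / n
my = sum(y for _, y in draws) / n
cov = sum((x - mx) * (y - my) for x, y in draws) / n
vx = sum((x - mx) ** 2 for x, _ in draws) / n
vy = sum((y - my) ** 2 for _, y in draws) / n
corr = cov / math.sqrt(vx * vy)
print(f"sample correlation = {corr:.3f}")
```

Each step changes one coordinate while holding the other fixed, which produces the staircase path typical of Gibbs sampling plots.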

Multivariable Gibbs sampler

Figure: Gibbs sampling with more than 2 variables
