KAMERA Data Prediction

Pei-Yuan Sung
Yi-Ting Wang
KAMERA Emergency Department Data Challenge
Results Presentation

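Graduate of the Institute of Statistics, National Tsing Hua University
Quality management engineer
Loves programming and data analysis
Graduate of the Institute of Statistics, National Tsing Hua University
Big data engineer
Loves machine learning and statistical analysis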

Outline
 Introduction & Data Source
 Data Inspector
 Variable Finding
 Method

Introduction & Data Source
 Problem
 20130101 ~ 20131231期間, KAMERA內部分醫院
的急診部門營運資料
 Deliverables
 利用gee method 及分群去預測20140811-
20140817的各級檢傷人數總和

Take a look at Data
37 predictor variables
3 variables
Training data
Prediction data

 Select the variables that explain Total well
 Use data visualization to roughly pick out variables with large between-group variation and small within-group variation
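The screening criterion above can also be checked numerically. The following is a minimal sketch, assuming a candidate variable is categorical and that sums of squares are an acceptable measure of variation; the function name and the toy data are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def between_within_ss(values, labels):
    """Return (between-group SS, within-group SS) of values split by labels."""
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    grand_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        # Spread of the group mean around the grand mean, weighted by group size.
        between += len(members) * (mean - grand_mean) ** 2
        # Spread of the members around their own group mean.
        within += sum((v - mean) ** 2 for v in members)
    return between, within


# Hypothetical toy data: totals recorded under two levels of a candidate variable.
toy_totals = [5, 7, 6, 50, 54, 52]
toy_levels = ["night", "night", "night", "day", "day", "day"]
between, within = between_within_ss(toy_totals, toy_levels)
```

A variable for which `between` is large relative to `within` separates Total well and is a candidate predictor.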

variable
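 Add useful new variables
 Use K-means to partition total into six clusters and use the cluster label as a new variable
 This variable also shows large between-group variation and small within-group variation with respect to total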

variable
 Model selection
 Use the generalized estimating equation (GEE)
 Apply weights to Total, then fit the GEE
 Weights are applied to total separately for each K-means cluster

Data Inspector
 The boxplot of the total patient count across all triage levels shows many observations in the high-count range
Summary of total:
Min   1st Qu.   Median   Mean   3rd Qu.   Max
0     8         18       37     54        226

Hospital VS Time
All time intervals for Hospital1

Variable Finding
 Use K-means clustering to partition total into six clusters, so that between-cluster differences are large and within-cluster differences are small
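A minimal sketch of one-dimensional K-means (Lloyd's algorithm) with six clusters follows. The slides do not show the implementation that was used, so the function name, the quantile-based initialisation, and the toy data are all assumptions made for illustration.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    ordered = sorted(values)
    # Assumption: start the centers at evenly spaced quantiles (deterministic).
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels


# Hypothetical toy totals forming six well-separated pairs.
toy_totals = [1, 2, 11, 12, 21, 22, 31, 32, 41, 42, 51, 52]
centers, labels = kmeans_1d(toy_totals, k=6)
```

The resulting `labels` play the role of the new group variable: one cluster index per observation of total.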

Method : GEE
 GEE: generalized estimating equation
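 Intended mainly for repeated-measures data (observations within a group are correlated)
 Variables used (see the table below)
 Grouping of the data:
 Records with the same month, the same hospital ID, and the same recording time interval are treated as one group
Variable      Meaning
total         Total number of patients across all triage levels
date          Date of the record (month)
tz            Time interval of the record, one interval every four hours
Hospital_PK   Hospital ID
group         Cluster label obtained by clustering total
The tz:Hospital interaction effect is significant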
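The grouping rule above (same month, same hospital ID, same time interval) can be sketched as follows. This assumes each record is a dict keyed by the variable names in the table and that dates are YYYYMMDD strings, as in the problem statement; the helper name and the sample records are hypothetical.

```python
from collections import defaultdict


def build_groups(records):
    """Map (month, Hospital_PK, tz) to the indices of the records in that group."""
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        month = rec["date"][4:6]  # assumes YYYYMMDD date strings
        groups[(month, rec["Hospital_PK"], rec["tz"])].append(idx)
    return dict(groups)


# Hypothetical sample records, not taken from the KAMERA data.
records = [
    {"date": "20130101", "Hospital_PK": 1, "tz": 0, "total": 12},
    {"date": "20130102", "Hospital_PK": 1, "tz": 0, "total": 15},
    {"date": "20130102", "Hospital_PK": 1, "tz": 1, "total": 30},
    {"date": "20130201", "Hospital_PK": 1, "tz": 0, "total": 9},
]
groups = build_groups(records)
```

Within one group, the records ordered by date form the repeated measurements whose correlation the GEE accounts for.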

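Use and Selection of the Covariance Matrix
 Covariance matrix choice: AR(1)
 Initially the data were grouped by hospital ID alone, but the covariance matrix was too large
 Grouping by hospital ID and month still gave a covariance matrix that was too large
 In the end, records with the same hospital ID, the same month, and the same time interval were treated as one group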
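AR(1) covariance of the observations y_{i1}, ..., y_{i4} within group i:
\[
\operatorname{Cov}(y_{i1}, y_{i2}, y_{i3}, y_{i4}) =
\begin{pmatrix}
\sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 \\
\sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 \\
\sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho \\
\sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2
\end{pmatrix}
\]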
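As an illustration of the AR(1) structure, the sketch below builds the matrix for a group of n observations, with entry (j, k) equal to sigma^2 * rho^|j-k|. The function name and the example values of sigma^2 and rho are assumptions chosen for illustration only.

```python
def ar1_covariance(n, sigma2, rho):
    """Return the n x n AR(1) matrix with entries sigma2 * rho**|j - k|."""
    return [[sigma2 * rho ** abs(j - k) for k in range(n)] for j in range(n)]


# Illustrative values only: four observations, sigma^2 = 1, rho = 0.5.
cov = ar1_covariance(4, sigma2=1.0, rho=0.5)
first_row = cov[0]
```

A group with n observations needs an n x n matrix, so the matrix grows quadratically with group size. This is consistent with the move to smaller groups described above.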

Final Model Overview
 Variables used: Total ~ 1 + Month + Time + Hospital + Group + Time:Hospital
 Covariance matrix: AR(1)
 Family: Poisson with log link
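Written out, the specification above corresponds to a log-linear mean model with an AR(1) working correlation. The following is a sketch under the assumption that every predictor enters as a categorical factor; the coefficient symbols and the index functions m, t, h, g (month, time interval, hospital, and K-means group of observation j in group i) are notation introduced here, not taken from the slides.

```latex
\log \mathrm{E}[\mathrm{Total}_{ij}]
  = \beta_0
  + \beta^{\mathrm{Month}}_{m(i,j)}
  + \beta^{\mathrm{Time}}_{t(i,j)}
  + \beta^{\mathrm{Hospital}}_{h(i,j)}
  + \beta^{\mathrm{Group}}_{g(i,j)}
  + \beta^{\mathrm{Time:Hospital}}_{t(i,j),\,h(i,j)},
\qquad
\mathrm{Corr}(y_{ij}, y_{ik}) = \rho^{|j-k|}
```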

Weight selection
 Index
 Weights are assigned according to the K-means clusters
 Total < 27 0.9
 27 < Total < 46 0.95
 90 < Total < 140  1.1
 Total > 140  1.15
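The weight table above can be expressed as a lookup function. This is a minimal sketch: the slide lists no weight for totals between 46 and 90 and does not say how the boundary values are handled, so the neutral weight of 1.0 used for those cases is an assumption, as is the function name.

```python
def weight_for(total):
    """Return the weight for a total, following the ranges listed on the slide."""
    if total < 27:
        return 0.9
    if 27 < total < 46:
        return 0.95
    if 90 < total < 140:
        return 1.1
    if total > 140:
        return 1.15
    # Assumption: totals not covered by the listed ranges (46 to 90, and the
    # boundary values 27 and 140) receive a neutral weight of 1.0.
    return 1.0


# Hypothetical totals, one per range.
toy_totals = [10, 30, 60, 100, 200]
weights = [weight_for(t) for t in toy_totals]
```

The slides state that the weights are applied to Total before the GEE is fitted but do not spell out how they are combined, so the sketch stops at the lookup.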

Weight selection
 Cross-validation (90% of the data for training; 10% for testing)
Training data (90%) Testing data (10%)
Repeat : 100 times and then take mean of the ind
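The repeated 90/10 split can be sketched as below. The evaluation index used in the slides is not recoverable from the text, so the scoring function here (mean absolute error of a training-mean predictor) is a hypothetical stand-in, as are the function names, the seed, and the toy data.

```python
import random


def repeated_holdout(records, score_fn, repeats=100, test_fraction=0.1, seed=0):
    """Average score_fn(train, test) over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, round(len(records) * test_fraction))
    scores = []
    for _ in range(repeats):
        shuffled = records[:]
        rng.shuffle(shuffled)
        # First 10% becomes the testing data, the rest the training data.
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(score_fn(train, test))
    return sum(scores) / len(scores)


def mae_of_mean_predictor(train, test):
    """Hypothetical stand-in score: predict the training mean, report the MAE."""
    prediction = sum(train) / len(train)
    return sum(abs(y - prediction) for y in test) / len(test)


# Hypothetical toy totals.
toy_totals = list(range(100))
mean_score = repeated_holdout(toy_totals, mae_of_mean_predictor)
```

In the setting of the slides, `score_fn` would fit the weighted GEE on the training part and evaluate it on the testing part, so that candidate weight settings can be compared by their averaged score.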
