National Taipei University of Nursing and Health Sciences (NTUNHS)
Data Envelopment Analysis (DEA)
Orozco Hsu
2021-06-10
1
About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
2
"How can you not get romantic about baseball?"
Tutorial Content
• Operations Research
• Linear Programming (Maximizing Profit)
• DEA introduction
• Homework
3
Code
• Download code
• https://github.com/orozcohsu/ntunhs_2020/tree/master/alg_20210610
4
Quiz 1
• 1. Operations research is the application of ________ methods to
arrive at optimal solutions to problems.
• A. economical
• B. scientific
• C. artistic
• D. a and b both
5
Quiz 2
• 2. Operations research is based upon collected information,
knowledge and advanced study of various factors impacting a
particular operation. This leads to more informed ________.
• A. Management processes
• B. Decision making
• C. Procedures
• D. Machine learning
6
Quiz 3
• 3. What is the objective function in linear programming problems?
• A. A constraint for available resource
• B. An objective for research and development of a company
• C. A linear function in an optimization problem
• D. A set of non-negativity conditions
7
Quiz 4
• 4. Which of the following statements about Data Envelopment Analysis is TRUE?
• A. We can also use a regression line to analyze inefficient DMUs
• B. In the input-oriented CCR model, we want to maximize the outputs
• C. Compared with VRS, CRS has more benchmarks (efficiency score = 1)
• D. Data envelopment analysis may generate inaccurate efficiency scores,
especially when the sample size is small relative to the number of inputs and
outputs
8
Operations Research
9
Operations Research
• The field of operations research provides QUANTITATIVE methods for
solving problems such as MAXIMIZING profits or MINIMIZING losses,
investigating outcomes under fluctuating market conditions, and
facilitating decision making.
• A relatively new field that started in the late 1930s and has grown
tremendously in the last 30 years.
10
Operations Research
11
(Simulation)
Operations Research
• Operations Research in one word
• OPTIMIZATION
• Operations Research in one sentence
• DO THE BEST POSSIBLE UNDER CONSTRAINTS.
12
https://en.wikipedia.org/wiki/Convex_optimization
Convex optimization is a subfield of mathematical optimization
that studies the problem of minimizing convex functions over
convex sets. Many classes of convex optimization problems
admit polynomial-time algorithms, whereas mathematical
optimization is in general NP-hard
Operations Research
• Three main problem classes in OR
• Mathematical programming
• Linear programming with an objective function and constraints.
• Numerical optimization
• Convex optimization: gradient-based
• Non-gradient: Bayesian optimization, genetic algorithms, etc.
• Simulation
• Uses repeated random sampling to obtain numerical results that
approximate a probability distribution, as in MCMC
(Markov Chain Monte Carlo)
13
https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo
(Figure: Newton's method, tangent line)
Operations Research
• Linear Programming
• Translate the problem into a model.
• A simple technique in which we describe complex relationships with
linear functions and then find the optimum points.
14
Operations Research
• Nonlinear Programming
• The process of solving an optimization problem where
some of the constraints or the objective function are
nonlinear.
• Objective function
• CONCAVE (maximization problem)
• CONVEX (minimization problem)
15
Maximize: f(x) = x1 + x2
Subject to:
x1 ≥ 0
x2 ≥ 0
x1^2 + x2^2 ≥ 1
x1^2 + x2^2 ≤ 2
Operations Research
• Gradient descent optimization
• Gradient descent is a FIRST-ORDER iterative optimization algorithm
for finding a LOCAL minimum of a differentiable function.
• The idea is to take repeated steps in the OPPOSITE DIRECTION of the
gradient of the function at the current point, because this is the
direction of steepest descent.
16
For big data, we usually use mini-batch gradient descent (mini-batch SGD).
https://towardsdatascience.com/batch-mini-batch-stochastic-gradient-descent-7a62ecba642a
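The update rule described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical one-dimensional objective f(x) = (x - 3)^2; the function name `gradient_descent`, the learning rate, and the step count are illustrative choices and are not taken from the course code.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move in the opposite direction of the gradient at the current point.
        x = x - learning_rate * grad(x)
    return x

# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))
```

Because this objective is convex, the local minimum found here is also the global one; on a non-convex function the result would depend on the starting point.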
Operations Research
17
• Linear Programming
• Objective function is linear
• Non-Linear Programming
• Objective function is non-linear (convex)
• Convex
• Global optimization
• Non-Convex
• Local optimization
• Time complexity
• Polynomial time
• NP complete
Operations Research
• Monte Carlo: Simulation that draws quantities of interest from a
distribution.
• Markov Chain: A stochastic process in which future states are
independent of past states given the present state.
• MCMC: A class of methods that simulate draws which are slightly
dependent and come approximately from the posterior distribution.
18
https://www.youtube.com/watch?v=R9NQY2Hyl14
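One common MCMC method is random-walk Metropolis sampling. The sketch below is an illustration only, assuming a hypothetical target (a standard normal given by its unnormalized log density); the function name `metropolis`, the step size, and the seed are illustrative choices. Each draw depends only on the previous one (the Markov chain part), and proposals are random (the Monte Carlo part).

```python
import math
import random

def metropolis(log_density, start, steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, density ratio); otherwise stay put.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples

# Hypothetical target: standard normal, log density -x^2 / 2 up to a constant.
samples = metropolis(lambda x: -0.5 * x * x, start=0.0, steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(variance, 2))
```

The sample mean and variance should land near 0 and 1, which is what "approximately from the target distribution" means in practice.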
Linear Programming (Maximizing Profit)
19
Linear Programming (Maximizing Profit)
• Linear Programming (LP) and the Simplex algorithm have been around
for decades.
• It was first introduced in the U.S. Air Force to help with strategic
planning back in the 1940s.
• Since then, many industries have used it to maximize profit and
minimize cost, among other things.
• Often used in the process of data-driven decision making.
20
https://arxiv.org/ftp/cs/papers/0611/0611008.pdf
*Why LP cannot solve large instances of NP-complete problems in polynomial time?
(convex optimization)
Linear Programming (Maximizing Profit)
• Objective function
• Maximize or Minimize
• Constraints
• Linear inequalities
Maximize: 3x + 2y
Subject to: y ≤ 20 – 6x
y ≤ 12 – 2x
y ≤ 10 – x
21
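For a two-variable problem like the one above, the optimum lies at a corner of the feasible region, so it can be found by intersecting pairs of constraint lines and keeping the best feasible point. The sketch below assumes the inequalities are non-strict and adds x ≥ 0 and y ≥ 0, which the slide does not state; the helper names are hypothetical and this is not the code from graphic_solution.ipynb.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (6, 1, 20),   # y <= 20 - 6x
    (2, 1, 12),   # y <= 12 - 2x
    (1, 1, 10),   # y <= 10 - x
    (-1, 0, 0),   # x >= 0 (assumed, not on the slide)
    (0, -1, 0),   # y >= 0 (assumed, not on the slide)
]

def intersection(c1, c2):
    """Point where two constraint boundaries cross, or None if parallel."""
    (a1, b1, r1), (a2, b2, r2) = c1, c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def feasible(point):
    """True if the point satisfies every constraint (with a small tolerance)."""
    x, y = point
    return all(a * x + b * y <= c + 1e-9 for a, b, c in constraints)

vertices = []
for pair in combinations(constraints, 2):
    point = intersection(*pair)
    if point is not None and feasible(point):
        vertices.append(point)

best = max(vertices, key=lambda p: 3 * p[0] + 2 * p[1])
print(best, 3 * best[0] + 2 * best[1])
```

Here all three constraint lines meet at the same corner, x = 2 and y = 8, where the objective equals 22. Enumerating corners only scales to tiny problems; larger models need an LP solver.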
Linear Programming (Maximizing Profit)
• 2D Process:
• (K: Capital, L: Labor)
• Constraints:
• GH
• Best solution:
• At point E (OJN)
• Feasible region reaches the isoquant
for 200Q (the highest possible)
• Take action:
• 200 units of output with process 2
by using 8L and 8K
K/L =2
K/L =1
K/L =0.5
200Q = D(6L, 12K), E(8L, 8K), F(12L, 6K)
22
Linear Programming (Maximizing Profit)
• Mix of constraints
• Find the maximum point
2D line intersections
graphic_solution.ipynb
23
Linear Programming (Maximizing Profit)
• This demonstrates why we don't try to
GRAPH the FEASIBLE REGION when
there are more than two decision
variables.
• Three dimensional graphs aren't that
easy to draw and you can forget about
making the sketch when there are four
or more decision variables.
24
Linear Programming (Maximizing Profit)
• Problem we need to solve
• The company produces four furniture items: chairs, tables, desks,
and bookcases.
• By improving the operations of the firm and its resource allocation, we can
potentially maximize the profit
5 boards, 10 man-hours, 3 ounces, 4 square feet of leather
25
Linear Programming (Maximizing Profit)
• The bottleneck is that all these materials have the following total
quantities available every week
• Our task is to decide how best to allocate these resources
in order to make the most profit.
Material item | Quantity
Board | 20,000
Man-hours | 4,000
Ounces of glue | 2,000
Square feet of leather | 3,000
Square feet of glass | 500
Product | Profit ($)
Chair | 45
Table | 80
Desk | 110
Bookcase | 55
26
Linear Programming (Maximizing Profit)
• This is the Maximization LP (Linear Programming) problem
• c for chair, t for table, d for desk, and b for bookcase
(Figure: LP formulation showing the constraints and the resources available)
27
Linear Programming (Maximizing Profit)
• Optimal profit
• $20,857
• Chair
• 57
• Table
• 228
28
Activity-Analysis-Problem.ipynb
DEA introduction
29
DEA introduction
• Question:
• How do we evaluate the efficiency of those 3
hospitals?
• Answer:
• You may refer to many indicators, such as
Occupancy Rate, Average Length of Stay, Days of
Stay, etc.
• Each hospital has different indicators, so how
do we compare them?
• A scatter plot can help, but it only works
with at most 3 indicators.
30
Decision Making Unit (DMU): the basic unit of analysis.
For example, 10 hospitals correspond to 10 DMUs.
DEA introduction
• Data Envelopment Analysis (DEA)
• Considers multiple efficiency indicators together, analyzing hospital efficiency
(input) and hospital service quality (output)
• Uses the efficiency score as the key indicator for evaluation
31
DMU | X1 (Labor) | X2 (Medical Devices) | Y1 (Operation) | Y2 (Occupancy Rate)
Hospital A | 50 | 60 | 40 | 30
Hospital B | 75 | 95 | 55 | 65
Hospital C | 100 | 120 | 150 | 130
X1 and X2 are input variables; Y1 and Y2 are output variables.
There is no limit on the number of input/output variables, but the more variables there are, the more DMUs are needed for the analysis result to be valid.
Number of DMUs ≥ max{a*b, 3*(a+b)}, where a is the number of input variables and b is the number of output variables.
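The sample-size rule above is easy to check before running an analysis. A minimal sketch, with the hypothetical helper name `min_dmus`:

```python
def min_dmus(num_inputs, num_outputs):
    """Smallest number of DMUs suggested by max{a*b, 3*(a+b)}."""
    return max(num_inputs * num_outputs, 3 * (num_inputs + num_outputs))

# Two inputs (labor, medical devices) and two outputs (operations, occupancy).
required = min_dmus(2, 2)
print(required)
```

With two inputs and two outputs the rule asks for at least 12 DMUs, so the three-hospital table is only a teaching example and its scores should not be read as a valid analysis.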
DEA introduction
• Solve for the weights (u1, u2, v1, v2)
• Hospital A: (u1*40+u2*30)/(v1*50+v2*60) ≤ 1
• Hospital B: (u1*55+u2*65)/(v1*75+v2*95) ≤ 1
• Hospital C: (u1*150+u2*130)/(v1*100+v2*120) ≤ 1
• u1, u2, v1, v2 ≥ 0
• Calculating Hospital A's efficiency score
• Substitute u1, u2, v1, v2 into (u1*40+u2*30)/(v1*50+v2*60)
• At least one hospital will have an efficiency score of 1
32
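To make the ratio constraints concrete, the sketch below evaluates weighted output over weighted input for each hospital under one set of trial weights. The weights are hypothetical values chosen only so that every ratio stays at or below 1; they are not the optimal weights. In the ratio form, Hospital A's score is the largest ratio it can reach over all feasible weights, so any feasible trial gives a lower bound.

```python
# Input and output values from the hospital table.
hospitals = {
    "A": {"inputs": (50, 60), "outputs": (40, 30)},
    "B": {"inputs": (75, 95), "outputs": (55, 65)},
    "C": {"inputs": (100, 120), "outputs": (150, 130)},
}

def ratio(dmu, u, v):
    """Weighted outputs divided by weighted inputs for one DMU."""
    weighted_out = sum(w * y for w, y in zip(u, dmu["outputs"]))
    weighted_in = sum(w * x for w, x in zip(v, dmu["inputs"]))
    return weighted_out / weighted_in

# Hypothetical trial weights (u1, u2) and (v1, v2); not the optimal ones.
u = (0.01, 0.0)
v = (0.02, 0.0)
scores = {name: ratio(dmu, u, v) for name, dmu in hospitals.items()}
is_feasible = all(score <= 1.0 for score in scores.values())
print(scores, is_feasible)
```

A DEA solver repeats this search once per hospital, each time picking the weights most favorable to the hospital being scored.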
DEA introduction
• DEA has two main returns-to-scale assumptions
• Constant returns to scale (CRS): Adding one unit of input
brings one unit of output, so the relationship between input
and output variables is a fixed return.
• Variable returns to scale (VRS): One unit of input produces
more than one unit of output, or less than one unit of output.
33
DMU | Efficiency score
Hospital A | 0.93
Hospital B | 1.00
Hospital C | 1.00
Hospital D | 0.92
(Figure: DMUs A, B, C and D (inefficient DMU) plotted against the CRS frontier)
DEA introduction
• DEA
• It only gives you relative efficiencies -
efficiencies relative to the data
considered. It does not, and cannot,
give you absolute efficiencies.
• DEA compared with supervised learning
• No distributional assumptions (e.g., data drawn from a
normal distribution)
• No handling of outlier data
• No feature selection process
34
DEA introduction
• CCR model:
• The model was first introduced in 1978 (Charnes, Cooper and Rhodes) and is formulated as a linear program
• It is based on the assumption that constant returns to scale hold at the
efficient frontier.
35
DEA introduction
• BCC model:
• The model is based on the VRS technology
assumption and measures pure technical
efficiency.
36
https://www.researchgate.net/publication/333403650_Use_of_data_envelopment_analysis_to_benchmark_environmental_product_declarations-a_suggested_framework
DEA introduction
• We will start DEA from this table (using the CCR assumption).
37
DMU | Price or Cost (non-beneficial, input) | Storage Space (beneficial, output) | Camera Quality (beneficial, output) | Looks (beneficial, output)
Mobile1 (DMU1) | 250/527.97 = 0.4735 | 16/50.6 = 0.3162 | 12/22.98 = 0.5222 | 4/8.79 = 0.4551
Mobile2 (DMU2) | 225/527.97 = 0.4262 | 16/50.6 = 0.3162 | 8/22.98 = 0.3482 | 5/8.79 = 0.5689
Mobile3 (DMU3) | 300/527.97 = 0.5682 | 32/50.6 = 0.6325 | 16/22.98 = 0.6963 | 4.5/8.79 = 0.5120
Mobile4 (DMU4) | 275/527.97 = 0.5209 | 32/50.6 = 0.6325 | 8/22.98 = 0.3482 | 4/8.79 = 0.4551
Normalizer | 527.97 | 50.6 | 22.98 | 8.79
Each normalizer is the square root of the sum of squares of its column over the n DMUs, sqrt(Σ_j x_ij^2).
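The normalizers in the table are consistent with dividing each column by the square root of its sum of squares (for example, sqrt(250^2 + 225^2 + 300^2 + 275^2) ≈ 527.97). A short sketch that reproduces the normalized values from the raw numbers read off the table; the variable names are illustrative:

```python
import math

# Raw values from the table: price (input), storage, camera, looks (outputs).
raw = {
    "Mobile1": [250, 16, 12, 4],
    "Mobile2": [225, 16, 8, 5],
    "Mobile3": [300, 32, 16, 4.5],
    "Mobile4": [275, 32, 8, 4],
}

# Column-wise Euclidean norm: square root of the sum of squares over all DMUs.
columns = list(zip(*raw.values()))
norms = [math.sqrt(sum(value * value for value in column)) for column in columns]

normalized = {
    name: [round(value / norm, 4) for value, norm in zip(row, norms)]
    for name, row in raw.items()
}
print([round(norm, 2) for norm in norms])
print(normalized["Mobile1"])
```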
DEA introduction
• Formulating the problem
38
n: number of DMUs (4)
m: number of input criteria (1)
s: number of output criteria (3)
x_ik and y_rk denote the values of the i-th input criterion and the r-th output criterion for
the k-th DMU
u_r and v_i are the non-negative variable weights to be determined by
solving the minimization problem
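Written out with the symbols defined above, the model for the k-th DMU that matches the numeric problems g1 to g4 on the following slides appears to be the multiplier form below. Treat it as a reconstruction read off those instances rather than the slide's original typesetting.

```latex
\begin{aligned}
g_k = \min_{u,\,v}\quad & \sum_{i=1}^{m} v_i x_{ik} \\
\text{subject to}\quad & \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r y_{rj} \ge 0, \qquad j = 1, \dots, n, \\
& \sum_{r=1}^{s} u_r y_{rk} = 1, \\
& u_r \ge 0, \quad v_i \ge 0 .
\end{aligned}
```

With m = 1 and s = 3 the objective reduces to a single term, which is why g1 is written as min(0.4735 V1).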
DEA introduction
39
• Formulating the problem
g1 = min(0.4735V1)
Subject to
-0.3162u1 – 0.5222u2 – 0.4551u3 + 0.4735V1 >= 0
-0.3162u1 – 0.3482u2 – 0.5689u3 + 0.4262V1 >= 0
-0.6325u1 – 0.6963u2 – 0.5120u3 + 0.5682V1 >= 0
-0.6325u1 – 0.3482u2 – 0.4551u3 + 0.5209V1 >= 0
0.3162u1 + 0.5222u2 + 0.4551u3 = 1
u1, u2, u3, V1 >= 0
We need to calculate g1, g2, g3, g4
DEA introduction
40
• Formulating the problem
g2 = min(0.4262V1)
Subject to
-0.3162u1 – 0.5222u2 – 0.4551u3 + 0.4735V1 >= 0
-0.3162u1 – 0.3482u2 – 0.5689u3 + 0.4262V1 >= 0
-0.6325u1 – 0.6963u2 – 0.5120u3 + 0.5682V1 >= 0
-0.6325u1 – 0.3482u2 – 0.4551u3 + 0.5209V1 >= 0
0.3162u1 + 0.3482u2 + 0.5689u3 = 1
u1, u2, u3, V1 >= 0
We need to calculate g1, g2, g3, g4
DEA introduction
41
• Formulating the problem
g3 = min(0.5682V1)
Subject to
-0.3162u1 – 0.5222u2 – 0.4551u3 + 0.4735V1 >= 0
-0.3162u1 – 0.3482u2 – 0.5689u3 + 0.4262V1 >= 0
-0.6325u1 – 0.6963u2 – 0.5120u3 + 0.5682V1 >= 0
-0.6325u1 – 0.3482u2 – 0.4551u3 + 0.5209V1 >= 0
0.6325u1 + 0.6963u2 + 0.512u3 = 1
u1, u2, u3, V1 >= 0
We need to calculate g1, g2, g3, g4
DEA introduction
42
• Formulating the problem
g4 = min(0.5209V1)
Subject to
-0.3162u1 – 0.5222u2 – 0.4551u3 + 0.4735V1 >= 0
-0.3162u1 – 0.3482u2 – 0.5689u3 + 0.4262V1 >= 0
-0.6325u1 – 0.6963u2 – 0.5120u3 + 0.5682V1 >= 0
-0.6325u1 – 0.3482u2 – 0.4551u3 + 0.5209V1 >= 0
0.6325u1 + 0.3482u2 + 0.4551u3 = 1
u1, u2, u3, V1 >= 0
We need to calculate g1, g2, g3, g4
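The four problems above are small enough to solve by brute force. The sketch below is not the pyDEA implementation: it enumerates candidate corner points of each problem (the equality plus any three tight inequalities), keeps the feasible ones, and takes the smallest objective. The helper names `solve_linear` and `ccr_score` are hypothetical, and reading the efficiency score as 1/g_k is an assumption based on the minimization form used on the slides.

```python
from itertools import combinations

# Normalized data from the table: one input (price) and three outputs per DMU.
X = [0.4735, 0.4262, 0.5682, 0.5209]
Y = [
    [0.3162, 0.5222, 0.4551],
    [0.3162, 0.3482, 0.5689],
    [0.6325, 0.6963, 0.5120],
    [0.6325, 0.3482, 0.4551],
]

def solve_linear(a, b):
    """Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [p - factor * q for p, q in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ccr_score(k):
    """Assumed efficiency of DMU k: 1 / g_k, where g_k = min x_k * v1."""
    # Variables z = (u1, u2, u3, v1); every inequality row means row . z >= 0.
    ineqs = [[-y for y in Y[j]] + [X[j]] for j in range(len(X))]
    ineqs += [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
    equality = Y[k] + [0.0]  # u1*y1k + u2*y2k + u3*y3k = 1
    best = None
    for active in combinations(ineqs, 3):
        z = solve_linear([equality] + list(active), [1.0, 0.0, 0.0, 0.0])
        if z is None:
            continue
        if all(sum(c * v for c, v in zip(row, z)) >= -1e-9 for row in ineqs):
            g = X[k] * z[3]
            if best is None or g < best:
                best = g
    return 1.0 / best

scores = [ccr_score(k) for k in range(4)]
print([round(score, 4) for score in scores])
```

Under that assumption the scores should come out close to the pyDEA results reported for this data set (about 0.968 for DMU1 and 1 for the others). Corner enumeration grows combinatorially, so real analyses should use an LP solver or a DEA package.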
DEA introduction
• Practice DEA
• Go to the pyDEA folder
• Run: python -m pyDEA.main_gui
43
DEA introduction
44
Using dea_example.xlsx
DEA introduction
DMU Efficiency Score
Mobile1 DMU1 0.9682
Mobile2 DMU2 1
Mobile3 DMU3 1
Mobile4 DMU4 1
45
DEA introduction
• How can we improve the performance of an
inefficient DMU? (CCR model)
• Input oriented: Minimizes the level of inputs
while holding the level of outputs fixed.
• Output oriented: Maximizes the level of outputs
while holding the level of inputs fixed.
46
Projection to frontier
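As a rough illustration of projecting an inefficient DMU onto the frontier, the sketch below scales Mobile1 (efficiency 0.9682) in each orientation. It assumes a purely radial projection under the CCR model and ignores any slack adjustments; the function names are hypothetical.

```python
def input_oriented_target(inputs, efficiency):
    """Shrink every input by the efficiency score, keeping outputs fixed."""
    return [efficiency * x for x in inputs]

def output_oriented_target(outputs, efficiency):
    """Expand every output by 1 / efficiency, keeping inputs fixed."""
    return [y / efficiency for y in outputs]

# Mobile1: price 250 (input); storage 16, camera 12, looks 4 (outputs).
price_target = input_oriented_target([250], 0.9682)
output_targets = output_oriented_target([16, 12, 4], 0.9682)
print(price_target, [round(t, 2) for t in output_targets])
```

Read this way, Mobile1 would reach the frontier either by offering the same features at a price of about 242, or by keeping its price and raising every output by roughly 3 percent.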
DEA introduction
• DEA and regression
• DMUs on the regression line have
average efficiency.
• Regression can accommodate multiple
inputs or multiple outputs, but not
both.
• Regression provides only average
relationships, not best practice.
47
DEA introduction
48
DEA introduction
49
Homework
• Analyze a simple DEA problem, here is our data set
• Graphic analysis
• What are the efficiencies of DMUs?
50
Homework
• Hint
51
Reference
• http://d1c25a6gwz7q5e.cloudfront.net/papers/403.pdf
• https://github.com/jzuccollo/pyDEA
52

More Related Content

PDF
overview of_data_processing
 
PDF
4 visualization inter
 
PDF
GLM & GBM in H2O
PDF
A Firefly based improved clustering algorithm
PPTX
Database Performance Analysis with Time Series
PPTX
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
PDF
Introduction to Machine Learning with SciKit-Learn
PDF
A Semantic Web Platform for Automating the Interpretation of Finite Element ...
overview of_data_processing
 
4 visualization inter
 
GLM & GBM in H2O
A Firefly based improved clustering algorithm
Database Performance Analysis with Time Series
T. Yoon, et. al., ICLR 2021, MLILAB, KAIST AI
Introduction to Machine Learning with SciKit-Learn
A Semantic Web Platform for Automating the Interpretation of Finite Element ...

What's hot (14)

PPTX
Clustering: A Scikit Learn Tutorial
PDF
Azure Machine Learning and ML on Premises
PDF
Experimental Design for Distributed Machine Learning with Myles Baker
PDF
How Semantic Technologies can help to cure Hearing Loss?
PDF
Introduction to Data Mining - A Beginner's Guide
PPTX
Linear regression on 1 terabytes of data? Some crazy observations and actions
PDF
Entity embeddings for categorical data
PDF
Region-Based Search in Large Medical Image Repositories
PPTX
A Semantic Web Platform for Improving the Automation and Reproducibility of F...
PDF
The Art of Performance Evaluation
PDF
Artificial intelligence and data stream mining
PPTX
MS SQL SERVER: Olap cubes and data mining
PDF
Building Data Products
PDF
Evolutionary Design of Swarms (SSCI 2014)
Clustering: A Scikit Learn Tutorial
Azure Machine Learning and ML on Premises
Experimental Design for Distributed Machine Learning with Myles Baker
How Semantic Technologies can help to cure Hearing Loss?
Introduction to Data Mining - A Beginner's Guide
Linear regression on 1 terabytes of data? Some crazy observations and actions
Entity embeddings for categorical data
Region-Based Search in Large Medical Image Repositories
A Semantic Web Platform for Improving the Automation and Reproducibility of F...
The Art of Performance Evaluation
Artificial intelligence and data stream mining
MS SQL SERVER: Olap cubes and data mining
Building Data Products
Evolutionary Design of Swarms (SSCI 2014)
Ad

Similar to 6 data envelopment_analysis (20)

PDF
Linear Models for Engineering applications
PPT
157_37315_EA221_2013_1__2_1_Chapter 1, introduction to OR (1).ppt
PPT
or row.ppt .
PPTX
Data Envelopment Analysis
PPTX
Operations research.pptx
PDF
1. intro. to or &amp; lp
PPTX
CH-2 Linear Programing.pptx
PDF
Operation research history and overview application limitation
PDF
1 Operations Research notes for beginners (OR)-I.pdf
PPT
OR_Hamdy_taha.ppt
PPT
OR_Hamdy_taha.ppt
PPTX
Synthesis of analytical methods data driven decision-making
PPTX
Introduction to operations research
PDF
G0211056062
PPTX
Linear Programming
PPTX
Operation Research engineering-WPS Office.pptx
PPTX
Resource management techniques
PPTX
Operations Research ch1 (Introduction).pptx
Linear Models for Engineering applications
157_37315_EA221_2013_1__2_1_Chapter 1, introduction to OR (1).ppt
or row.ppt .
Data Envelopment Analysis
Operations research.pptx
1. intro. to or &amp; lp
CH-2 Linear Programing.pptx
Operation research history and overview application limitation
1 Operations Research notes for beginners (OR)-I.pdf
OR_Hamdy_taha.ppt
OR_Hamdy_taha.ppt
Synthesis of analytical methods data driven decision-making
Introduction to operations research
G0211056062
Linear Programming
Operation Research engineering-WPS Office.pptx
Resource management techniques
Operations Research ch1 (Introduction).pptx
Ad

More from FEG (20)

PDF
Supervised learning in decision tree algorithm
 
PDF
Unsupervised learning in data clustering
 
PDF
CNN_Image Classification for deep learning.pdf
 
PDF
Sequence Model with practicing hands on coding.pdf
 
PDF
Seq2seq Model introduction with practicing hands on coding.pdf
 
PDF
AIGEN introduction with practicing hands on coding.pdf
 
PDF
資料視覺化_Exploation_Data_Analysis_20241015.pdf
 
PDF
Operation_research_Linear_programming_20241015.pdf
 
PDF
Operation_research_Linear_programming_20241112.pdf
 
PDF
非監督是學習_Kmeans_process_visualization20241110.pdf
 
PDF
Sequence Model pytorch at colab with gpu.pdf
 
PDF
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
 
PDF
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
 
PDF
Pytorch cnn netowork introduction 20240318
 
PDF
2023 Decision Tree analysis in business practices
 
PDF
2023 Clustering analysis using Python from scratch
 
PDF
2023 Data visualization using Python from scratch
 
PDF
2023 Supervised Learning for Orange3 from scratch
 
PDF
2023 Supervised_Learning_Association_Rules
 
PDF
202312 Exploration Data Analysis Visualization (English version)
 
Supervised learning in decision tree algorithm
 
Unsupervised learning in data clustering
 
CNN_Image Classification for deep learning.pdf
 
Sequence Model with practicing hands on coding.pdf
 
Seq2seq Model introduction with practicing hands on coding.pdf
 
AIGEN introduction with practicing hands on coding.pdf
 
資料視覺化_Exploation_Data_Analysis_20241015.pdf
 
Operation_research_Linear_programming_20241015.pdf
 
Operation_research_Linear_programming_20241112.pdf
 
非監督是學習_Kmeans_process_visualization20241110.pdf
 
Sequence Model pytorch at colab with gpu.pdf
 
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
 
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
 
Pytorch cnn netowork introduction 20240318
 
2023 Decision Tree analysis in business practices
 
2023 Clustering analysis using Python from scratch
 
2023 Data visualization using Python from scratch
 
2023 Supervised Learning for Orange3 from scratch
 
2023 Supervised_Learning_Association_Rules
 
202312 Exploration Data Analysis Visualization (English version)
 

Recently uploaded (20)

PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Electronic commerce courselecture one. Pdf
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPTX
Spectroscopy.pptx food analysis technology
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Encapsulation theory and applications.pdf
PPTX
MYSQL Presentation for SQL database connectivity
PPT
Teaching material agriculture food technology
PDF
Approach and Philosophy of On baking technology
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Machine learning based COVID-19 study performance prediction
PDF
Review of recent advances in non-invasive hemoglobin estimation
Understanding_Digital_Forensics_Presentation.pptx
Electronic commerce courselecture one. Pdf
Encapsulation_ Review paper, used for researhc scholars
MIND Revenue Release Quarter 2 2025 Press Release
Spectroscopy.pptx food analysis technology
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Diabetes mellitus diagnosis method based random forest with bat algorithm
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Mobile App Security Testing_ A Comprehensive Guide.pdf
Encapsulation theory and applications.pdf
MYSQL Presentation for SQL database connectivity
Teaching material agriculture food technology
Approach and Philosophy of On baking technology
Programs and apps: productivity, graphics, security and other tools
Spectral efficient network and resource selection model in 5G networks
Advanced methodologies resolving dimensionality complications for autism neur...
Per capita expenditure prediction using model stacking based on satellite ima...
Digital-Transformation-Roadmap-for-Companies.pptx
Machine learning based COVID-19 study performance prediction
Review of recent advances in non-invasive hemoglobin estimation

6 data envelopment_analysis

  • 1. 國立臺北護理健康大學 NTUHS Data Envelopment Analysis (DEA) Orozco Hsu 2021-06-10 1
  • 2. About me • Education • NCU (MIS)、NCCU (CS) • Work Experience • Telecom big data Innovation • AI projects • Retail marketing technology • User Group • TW Spark User Group • TW Hadoop User Group • Taiwan Data Engineer Association Director • Research • Big Data/ ML/ AIOT/ AI Columnist 2 「How can you not get romantic about baseball ? 」
  • 3. Tutorial Content 3 Linear Programming (Maximizing Profit) DEA introduction Homework Operation Research
  • 4. Code • Download code • https://github.com/orozcohsu/ntunhs_2020/tree/master/alg_20210610 4
  • 5. Quiz 1 • 1. Operations research is the application of ________methods to arrive at the optimal Solutions to the problems. • A. economical • B. scientific • C. artistic • D. a and b both 5
  • 6. Quiz 2 • 2. Operations research is based upon collected information, knowledge and advanced study of various factors impacting a particular operation. This leads to more informed ________. • A. Management processes • B. Decision making • C. Procedures • D. Machine learning 6
  • 7. Quiz 3 • 3. What is the objective function in linear programming problems? • A. A constraint for available resource • B. An objective for research and development of a company • C. A linear function in an optimization problem • D. A set of non-negativity conditions 7
  • 8. Quiz 4 • About the Data Envelopment Analysis which is TRUE? • A. We can also use regression line to analysis inefficiency DMUs • B. In CCR model of input oriented, that means we want to maximum the outputs • C. Compare with CRS and VRS, CSR has more benchmarks (efficiency score=1) • D. Data envelopment analysis may generate inaccurate efficiency scores, especially when the sample size is small relative to the number of inputs and outputs 8
  • 10. Operation Research • The field of operations research provides QUANTITATIVE methods to solve problems including MAXIMIZE profits or MINIMIAZE losses, investigating the outcomes under fluctuating market conditions, and to facilitate decision making. • A new field which started in the late 1930’s and has grown and expanded tremendously in the last 30 years. 10
  • 12. Operation Research • Operations Research in one word • OPTIMIAZTION • Operations Research in one sentence • DO THE THINGS BEST UNDER CONSTRANITS. 12 https://en.wikipedia.org/wiki/Convex_optimization Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard
  • 13. Operation Research • Three main problem classes in OR • Mathematical programming • Linear Programming with objective function and constraints. • Numerical optimization • Convex Optimization: Gradient-based • Non-gradient: Bayesian, Genetic Algorithms…etc. • Simulation • Used to repeated random sampling and obtains numerical result to approximate a probability distribution , like MCMC (Model Carlo Markov Chain) 13 https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Newton method Tangent line
  • 14. Operation Research • Linear Programming • Translate into a model. • A simple technique where we depict complex relationships through linear functions and then find the optimum points. 14
  • 15. Operation Research • Nonlinear Programming • The process of solving an optimization problem where some of the constraints or the objective function are nonlinear. • Objective function • CONCAVE (maximization problem) • CONVEX (minimization problem) 15 Maximize: f(x) = x1 + x2 Subject to: x1 ≥ 0 x2 ≥ 0 x1 2 + x2 2 ≥ 1 x1 2 + x2 2 ≤ 2
  • 16. Operation Research • The gradient decent optimization • Gradient descent is A FIRST-ORDER iterative optimization algorithm for finding a LOCAL minimum of a differentiable function. • The idea is to take repeated steps in the OPPOSITE DIRECTION of the gradient of the function at the current point, because this is the direction of steepest descent. 16 For bigdata consideration, we used to use mini-batch for gradient decent (SGD). https://towardsdatascience.com/batch-mini-batch-stochastic-gradient-descent-7a62ecba642a
  • 17. Operation Research 17 • Linear Programming • Object function is linear • Non-Linear Programming • Object function is non-linear (convex) • Convex • Global optimization • Non-Convex • Local optimization • Time complexity • Polynomial time • NP complete
  • 18. Operation Research • Monte Carlo: Simulation to draw quantities of interest from the distribution. • Markov Chain: Stochastic process in which future states are independent of past state given the present state. • MCMC: A class of method in which we can simulate draw that are slightly dependent and are approximately from posterior distribution. 18 https://www.youtube.com/watch?v=R9NQY2Hyl14
  • 20. Linear Programming (Maximizing Profit) • Linear Programming (LP) and the Simple algorithm has been around for decades. • It was first introduced in the U.S. Air Force for helping with strategical planning back in the 40s. • Ever since then, many industries are taking advantage of it to maximize profit and minimize cost, among other things. • Often used in process of data-driven decision making. 20 https://arxiv.org/ftp/cs/papers/0611/0611008.pdf *Why LP cannot solve large instances of NP-complete problems in polynomial time? (convex optimization)
  • 21. Linear Programming (Maximizing Profit) • Objective function • Maximize or Minimize • Constrains • linear inequalities Maximize: 3x + 2y Subject to: y < 20 – 6x y < 12 – 2x y < 10 -x 21
  • 22. Linear Programming (Maximizing Profit) • 2D Process: • (K: Capital, L: Labor) • Constrains: • GH • Best solution: • At point E (OJN) • Feasible region reaches the isoquant for 200Q (the highest possible) • Take action: • 200 units of output with process 2 by using 8L and 8K K/L =2 K/L =1 K/L =0.5 200Q = D(6L, 12K), E(8L, 8K), F(12L, 6K) 22
  • 23. Linear Programming (Maximizing Profit) • Mix of constrains • Find out the maximize point 2D line intersects graphic_solution.ipynb 23
  • 24. Linear Programming (Maximizing Profit) • This demonstrates why we don't try to GRAPH the FEASIBLE REGION when there are more than two decision variables. • Three dimensional graphs aren't that easy to draw and you can forget about making the sketch when there are four or more decision variables. 24
  • 25. Linear Programming (Maximizing Profit) • Problem we need to solve • The company produces four furniture items: chairs, tables, desks, and bookcases. • By improving the operations of the firm and its resources allocation, we can potentially maximize the profit 5 board 10 man-hours 3 ounces 4 square fee of leather 25
  • 26. Linear Programming (Maximizing Profit) • The bottom neck is that all these material have the following total quantities available, every week • Our task is to decide how to better allocate these resources together in order to make the most profit. Material item quantities Board 20,000 Man-hours 4,000 Ounces of glue 2,000 Square feet of leather 3,000 Square feet of glass 500 Product Profit($) Chair 45 Table 80 Desk 110 Bookcase 55 26
  • 27. Linear Programming (Maximizing Profit) • This is the Maximization LP (Linear Programming) problem • c for chair, t for table, d for desk, and b for bookcase constraints resources available 27
  • 28. Linear Programming (Maximizing Profit) • Optimal profit •$20,857 • Chair •57 • Table •228 28 Activity-Analysis-Problem.ipynb
  • 30. DEA introduction • Question: • How to evaluate the efficiency of those 3 hospitals? • Answer: • You may refer to many indicators such like Occupancy Rate, Average Length of Stay, Days of Stay…etc. • Each hospital has different indicators, but how to evaluate? • Use scatter plot that can help, but only works with maximum 3 indicators in visualization. 30 Decision Making Unit (DMU): It represents the basic unit of analysis. In this case, 10 hospitals equals to 10 DMU.
  • 31. DEA introduction • Data Envelopment Analysis (DEA) • Consider multiple efficiency indicators, that can analyze hospital efficiency (input) and hospital service quality (output) • Use efficiency score as a key indicator to evaluation 31 DMU X1 (Labor) X2 (Medical Devices) Y1 (Operation) Y2 (Occupancy Rate) Hospital A 50 60 40 30 Hospital B 75 95 55 65 Hospital C 100 120 150 130 Input Output variable Input/ Output variables are no limit. If there are more variables, the number of DMUs should be more only then can the analysis result be valid. DMU ≥ Max{a*b ; 3*(a+b)} => a: input variables; b: output variables
  • 32. DEA introduction • Solve the equations (u1, u2, v1, v2) • Hospital A: (u1*40+u2*30)/(v1*50+v2*60) ≤ 1 • Hospital B: (u1*55+u2*65)/(v1*75+v2*95) ≤ 1 • Hospital C: (u1*150+u2*130)/(v1*100+v2*120) ≤ 1 • u1, u2, v1, v2 >=0 • Calculating Hospital A efficiency score • Put u1, u2, v1, v2 into (u1*40+u2*30)/(v1*50+v2*60) • There will be at least one (or more than one) hospital efficiency score= 1 32
  • 33. DEA introduction • DEA has two main oriented method • Constant return to scale (CRS): When adding one point of resource input will bring one point of output, then the relationship between input and output variables is a fixed return. • Variable return to scale (VRS): If one point of investment produces more than one point of results, or one point of investment is less than one point of results. 33 DMU Efficiency score Hospital A 0.93 Hospital B 1.00 Hospital C 1.00 Hospital D 0.92 A B C D (inefficient DMU) CSR Frontier
  • 34. DEA introduction • DEA • It only gives you relative efficiencies - efficiencies relative to the data considered. It does not, and cannot, give you absolute efficiencies. • DEA with supervise learning • Without any hypothesis (ex. data from normal distribution) • Without any handling with outlier data • Without any feature selection process 34
  • 35. DEA introduction • CCR model: • The model was first introduced in 1978 by Charnes, Cooper and Rhodes, who formulated it as a linear program. • It assumes that constant returns to scale (CRS) hold at the efficient frontier.
  • 36. DEA introduction • BCC model: • The model is based on the VRS technology assumption and measures pure technical efficiency. • Source: https://www.researchgate.net/publication/333403650_Use_of_data_envelopment_analysis_to_benchmark_environmental_product_declarations-a_suggested_framework
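The difference between the CCR and BCC models is easiest to see in the standard input-oriented envelopment form. The block below is a sketch for DMU k. The symbols θ (the efficiency score) and λ_j (the weight placed on DMU j as a benchmark) are notation introduced here and do not appear on the slides; x and y follow the slides' indexing (i for inputs, r for outputs).

```latex
\begin{aligned}
\min_{\theta,\,\lambda} \quad & \theta \\
\text{subject to} \quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{ik}, \quad i = 1, \dots, m \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rk}, \quad r = 1, \dots, s \\
  & \lambda_j \ge 0, \quad j = 1, \dots, n
  && \text{(CCR, constant returns to scale)} \\
  & \sum_{j=1}^{n} \lambda_j = 1
  && \text{(added for BCC, variable returns to scale)}
\end{aligned}
```

The extra convexity constraint shrinks the feasible region, so a DMU's BCC score is never lower than its CCR score.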
  • 37. DEA introduction • We will start DEA from this table (we use the CCR assumption). • Each value is divided by its column norm (bottom row), the square root of the sum of squared values in that column: \sqrt{\sum_{j=1}^{n} X_{tj}^2}
    DMU              Price or Cost             Storage Space          Camera Quality         Looks
                     (Non-Beneficial) Input    (Beneficial) Output    (Beneficial) Output    (Beneficial) Output
    Mobile1 (DMU1)   250/527.97=0.4735         16/50.6=0.3162         12/22.98=0.5222        4/8.79=0.4551
    Mobile2 (DMU2)   225/527.97=0.4262         16/50.6=0.3162         8/22.98=0.3482         5/8.79=0.5689
    Mobile3 (DMU3)   300/527.97=0.5682         32/50.6=0.6325         16/22.98=0.6963        4.5/8.79=0.512
    Mobile4 (DMU4)   275/527.97=0.5209         32/50.6=0.6325         8/22.98=0.3482         4/8.79=0.4551
    Column norm      527.97                    50.6                   22.98                  8.79
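The normalization in this table can be reproduced in a few lines. This sketch assumes the raw values are the numerators shown in the table and that each column is divided by the square root of its sum of squares, which matches the denominators 527.97, 50.6, 22.98 and 8.79. The variable names are illustrative.

```python
import math

# Raw values per DMU: [price, storage space, camera quality, looks].
raw = {
    "Mobile1": [250, 16, 12, 4],
    "Mobile2": [225, 16, 8, 5],
    "Mobile3": [300, 32, 16, 4.5],
    "Mobile4": [275, 32, 8, 4],
}

num_columns = 4
# Column norm: square root of the sum of squared values in each column.
norms = [
    math.sqrt(sum(row[c] ** 2 for row in raw.values()))
    for c in range(num_columns)
]

# Divide every value by the norm of its column.
normalized = {
    name: [value / norm for value, norm in zip(row, norms)]
    for name, row in raw.items()
}

print("column norms:", [round(n, 2) for n in norms])
for name, row in normalized.items():
    print(name, [round(v, 4) for v in row])
```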
  • 38. DEA introduction • Formulating the problem • n: number of DMUs (4) • m: number of input criteria (1) • s: number of output criteria (3) • x_ik and y_rk denote the values of the ith input criterion and the rth output criterion for the kth DMU • u_r and v_i are the non-negative variable weights to be determined by solving the minimization problem
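The general formula on this slide did not survive extraction. The block below is a reconstruction inferred from the four numeric problems g1 to g4 on the following slides, using the symbols defined above; treat it as a sketch rather than the slide's exact wording.

```latex
\begin{aligned}
g_k = \min_{u,\,v} \quad & \sum_{i=1}^{m} v_i x_{ik} \\
\text{subject to} \quad
  & \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r y_{rj} \ge 0, \quad j = 1, \dots, n \\
  & \sum_{r=1}^{s} u_r y_{rk} = 1 \\
  & u_r \ge 0, \quad v_i \ge 0
\end{aligned}
```

One such problem is solved for each DMU k, which is why the example with n = 4 needs g1, g2, g3 and g4.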
  • 39. DEA introduction • Formulating the problem (DMU1) • We need to calculate g1, g2, g3, g4
    g1 = min(0.4735V1)
    Subject to
      -0.3162u1 - 0.5222u2 - 0.4551u3 + 0.4735V1 >= 0
      -0.3162u1 - 0.3482u2 - 0.5689u3 + 0.4262V1 >= 0
      -0.6325u1 - 0.6963u2 - 0.5120u3 + 0.5682V1 >= 0
      -0.6325u1 - 0.3482u2 - 0.4551u3 + 0.5209V1 >= 0
      0.3162u1 + 0.5222u2 + 0.4551u3 = 1
      u1, u2, u3, V1 >= 0
  • 40. DEA introduction • Formulating the problem (DMU2) • We need to calculate g1, g2, g3, g4
    g2 = min(0.4262V1)
    Subject to
      -0.3162u1 - 0.5222u2 - 0.4551u3 + 0.4735V1 >= 0
      -0.3162u1 - 0.3482u2 - 0.5689u3 + 0.4262V1 >= 0
      -0.6325u1 - 0.6963u2 - 0.5120u3 + 0.5682V1 >= 0
      -0.6325u1 - 0.3482u2 - 0.4551u3 + 0.5209V1 >= 0
      0.3162u1 + 0.3482u2 + 0.5689u3 = 1
      u1, u2, u3, V1 >= 0
  • 41. DEA introduction • Formulating the problem (DMU3) • We need to calculate g1, g2, g3, g4
    g3 = min(0.5682V1)
    Subject to
      -0.3162u1 - 0.5222u2 - 0.4551u3 + 0.4735V1 >= 0
      -0.3162u1 - 0.3482u2 - 0.5689u3 + 0.4262V1 >= 0
      -0.6325u1 - 0.6963u2 - 0.5120u3 + 0.5682V1 >= 0
      -0.6325u1 - 0.3482u2 - 0.4551u3 + 0.5209V1 >= 0
      0.6325u1 + 0.6963u2 + 0.5120u3 = 1
      u1, u2, u3, V1 >= 0
  • 42. DEA introduction • Formulating the problem (DMU4) • We need to calculate g1, g2, g3, g4
    g4 = min(0.5209V1)
    Subject to
      -0.3162u1 - 0.5222u2 - 0.4551u3 + 0.4735V1 >= 0
      -0.3162u1 - 0.3482u2 - 0.5689u3 + 0.4262V1 >= 0
      -0.6325u1 - 0.6963u2 - 0.5120u3 + 0.5682V1 >= 0
      -0.6325u1 - 0.3482u2 - 0.4551u3 + 0.5209V1 >= 0
      0.6325u1 + 0.3482u2 + 0.4551u3 = 1
      u1, u2, u3, V1 >= 0
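The four problems g1 to g4 are small enough to solve without an LP library. The sketch below is not the pyDEA method used in class: it swaps in brute-force vertex enumeration, which only suits toy problems like this one. It relies on the fact that the optimum of a bounded linear program lies at a vertex, where the equality constraint and three of the inequality constraints hold with equality. The helper names `solve_linear` and `ccr_score` are hypothetical. The efficiency score is assumed to be 1/g_k, which appears consistent with the scores reported later in the deck.

```python
from itertools import combinations

# Normalized data from the table: one input (price) and three outputs per DMU.
X = [0.4735, 0.4262, 0.5682, 0.5209]
Y = [
    [0.3162, 0.5222, 0.4551],
    [0.3162, 0.3482, 0.5689],
    [0.6325, 0.6963, 0.5120],
    [0.6325, 0.3482, 0.4551],
]


def solve_linear(a, b):
    """Solve a square system by Gauss-Jordan elimination; None if singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [p - f * q for p, q in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ccr_score(k):
    """Return (g_k, 1/g_k) for DMU k; variables are ordered (u1, u2, u3, V1)."""
    # Every inequality is stored as a row with row . z >= 0.
    ineq = [[-y for y in Y[j]] + [X[j]] for j in range(len(X))]
    ineq += [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
    eq = Y[k] + [0.0]  # normalization: u . y_k = 1
    best = None
    for active in combinations(ineq, 3):
        z = solve_linear([eq] + list(active), [1.0, 0.0, 0.0, 0.0])
        if z is None:
            continue
        # Keep the candidate vertex only if it satisfies every inequality.
        if all(sum(c * zi for c, zi in zip(row, z)) >= -1e-9 for row in ineq):
            g = X[k] * z[3]
            if best is None or g < best:
                best = g
    return best, 1.0 / best


for k in range(4):
    g, efficiency = ccr_score(k)
    print(f"DMU{k + 1}: g = {g:.4f}, efficiency = {efficiency:.4f}")
```

For larger data sets a proper LP solver is the better choice, since the number of candidate vertices grows combinatorially.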
  • 43. DEA introduction • Practice DEA • Go to the pyDEA folder • Type python -m pyDEA.main_gui to run it
  • 45. DEA introduction
    DMU              Efficiency Score
    Mobile1 (DMU1)   0.9682
    Mobile2 (DMU2)   1
    Mobile3 (DMU3)   1
    Mobile4 (DMU4)   1
  • 46. DEA introduction • How can we improve the performance of an inefficient DMU? (CCR model) • Input oriented: minimize the level of inputs while holding the level of outputs fixed. • Output oriented: maximize the level of outputs while holding the level of inputs fixed. • [Figure: projection to the frontier]
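Under the CCR model, the radial part of the projection follows directly from the efficiency score: an input-oriented target scales the inputs by the score, and an output-oriented target divides the outputs by it. The sketch below applies this to Mobile1 using its reported score of 0.9682 and its raw values. It assumes a purely radial projection and ignores any remaining slacks, so the targets are approximate; the function names are hypothetical.

```python
def input_oriented_target(inputs, efficiency):
    """Shrink every input by the efficiency score, keeping outputs fixed."""
    return [x * efficiency for x in inputs]


def output_oriented_target(outputs, efficiency):
    """Expand every output by 1/efficiency, keeping inputs fixed."""
    return [y / efficiency for y in outputs]


# Mobile1 (DMU1): raw price, and raw storage / camera quality / looks.
efficiency = 0.9682
inputs = [250]
outputs = [16, 12, 4]

target_inputs = input_oriented_target(inputs, efficiency)
target_outputs = output_oriented_target(outputs, efficiency)

print("input-oriented target price:", round(target_inputs[0], 2))
print("output-oriented target outputs:", [round(y, 2) for y in target_outputs])
```

Read this way, Mobile1 would reach the frontier either by offering the same features at a lower price or by offering proportionally better features at the same price.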
  • 47. DEA introduction • DEA and regression • DMUs on the regression line have average efficiency. • Regression can accommodate multiple inputs or multiple outputs, but not both. • Regression provides only average relationships, not best practice.
  • 50. Homework • Analyze a simple DEA problem; here is our data set • Graphic analysis • What are the efficiencies of the DMUs?