Project Analytics
© Dan Trietsch
June 2019
Reference
Kenneth R. Baker and Dan Trietsch (2019),
Principles of Sequencing and Scheduling,
2nd edition, Wiley. (Ch. 18 and Appendix
A, with earlier work cited within.)
Summary
• PERT is useless because it rests on two
invalid assumptions: that the triplet
method works, AND that activity times
are statistically independent.
• A meritorious recommendation—to
calibrate—has been practiced in the
breach, AND would not be sufficient.
Summary
• We’ll validate that the lognormal
distribution works for projects.
• By cross-validation, we’ll show that
linear association—perhaps the simplest
model of statistical dependence—works
for projects, and makes possible
prediction for future projects.
Summary
• While doing that, we will show that
subjective estimates are inferior, and
that the independence assumption is
indefensible.
• And we will show that calibration helps
but not enough: it downplays the effects
of between-projects variation.
Summary
• Among many other future opportunities
now open, first and foremost is software
development.
Summary
• Mainly for theoreticians, we need
dedicated software for specialized
statistical tests. That will support further
research in Project Analytics.
Summary
• For practitioners, I believe we have
enough results already to support the
incorporation of software for Project
Analytics within project management
DSSs.
Summary
• Some of the practically-oriented software
can also support further research,
especially in partitioning.
Preliminary Notes
This talk describes the state of the art of
Project Analytics. As such, it is also an
invitation to push said state of the art
forward.
Preliminary Notes
As the title suggests, we’ll focus on analytics.
But the context is Lognormal Scheduling for
PERT 21. Lognormal Scheduling is Safe
Scheduling under the assumption processing
times are lognormal. Safe Scheduling takes
safety time into account explicitly. (If I could,
I would have called it “Robust Scheduling,”
but that term had been hijacked by then.)
Preliminary Notes
Largely because it incorporates Lognormal
Scheduling, PERT 21 is a version of PERT
that does NOT rely on the beta distribution (as
well as other invalid assumptions), considers
restricted capacity (by sequencing), and
provides optimal safety time (by scheduling
release dates).
Preliminary Notes
Think about it:
NOBODY EVER VALIDATED PERT
And incidentally, nobody ever validated CC
(or the so called “Theory” of Constraints)
either.
If I have time, I’ll show that they are both
invalid.
Preliminary Notes
The next two slides are the title and the first
slide of a talk I presented at Tel Aviv
University and at the Technion (Haifa) in
2005. A revised version was presented at
Leuven in 2007 (with the same two slides
at its top).
Stochastic Economic Balance
Principles for Projects Scheduling
Feeding buffers, Crashing and
Sequencing
Stochastic Economic Balance
• Any practicable and robust project scheduling
and control framework should:
• (1) be based on sound theory;
• (2) yield economical results;
• (3) be intuitively acceptable to decision
makers;
• (4) require modest information inputs from
users and provide easy-to-utilize outputs.
Preliminary Notes
In particular, a stochastic ‘engine’ must satisfy
(1) and (4): Be valid and easy to use. But an
engine can be ‘easy to use’ and yet powerful:
we can and should employ powerful models
within DSSs. (3)—intuitive acceptance—does
not require that the user be able to do the job
without (possibly complex) software.
Preliminary Notes
In particular, a stochastic ‘engine’ must satisfy
(1) and (4): Be valid and easy to use.
The simplest possible input requirement is a
point estimate for each activity (as in CC). We
use only that plus relevant history. (Such
history is always available, even if it is
apparently missing. Ask me later.)
Preliminary Notes
Simplicity aside, one can think about PERT
21 as a version of PERT/CPM that requires
validated theory. The validation is what
Project Analytics is all about. That is, PERT
21 would not exist without Project
Analytics.
Some general observations
To elaborate, the need to avoid GIGO
(garbage in, garbage out) applies to any
operations research model, not just to
simulation models.
Some general observations
By definition, using a model without valid
data is invalid. But for some reason this
message has typically been ignored or
practiced in the breach.
Some general observations
Where money is clearly involved, valid
data is more likely to be used, to wit,
Financial Engineering. But are we there
with respect to project scheduling and
budgeting?
Some general observations
Project Analytics is necessary because the
answer is, unfortunately, no! So the crux is
to validate assumptions and obtain valid
data for project scheduling and budgeting.
(At present, however, we have much more
evidence regarding scheduling.)
Some general observations
Validation is akin to empirical science: you
start with a hypothesis (theory) and apply
trial and error.
This talk covers the major developments so
far, including (but not limited to) work at
American University of Armenia and at
Ghent University.
Some general observations
Another point of view:
Models are never perfect:
(1) they do not fit reality perfectly
(including imprecision and the use of
substitute objectives), and
(2) they may not have tractable analytic
solutions.
Some general observations
When we write a mathematical paper we
can ignore the fit issue, and focus on
achieving the best possible analytic
solution. Regarding fit, all we have to do is
assume it is good.
Some general observations
However, scheduling is an engineering
subject. At least it should be treated as an
engineering subject. (And even if you don’t
agree, that’s our assumption today.)
Some general observations
Stereotypically, mathematicians require
analytic solutions but may be willing to
accept unrealistic models. Engineers accept
approximate solutions but require realistic
models. A good engineering model
achieves the right balance between fit and
solution quality.
Some general observations
Project Analytics involves approximations
but for more realistic models than those
published before.
Some history
As a practicing engineer, long ago, I had
access to project scheduling data but no
motivation to use it. Later, I became an
academic and I acquired the motivation, but
no longer had access! Field data is in short
supply for academics (unless their
consulting is relevant to their research).
Some history
Then I had the opportunity to supervise a
masters student—Lilit Gevorgyan—who
had access to two project managers. For her
thesis, she collected data on 5 + 9 projects,
and her task was to check whether the
lognormal distribution would provide a
good fit.
Some history
But Lilit decided to use a statistical
package to find the best fit. Four or five
distributions looked okay. The lognormal
was second or third in the list. Accordingly,
she suggested we consider the first one in
the list first.
Some history
Nonetheless, I stuck to the lognormal.
Why? Because it was the only one that
made sense theoretically (which is why it
was hypothesized to fit in the first place).
Some history
Letting a small sample steer you away from
a hypothesis even though it does not
contradict it is the wrong approach. Of
course one can check new hypotheses in
the future. And if the hypothesis were
rejected, our next step would be to
construct a new or corrected hypothesis.
Some history
To elaborate, as taught by Deming, we
must start with a hypothesis. Otherwise,
random data will drive us all over the place.
Until a hypothesis is rejected, there is no
compelling need to look for a different one.
Some history
Regarding the lognormal hypothesis in
particular, a decade later, it has not yet been
rejected (although corrections in the model
did take place). Nobody claims that it is
necessarily the only valid distribution we
could use, just that it is valid. It also ticks
several theoretical boxes….
Theory/Hypothesis
The most important engineering deviations
from plan are not additive, but proportional.
Theory/Hypothesis
With that in mind, if a positive deviation
may be 100%, that does not mean that a
negative deviation is –100%.
Ergo, don’t say 0.0 to 2.0, say 0.5 to 2.0.
Theory/Hypothesis
For instance, suppose our estimate is 0.5 but
we err by ‘–100%,’ then the true value is
0.5 x 2.0 = 1.
Likewise, if the estimate is 2.0 but we err
by ‘+100%,’ then the true value is 2.0 x 0.5 = 1.
Theory/Hypothesis
Suppose we have an expected error of 0.
Then our range is from 1 to 1 (that is, [1–0]
to [1+0]). By contrast, if our expected error
is ∞ then our range is from ‘1/∞’ = 0 to ∞;
so 0 corresponds to –∞, 1 corresponds to 0,
and ∞ corresponds to ∞.
Theory/Hypothesis
Translating {0, 1, ∞} to {–∞, 0, ∞} ‘begs’
for the logarithmic transformation!
(Of course, some of us are so ‘addicted’ to
the additive model that the fact that
logarithms of products are additive would
be sufficient to convince us that this
particular transformation is the one.)
Theory/Hypothesis
IF the post-transformation distribution is
normal, then the original distribution must
be lognormal. So that’s the hypothesis we
start with.
Theory/Hypothesis
Hypothesis: Activity Times and processing
times follow the lognormal distribution.
Theory/Hypothesis
Hypothesis: Activity Times and processing
times follow the lognormal distribution.
And because deviations are proportional,
the lognormal distribution would apply to
ratios.
Theory/Hypothesis
The lognormal is especially attractive for
four reasons at least: (1) it is strictly
positive (w.p.1), (2) its cv (coefficient of variation) is not restricted,
(3) it can approximate sums of positive
random variables (using the Lognormal
Sum Approximation), and (4) it can
represent the relationship between capacity
and activity time (for stochastic crashing).
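To make point (3) concrete, here is a minimal Python sketch of the Lognormal Sum Approximation as I understand it from the reference above: match the mean and variance of a sum of independent lognormals (the Fenton-Wilkinson approximation) and back out the parameters of a single approximating lognormal. Under linear association the variance of the sum would also include covariance terms; this sketch assumes independent components, and the parameter values are illustrative.

```python
import numpy as np

def lognormal_sum_approx(ms, ss):
    """Approximate the sum of independent lognormals LN(m_i, s_i^2) by a
    single lognormal, matching the mean and variance of the sum."""
    ms, ss = np.asarray(ms, float), np.asarray(ss, float)
    means = np.exp(ms + ss**2 / 2)               # E[X_i]
    varis = (np.exp(ss**2) - 1.0) * means**2     # Var[X_i]
    mean_sum, var_sum = means.sum(), varis.sum()
    s2 = np.log(1.0 + var_sum / mean_sum**2)     # matched log-variance
    m = np.log(mean_sum) - s2 / 2.0              # matched log-mean
    return m, np.sqrt(s2)

# quick sanity check against simulation (illustrative parameters)
rng = np.random.default_rng(1)
ms, ss = [0.0, 0.2, -0.1], [0.3, 0.5, 0.4]
sample = sum(rng.lognormal(m, s, 100_000) for m, s in zip(ms, ss))
m_hat, s_hat = lognormal_sum_approx(ms, ss)
print(sample.mean(), np.exp(m_hat + s_hat**2 / 2))   # the two means should agree
```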
Lognormal Scheduling
[Figure: lognormal densities f(x) plotted against x for sigma = 0.25, 0.5, 1, and 2, with the exponential (exp) shown for comparison.]
Drilling times in minutes (Clyde Dam, NZ)
[Figure: histogram of drilling times in minutes.]
Some Skewness Implications
The last column indicates the probability of
falling below the mode. Compare it to the third
column. And note: SL > 0.5 does not guarantee
that the mean will be supported!
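The table behind this slide is not reproduced here, but the quantities it compares can be recomputed from the lognormal parameters. A small sketch, using the standard lognormal facts that the mode is exp(m − s²) and the mean is exp(m + s²/2), so P(X < mode) = Φ(−s) < 0.5 and P(X < mean) = Φ(s/2) > 0.5; in other words, a service level (SL) barely above 0.5 can still fall short of the mean.

```python
from math import exp
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF

def lognormal_landmarks(m, s):
    """Mode, median, and mean of LN(m, s^2), plus the probability of
    falling below the mode and below the mean."""
    return {
        "mode": exp(m - s**2),
        "median": exp(m),
        "mean": exp(m + s**2 / 2),
        "P(X < mode)": Phi(-s),     # always below 0.5
        "P(X < mean)": Phi(s / 2),  # always above 0.5: SL must exceed this
    }                               # for the planned time to cover the mean

for s in (0.25, 0.5, 1.0, 2.0):
    print(s, lognormal_landmarks(0.0, s))
```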
Theory/Hypothesis
BTW, the ubiquitous variance statistic is
implicitly much more suited to the additive
world than it is for the multiplicative world.
Theory/Hypothesis
For instance, an erroneous (but frequent)
assumption is that a deviation that exceeds
3σ is out of control (As it would be for the
normal r.v.). As a result, an “in control”
deviation of the core normal r.v. within the
lognormal can often be misinterpreted as an
“out of control” signal! To continue,
Theory/Hypothesis
Concerning ratios, if pj denotes the actual
processing time of activity j and ej is its point
estimate (the planned time), we hypothesize
that ln(pj/ej) is normal.
Theory/Hypothesis
But first it is highly recommended to make
sure that activities are of similar planned
duration, with a ratio of 4-5 at most. That
may necessitate a partition.
The following example is from a
construction project in Armenia, and the
first step is to partition activities by size:
ln(pj/ej) as a function of planned duration
[Figure: scatter of ln(pj/ej) against planned duration (log scale: 1, 2, 4, 8, 16, 32, 64); construction project in Armenia.]
On Rounding
Three parts are required. As a rule, each is then
analyzed separately.
In particular, there are 61 activities (or ‘points’)
with planned durations between 4 and 15, but
only 23 are visible. That implies that there are
multiple points that ‘cover’ each other.
This could not happen without rounding. But the
question is what to do about it.
On Rounding
The next slide shows a Q-Q chart where such
duplicate points receive consecutive (but
different) Blom scores. (Blom’s Scores were
developed for the normal distribution. The kth
smallest sample value receives the score
zk = Φ−1[(k − 0.375) / (n + 0.25)],
where n is the sample size.)
Q-Q chart of 61 activities with planned durations
between 4 and 15
[Figure: Q-Q chart; fitted line y = 0.3718x − 0.1619, R² = 0.976.]
On Rounding
Let m and s be the mean and the standard
deviation of lnX, where X is a lognormal random
variable. In such a Q-Q chart, the intercept
provides an estimate of m and the slope estimates
s; in particular, above we have m = −0.1619 and
s = 0.3718. R = 0.9879 (the square root of
0.976). R can be compared to tabulated values to
test for normality. Indeed, normality cannot be
rejected here: the p-value is almost 0.25 (> 0.05).
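A minimal sketch of the Q-Q construction described on the last two slides: compute Blom scores for the sorted log-ratios, regress the log-ratios on the scores, and read off m (intercept), s (slope), and R. The scipy calls are standard; the critical values for the R-based normality test are tabulated elsewhere and not included here, and the data below are simulated, not the Armenian project.

```python
import numpy as np
from scipy import stats

def blom_scores(n):
    """Blom scores z_k = Phi^-1((k - 0.375) / (n + 0.25)), k = 1..n."""
    k = np.arange(1, n + 1)
    return stats.norm.ppf((k - 0.375) / (n + 0.25))

def qq_fit(log_ratios):
    """Regress sorted ln(p_j/e_j) on Blom scores; return (m_hat, s_hat, R)."""
    y = np.sort(np.asarray(log_ratios, float))
    slope, intercept, r, _, _ = stats.linregress(blom_scores(len(y)), y)
    return intercept, slope, r    # compare R with tabulated critical values

# toy usage with simulated (not real) ratios
rng = np.random.default_rng(0)
ratios = rng.lognormal(mean=-0.16, sigma=0.37, size=61)
print(qq_fit(np.log(ratios)))
```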
On Rounding
The left hand side of the following slide is
identical to the previous slide, but the right hand
side shows a Q-Q chart where such duplicate
points receive an appropriate average score
instead; that is, we take the average of all
duplicated points as the score that applies to all
of them.
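A sketch of the rounding correction just described: ties first receive consecutive Blom scores, and then each group of tied values is assigned the average of its scores. The helper name is mine.

```python
import numpy as np
from scipy import stats

def blom_scores_tie_averaged(log_ratios):
    """Sorted log-ratios with Blom scores in which tied (duplicated) values
    share the average of their consecutive scores."""
    y = np.sort(np.asarray(log_ratios, float))
    n = len(y)
    z = stats.norm.ppf((np.arange(1, n + 1) - 0.375) / (n + 0.25))
    for value in np.unique(y):
        idx = np.where(y == value)[0]
        z[idx] = z[idx].mean()      # one common score per group of ties
    return z, y                      # regress y on z exactly as before
```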
Q-Q charts before and after correcting for rounding
[Figure: left (before): y = 0.3718x − 0.1619, R² = 0.976; right (after): y = 0.3768x − 0.1619, R² = 0.9892.]
On Rounding
In this case, the p-value increases to slightly over
0.5, but we didn’t need that to avoid erroneous
rejection. Often, however, this simple rounding
correction spells the difference between
accepting and rejecting the hypothesis that the
distribution is normal.
On Rounding
Royston (1993) proposed averaging the indices
of the tied values to calculate a single Blom score,
with the average index replacing k. However,
experimentation revealed that the previous way
(averaging the scores) yields slightly higher R2
values. On the one hand, my sample was small,
so it may have been a fluke. On the other hand, I
checked because I suspected (i.e., hypothesized)
in advance that this would be the case.
The Parkinson Distribution
Parkinson’s Law states that work expands to fill
out the time allotted to it (q). Empirically,
however, it may just be that work that finishes
early is REPORTED on time. Therefore, the time
we OBSERVE, X, is:
X = max{q, Y}, where q is the allotted time and
Y is the real underlying random variable.
The Parkinson Distribution
We introduced this distribution in Baker and
Trietsch (2009), Principles of Sequencing and
Scheduling. But empirical evidence collected
later and reported in Trietsch et al. (2012),
suggests that sometimes some early points are
reported correctly, whereas others are reported
‘on time.’
The Parkinson Distribution
If some early points are reported correctly,
whereas others are reported ‘on time,’ then the
general form of the Parkinson distribution
assumes early points are reported on-time with a
probability PP, and reported correctly otherwise.
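A hedged sketch of sampling from this generalized Parkinson distribution with a lognormal core: if the true duration Y falls below the allotted time q, it is reported as q with probability PP and reported correctly otherwise. The parameter values are illustrative, and rounding is ignored here.

```python
import numpy as np

def sample_parkinson(q, m, s, p_parkinson, rng):
    """One observation from the generalized Parkinson distribution with a
    lognormal core LN(m, s^2): if the true duration Y is below the allotted
    time q, report q with probability p_parkinson, otherwise report Y."""
    y = rng.lognormal(m, s)
    if y >= q:
        return y
    return q if rng.random() < p_parkinson else y

# illustrative parameters only
rng = np.random.default_rng(2)
obs = [sample_parkinson(q=1.0, m=0.0, s=0.4, p_parkinson=0.6, rng=rng)
       for _ in range(1000)]
print(sum(x == 1.0 for x in obs), "observations reported exactly on time")
```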
The Parkinson Distribution
If some early points are reported correctly,
whereas others are reported ‘on time,’ then the
general form of the Parkinson distribution
assumes early points are reported on-time with a
probability PP, and reported correctly otherwise.
How would that look on a Q-Q chart?
Eurasia Foundation data, w/ and w/o rounding
[Figure: Q-Q charts; left: y = 0.5664x + 0.2985, R² = 0.9105; right: y = 0.5961x + 0.2985, R² = 0.9581.]
Now assume that 9 points are Parkinsonian
[Figure: Q-Q charts; left: y = 0.7469x + 0.1474, R² = 0.9665; right: y = 0.7592x + 0.1436, R² = 0.9823.]
The Parkinson Distribution
Compare the parameters, such as slope
(higher, why?), the horizontal range
(indicating Blom’s scores, an issue we’ll
return to), and R2 (higher, why?).
The Parkinson Distribution
BTW, the late Eliyahu Goldratt, a highly
intuitive thinker, opined that what we ascribe
to the Parkinson effect is actually due to what
he called the “Student’s Syndrome.” But if
that were the case the tails would not likely
have been that straight (because, likely, we’d
have been dealing with a mixture of
distributions based on the allowed lead time).
The Parkinson Distribution
Altogether, there are 15 points reported on-time,
and yet the figure assumes only 9 Parkinsonian
points. The issue is that when a point is reported
on-time, we can’t tell if it finished approximately
on-time and then rounded (so it should be treated
as ‘true’) or finished early and then reported on-
time. The figure shows the case where 6 points
are assumed to be approximately on-time. But
how do we decide that it should be 9 + 6?
The Parkinson Distribution
But how do we decide that it should be 9 + 6?
First, notice that we cannot prove that any
number we might select is correct. Validation
only shows that the lognormal Parkinson
distribution can work with 9 Parkinsonian
points. It cannot prove that no other number
would also work, nor can it tell us which number is correct.
The Parkinson Distribution
First, notice that we cannot prove that any
number we might select is correct.
Selecting the number of Parkinsonian points is
based on an obvious heuristic: minimize SEY (the
regression standard error).
The Parkinson Distribution
Selecting the number of Parkinsonian points is
based on an obvious heuristic: minimize SEY (the
regression standard error).
Notice that this is not equivalent to maximizing
R2. (It does so approximately, however.)
The Parkinson Distribution
Notice that this is not equivalent to maximizing
R2. (It does so approximately, however.)
For multiple regression, minimizing SEY
maximizes the adjusted R2. For that reason, I
initially thought it was equivalent; however,
Jordy Batselier and Mario Vanhoucke
demonstrated to me that I was wrong.
The Parkinson Distribution
When we use the SEY–minimization heuristic to
decide how many points should be considered
Parkinsonian, however, we also have to adjust
the slope of the early points accordingly. That is
why the horizontal scale was different on the left.
However, we will revisit this issue once again,
still later.
The Parkinson Distribution
It is certainly possible that the reason the
lognormal has not been validated for projects
earlier is the combination of rounding and the
Parkinson effect: both mask the true distribution.
The Parkinson Distribution
It is certainly possible that the reason the
lognormal has not been validated for projects
earlier is the combination of rounding and the
Parkinson effect: both mask the true distribution.
To wit, Trietsch et al. (2012) tested a handful of
project data collected from various sources and
all of them yielded acceptable Q-Q charts once
rounding and the Parkinson effect were
considered. (Without the SEY heuristic, BTW.)
The Parkinson Distribution
Colin and Vanhoucke (2015) tested the results of
Trietsch et al. (2012) on a dataset collected by
Batselier and Vanhoucke (2015), and in some
instances lognormality was not supported.
Resolving those is our next topic.
Mixtures and Partitioning
We have already mentioned partitioning briefly
when we discussed the need to keep planned
durations within a ratio of up to 4 or 5 (that is,
the maximum divided by the minimum should
not exceed 4, preferably, or 5, if necessary).
Mixtures and Partitioning
It turned out that although Trietsch et al. (2012)
briefly mentioned the need for partition to meet
the ratio requirement between the maximal and
minimal planned duration, they (i.e., we) failed
to actually do it! As a result, Colin and
Vanhoucke (2015)—who followed the same
procedure—did not do it either.
Mixtures and Partitioning
Indeed, analysis of the offending instances in
Colin and Vanhoucke (2015) revealed that they
involved outliers or mixtures. Most of them were
resolved by the basic partition. A few required
removal of clear outliers (always a potential
issue—not everything under the sun is in
statistical control).
Mixtures and Partitioning
Sometimes, however, just ensuring a ratio of up
to 4 or 5 is not enough. For instance, in
construction, some activities are sensitive to the
weather and others are not. Such instances
should not be addressed as outliers. Instead, we
have to find a way to partition, either in advance
based on project information or during post-
mortem analysis (to learn lessons for the future).
Project C2014-03, ej = 1, with the on time activities
included and after correcting for the Parkinson effect
Project C2014-03, ej = 1, but partitioned instead of
assuming the Parkinson effect
Project Analytics
Mixtures and Partitioning
The current state of the art for post-mortem
partition is essentially by the SEY heuristic. The
idea is that if we pool all points together, SEY
will be excessive. If so, it is highly likely that we
can find at least one point whose removal will
reduce SEY. Now just repeat until no such
reductions are possible. The remaining points
form one part. Next, repeat from the start for all
rejected points. (Usually, there is no need to
repeat more than once.)
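A rough sketch of that greedy procedure, reusing a Blom-score regression like the one sketched earlier; SEY is the regression standard error of the Q-Q fit. The min_size guard and the one-point-at-a-time search are my own guardrails, not part of a published specification.

```python
import numpy as np
from scipy import stats

def sey(values):
    """Regression standard error of a Q-Q fit of sorted values on Blom scores."""
    y = np.sort(np.asarray(values, float))
    n = len(y)
    z = stats.norm.ppf((np.arange(1, n + 1) - 0.375) / (n + 0.25))
    slope, intercept, _, _, _ = stats.linregress(z, y)
    resid = y - (intercept + slope * z)
    return np.sqrt(np.sum(resid**2) / (n - 2))

def partition_by_sey(log_ratios, min_size=5):
    """Greedy partition: remove the point whose removal reduces SEY the most,
    stop when no single removal helps; the kept points form one part."""
    kept = sorted(float(v) for v in log_ratios)
    rejected = []
    while len(kept) > min_size:
        base = sey(kept)
        candidates = [(sey(kept[:i] + kept[i + 1:]), i) for i in range(len(kept))]
        best_sey, best_i = min(candidates)
        if best_sey >= base:
            break
        rejected.append(kept.pop(best_i))
    return kept, rejected   # rerun partition_by_sey(rejected) for the next part
```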
Mixtures and Partitioning
Recall that we postponed detailed discussion of
the way the SEY heuristic helps decide how many
points should be considered Parkinsonian. For
this purpose we only consider candidate points
with a ratio of actual to planned of unity. But as
we consider whether to call one more point
Parkinsonian (i.e., ‘remove’ it) we must also
adjust the slope of the remaining early points.
That is achieved by adjusting their Blom scores.
Mixtures and Partitioning
As we consider whether to call one more point
Parkinsonian (i.e., ‘remove’ it) we must also
adjust the slope of the remaining early points.
That is achieved by adjusting their Blom scores.
Essentially, suppose there are nP Parkinsonian
points and nE points reported early. Then we
estimate PP = nP / (nP + nE).
Mixtures and Partitioning
Essentially, suppose there are nP Parkinsonian
points and nE points reported early. Then we
estimate PP = nP / (nP + nE).
Because PP is the fraction of early activities that
cannot be used in the Q-Q chart (their true
duration is unknown), we modify Blom’s scores
for the remaining early activities. We use zk =
Φ−1((k − 0.375)/(n(1 − PP) + 0.25)).
Mixtures and Partitioning
Because PP is the fraction of early activities that
cannot be used in the Q-Q chart (their true
duration is unknown), we modify Blom’s scores
for the remaining early activities. We use zk =
Φ−1((k − 0.375)/(n(1 − PP) + 0.25)).
The adjustment reflects the effective reduction in
sample size that applies to early activities. But
we only use it for the nE strictly early activities.
Mixtures and Partitioning
Again, we use adjusted scores of
zk = Φ−1((k − 0.375)/(n(1 − PP) + 0.25))
for the nE strictly early activities. We use
unadjusted scores,
zk = Φ−1((k − 0.375)/(n + 0.25))
for all on-time and ‘rounded-to-one’ activities.
That leaves out the nP Parkinsonian observations,
which are simply omitted from the chart.
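Pulling the last three slides together, here is a hedged sketch of the SEY-minimization search over the number of Parkinsonian points, with adjusted scores for the strictly early activities and unadjusted scores for everything else. The rank bookkeeping is my own reading of the procedure and may differ in detail from the published implementation.

```python
import numpy as np
from scipy import stats

def parkinson_qq(log_ratios, n_parkinson):
    """Q-Q sample for a candidate number of Parkinsonian points. Zeros are
    activities reported exactly on time; n_parkinson of them are treated as
    Parkinsonian and omitted. Strictly early (negative) points get scores
    with the n(1 - PP) adjustment; every retained point uses its rank k in
    the retained, sorted sample (my own reading of the bookkeeping)."""
    y = np.sort(np.asarray(log_ratios, float))
    n = len(y)
    early = y[y < 0]
    n_e = len(early)
    p_p = n_parkinson / (n_parkinson + n_e) if (n_parkinson + n_e) else 0.0
    keep = np.concatenate([early, y[y >= 0][n_parkinson:]])  # drop nP zeros
    k = np.arange(1, len(keep) + 1, dtype=float)
    denom = np.where(k <= n_e, n * (1.0 - p_p) + 0.25, n + 0.25)
    z = stats.norm.ppf((k - 0.375) / denom)
    return z, keep

def choose_n_parkinson(log_ratios):
    """SEY-minimization heuristic over the number of Parkinsonian points."""
    y = np.asarray(log_ratios, float)
    best = None
    for n_p in range(int(np.sum(y == 0)) + 1):
        z, keep = parkinson_qq(y, n_p)
        slope, intercept, _, _, _ = stats.linregress(z, keep)
        se_y = np.sqrt(np.sum((keep - intercept - slope * z) ** 2) / (len(keep) - 2))
        if best is None or se_y < best[0]:
            best = (se_y, n_p, intercept, slope)   # (SEY, nP, m_hat, s_hat)
    return best
```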
Linear Association and Prediction
Whereas the mixture issue has been addressed only
recently and has not yet been properly published,
linear association—which is essentially an
intentionally simple way to model statistical
dependence—has been at the core of the earlier
developments (started at American University of
Armenia).
On Subjective Estimates
The Hill et al. dataset shows linear association,
and demonstrates the inferiority of subjective
estimates as well.
Linear Association and Prediction
Linear association—or some other validated
dependence model—is ESSENTIAL for
prediction. So in a sense it is the most important
part of Project Analytics.
It is essential because ignoring dependence leads
to a very high probability of missing deadlines.
Linear Association and Prediction
As these projects demonstrate, each project has
its own mean deviation from plan (inc. budget)
Linear Association and Prediction
Because it’s only 12 projects—a small number as
Q-Q charts go—normality cannot be rejected in
either of these. But more importantly, each
point here represents the sum total of a full
project, and if we had independence then such
sums would exhibit low variation. Here it
appears that the hypothesis that all projects have
the same mean is rejectable. That can be shown
more formally by ANOVA.
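For that ANOVA, a one-way test on the activity-level log-ratios grouped by project is enough; a sketch with simulated (not real) data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# simulated (not real) activity-level log-ratios for 12 projects,
# each project drawn around its own bias b_k
projects = [rng.normal(loc=rng.normal(0.0, 0.3), scale=0.4, size=30)
            for _ in range(12)]

F, p = stats.f_oneway(*projects)
print(F, p)   # a small p-value rejects "all projects share the same mean"
```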
Linear Association and Prediction
In other words, each project has a different bias.
For project k, denote that bias by bk. Let B denote
the random variable from which bk is sampled.
Linear Association and Prediction
Suppose that a set of random variables is defined
by Yj = XjB > 0 for a set of independent
nonnegative Xj and an independent positive
common factor B. Then we say that the members
of the set {Yj} are linearly associated.
Linear Association and Prediction
If we assume linear association, then to simulate
a new project we can generate a value for bk as
well as realizations xj for logarithmic ratios and
then obtain linearly associated realizations yj = xj
bk. If we repeat this in r rows (where r should be
a large number, e.g., 1000), then the variation of
B will be taken into account, as well as the
variation of Xj. Let’s look at the result with or
without linear association.
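A minimal sketch of that simulation under illustrative assumptions: a lognormal, mean-one bias factor B, independent lognormal activity ratios Xj, a serial project (so the project duration is a sum of activity times), and r = 1000 replications. Comparing the upper percentiles with and without the common factor shows why ignoring dependence understates the risk of missing a deadline.

```python
import numpy as np

rng = np.random.default_rng(4)
estimates = np.array([3.0, 5.0, 2.0, 8.0, 4.0])  # illustrative planned durations
m_x, s_x = 0.0, 0.3                              # activity ratio parameters (assumed)
s_b = 0.25                                       # spread of the project bias B (assumed)
r = 1000                                         # number of simulated projects

def simulate(with_association):
    totals = np.empty(r)
    for i in range(r):
        # mean-one bias factor so only the spread differs between the two runs
        b = rng.lognormal(-s_b**2 / 2, s_b) if with_association else 1.0
        x = rng.lognormal(m_x, s_x, size=len(estimates))  # independent ratios X_j
        y = x * b                                         # linearly associated Y_j = X_j * B
        totals[i] = np.sum(estimates * y)                 # serial project: sum of activity times
    return totals

for flag in (False, True):
    t = simulate(flag)
    print("linear association:", flag,
          "mean:", round(t.mean(), 2), "90th percentile:", round(np.percentile(t, 90), 2))
```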
Linear Association and Prediction
In this case we used nonparametric bootstrap
instead. It yields similar results, but is simpler.
Linear Association and Prediction
If, instead, we only calibrate the PERT estimates,
it’s better, but not good enough.


Editor's Notes

  • #2: We will focus on Project Analytics for scheduling, but the technical analysis holds for budgeting as well.
  • #15: But they used to be yellow
  • #17: For example, CC and the PERT beta are unsound. But CC excels in (3) and (4). We must keep those in PERT 21. Project Analytics is the way to satisfy (1), (3), and (4)
  • #18: (3) Be intuitively acceptable to decision makers
  • #26: Examples: Using the wrong distribution leads to imprecision (and that’s our focus today). Minimizing maximal regret involves a substitute objective, but more benign examples involve the use of easy to measure substitutes; e.g., minimizing turnover to maximize job satisfaction. Or take Safe Scheduling: ignoring safety time is a quintessential example of lack of fit.
  • #27: Examples: Minimizing maximal regret.
  • #29: Examples of approximations: Fenton-Wilkinson and Linear Association. In the wider PERT 21 context, add the use of stored samples as well as sequencing heuristics. Note: Solution quality includes precision, accuracy, AND tractability.
  • #30: Examples of approximations: Fenton-Wilkinson and Linear Association. In the wider PERT 21 context, add the use of stored samples as well as sequencing heuristics.
  • #31: My practical experience was as an irrigation engineer. Most of my consulting experience was about SMED, with a focus on BNs.
  • #34: As taught by Deming, we must start with a hypothesis or random data will drive us all over the place. Until a hypothesis is rejected, there is no compelling need to look for a different one. Regarding the lognormal hypothesis in particular, a decade later, it has not yet been rejected (although corrections in the model did take place). NOBODY claims that it is necessarily the ONLY valid distribution we could use, just that it IS valid and ticks many more theoretical boxes.
  • #35: Same note as #34.
  • #46: The lognormal sum approximation matches the mean and the variance of the sum. When the components are lognormal, it is also known as the Fenton-Wilkinson approximation. Conclusion: The LN is good for products AND for sums! (A small numerical sketch of this moment matching appears after these notes.)
  • #47: The figure depicts lognormal PDFs with mean = 1. We call them “basic.” Neither their modes nor their medians match 1, however. In red, we show the exponential, for comparison. The “blue” lognormal approximates the exponential to within 5% or so. The purple approximates the normal. The black goes with a higher standard deviation of 2, and it is visibly “sharper” than the exponential.
  • #48: Based on data from Premachandra and Gonzales (1996) from the Clyde Dam in NZ. Drilling time is in minutes. Both the beta and the lognormal would work here (but for high cv the PERT-beta would fail).
  • #54: Figure 18.1: Construction project in Armenia. Note, each point may represent several activities with the same planned duration and the same actual duration. In particular, all points on the X-axis have a REPORTED deviation of zero (that is, their actual duration, as REPORTED, matches their planned duration). Quite a few of them represent more than one activity. Negative values imply earliness whereas positive ones imply tardiness. NOTE: There are 61 activities with planned durations between 4 and 15, but only 23 are visible. In the next slide we focus on those 61 points.
  • #57: Figure 18.2: Construction project in Armenia. There are 61 activities with planned durations between 4 and 15. On the left, we can see all 61 of them (but only 23 on the right). Even on the left side, where values with the same ln(pj/ej) correspond to different Blom scores, the regression line has R2 = 0.976. Taking the square root, 0.988, we find that it passes the normality test with a probability of almost 0.25. That is, we cannot reject normality. At the top, four points correspond to the ratio 12/7 (for which ln(pj/ej) = 0.539), and at the bottom, two points correspond to the ratio 1/3 (for which ln(pj/ej) = −1.099). In between, most ln(pj/ej) values appear more than once, and in particular, the ratios 8/7, 5/6, 4/5, and 1/2 each appear at least five times. (A sketch of this correlation-based normality check appears after these notes.)
  • #60: Same note as #57.
  • #66: We should expect a horizontal segment representing the Parkinsonian points followed by a steep drop representing the early points that were reported correctly. Those are pushed to the left and thus matched with too low Blom scores. Rounding will not change that dramatically except that the horizontal segment will be compressed to a single point. We can also expect an inverted S shape.
  • #67: Figure 18.3: A development project from the Eurasia Foundation in Armenia. The project had 52 activities with planned durations between 4 and 48 weeks. Preliminary analysis suggests that we should analyze the 41 activities of up to 15 weeks separately. Of those, 15 are reported as precisely on time. The left-side chart fails the normality test with a probability of about 0.005. Rounding helps, as the right-side chart demonstrates, but the test remains marginal (a low pass). Note the inverted S shape still visible on the right.
  • #68: Figure 18.4: Now assume 9 Parkinsonian points (so 6 points rounded to 1). Although we don’t have an automated test yet, clearly both sides are better than the previous RHS. Note shifted scale (due to the Parkinson correction of the early points, which now represent an effectively smaller sample). EXPLAIN WHY WE CHOSE PRECISELY 9 PARKINSONIAN POINTS. Note that we did not remove the inverted S shape completely but it’s much better (and it’s just a visual cue, not the way we decide how many Parkinsonian points to assume).
  • #84: Same note as #68.
  • #85: Same note as #68.
  • #102: The remaining problem is that variance is underestimated, and therefore we tend to have too many points near the top or near the bottom. Those reflect projects that are either much shorter or much longer than average.
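
The moment-matching lognormal sum approximation mentioned in note #46 (Fenton-Wilkinson when the components are lognormal) can be sketched as follows. The sketch assumes independent components for simplicity; under linear association, covariance terms would be added to the variance of the sum. The function name and example numbers are illustrative.

```python
# Minimal sketch of the Fenton-Wilkinson (moment-matching) lognormal sum
# approximation from note #46, assuming independent components for simplicity.
import numpy as np

def fenton_wilkinson(mus, sigmas):
    """Approximate the sum of independent lognormals LN(mu_i, sigma_i^2)
    by a single lognormal that matches the mean and variance of the sum.
    Returns (mu, sigma) of the approximating lognormal."""
    mus = np.asarray(mus, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    means = np.exp(mus + sigmas**2 / 2)              # E[X_i]
    variances = (np.exp(sigmas**2) - 1) * means**2   # Var[X_i]
    m, v = means.sum(), variances.sum()              # moments of the sum
    sigma2 = np.log(1 + v / m**2)                    # matched log-variance
    mu = np.log(m) - sigma2 / 2                      # matched log-mean
    return mu, np.sqrt(sigma2)

# Example with three illustrative activities:
print(fenton_wilkinson([1.0, 1.2, 0.8], [0.4, 0.5, 0.3]))
```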
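
The correlation figures quoted in notes #57/#60 (R2 = 0.976, r = 0.988) come from regressing the ordered logarithmic ratios against Blom scores. A minimal sketch of that computation follows; the pass probability cited in the note relies on published critical-value tables for probability-plot correlation tests, which the sketch does not reproduce.

```python
# Minimal sketch: correlation-based normality check on log-ratios, in the
# spirit of notes #57/#60. It returns r and R^2 only; the quoted pass
# probability requires critical-value tables not reproduced here.
import numpy as np
from scipy.stats import norm

def qq_correlation(log_ratios):
    """Correlate the ordered ln(p_j / e_j) values with Blom scores
    z_k = Phi^{-1}((k - 0.375) / (n + 0.25)) and return (r, R^2)."""
    x = np.sort(np.asarray(log_ratios, dtype=float))
    n = len(x)
    k = np.arange(1, n + 1)
    z = norm.ppf((k - 0.375) / (n + 0.25))   # Blom scores
    r = float(np.corrcoef(z, x)[0, 1])
    return r, r**2

# scipy.stats.probplot(log_ratios, dist="norm") gives a similar r, using a
# slightly different plotting position.
```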