Temporal Case Management 1998

Case Management: Modeling Time Varying Relationships

Background

The Case Management System needs to be able to capture, store, and report upon the
relationships among a number of business entities of interest to the Company in its efforts
to track and respond to investigation related activities. These entities include questionable
Transactions, Accounts, Cases, Various Persons and Organizations, and Investigation
Management Issues.

It is not enough to know what the relationships among these entities are at one moment in
time, say the current moment. Investigators need to be able to look for patterns of
relationships through time. Criminals are not so accommodating as to make their
intentions clear. In many significant ways, a Case is the recording of a pattern of
changing relationships through a period of time.

Traditional data modeling represents the facts about a company as they exist at a point
in time. The databases designed from such models have the same characteristic.
Representing time varying relationships among facts requires the use of what are called
temporal or historical database techniques. This approach is very powerful in terms of
what can be recorded and reported upon; it is, however, considerably more complex than
normal business system database designs.

Purpose

The purpose of this paper is to document the specific design approach taken to modeling
time in a Case Management Application. This documentation will be of use to the user in
understanding “what is going on” in the application, as well as giving them a sense of
what kinds of time varying information can and cannot be stored in the Application. It
will also be useful to future technical personnel maintaining the application.

Technical Principles

The data design principles used in the Case Management System’s database design are
based upon Peter Chen’s Entity-Relationship Model [1], Codd’s Extended Relational Model [2],
and Snodgrass’s three levels of representing time in a relational database [3].

Considering Chen’s work, data about the enterprise can be described as Entity-sets,
Relationship-sets, and Value-sets. Codd mapped these set theory ideas onto the
mathematics of relations; this relational interpretation of the Chen Entity-Relationship
model forms the basis of our database design in the Case Management System.

The Entity-sets and Relationship-sets are represented as tables. Each member of an
Entity-set or Relationship-set is a row in its table. The Value-set values that are
functionally mapped from the Entity-sets and Relationship-sets in the Entity-Relationship
Model become values in a table column on that row. So far, this is standard data modeling
/ database design.

By David Tryon, November 1998

In the Case Management System, “Main Function” tables and “Lookup” tables
implement the Entity-sets; the “Assignment” tables implement the Relationship-sets.

Several of the Relationship-sets do not have explicit “Assignment” tables; instead, their
relationships are represented by an embedded “foreign key” in the core tables. Specifically,
these are the relationships of the Action and Transaction entities to Account and Case. The
reason for this is that Action and Transaction represent events - point in time occurrences -
whereas Case, Account, Known Party, and Case Management Issues represent business objects or
“things” that have duration - they exist for a span of time. For a more complete explanation,
read endnote [4].

Temporal / Historical Database Design

The Entity-Relationship Model represents a “snapshot” in time: the current view of the
known facts, the attributes of the Entity-sets and Relationship-sets. This “current
snapshot in time” perspective is also true of the databases developed from such models. Even
“historical” facts captured in these databases represent the “current” time’s point of view
about what was true at the point in the historical past being described. They
represent “what we know now about what was true then.”

A consequence of this “current” perspective is that historical facts are represented as
separate attributes from current facts and are implemented as two separate fields. If we
need to save all past values, the set of past values becomes its own table with each past
value represented as a row in the “historical” table. This approach is so standard that it is
not even thought of as odd. It would mean that if we wanted complete history on every
attribute, every attribute would become its own table.

For several decades, database theorists have been trying to deal with the whole issue of
time in a more elegant and less developer intensive way. There are three possible types
of time varying databases: Transaction, Historical, and Temporal.

Transaction Databases. In Transaction Time, the times when changes are made to values
stored in the database are recorded; regular database technology does this with logging
and recovery. These databases can “rollback” updates to an earlier point in time through a
“database recovery.” This form of time representation is seldom used in an Application.

Historical Databases. Historical databases explicitly record when each fact in the
database changes out in the “real world.” This means that when a change is recorded to a
value stored in the database, the time when the change actually occurred in the business
must also be recorded. This form of Historical Database is just beginning to become
available commercially in some high-end databases. A Historical Database has an
interesting effect on an application written against it. The idea of “current” goes away;
“current” becomes a matter of selection criteria. Retrieval with today’s date is “current.”
Retrieval with last month’s date is “historical.” Retrieval with next month’s date is
“pending.”

Temporal Databases. Temporal databases explicitly record not only when each fact
changes in the “real world,” but also when each fact was recorded in the
database. This means that not only can we retrieve the true “change” history; we can also
retrieve a complete history of all “corrections” and “cancellations” ever made. This
database form combines both the Transaction and the Historical time capabilities. It
requires two time points for retrieval: the “As Of” time of the data and the “As Viewed
From” time for the report. Full temporal databases are an active research area, but are not
yet available as commercial products.
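The two retrieval time points can be illustrated with a small sketch. It is a deliberately simplified model, not part of the Application: the Fact class, its field names, and the sample addresses are all hypothetical, and cancellations are not modeled. Each fact carries the date it became true and the date it was recorded; a lookup first discards anything recorded after the “As Viewed From” date, then picks the latest fact effective on or before the “As Of” date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Fact:
    """Hypothetical record: one value with both kinds of time attached."""
    value: str
    valid_from: date    # when the fact became true in the real world
    recorded_on: date   # when the fact was entered in the database


def lookup(facts: Sequence[Fact], as_of: date, as_viewed_from: date) -> Optional[str]:
    """Return the value effective at as_of, using only what was recorded by as_viewed_from."""
    known = [
        f for f in facts
        if f.recorded_on <= as_viewed_from and f.valid_from <= as_of
    ]
    if not known:
        return None
    # The latest effective date wins; for equal effective dates the
    # latest recording (a correction) wins.
    return max(known, key=lambda f: (f.valid_from, f.recorded_on)).value


# Illustrative data only: an address, a move, and a later correction of the move.
history = [
    Fact("12 Elm St", date(1997, 1, 1), date(1997, 1, 3)),
    Fact("9 Oak Ave", date(1997, 3, 1), date(1997, 3, 10)),
    Fact("9 Oak Ave, Apt 2", date(1997, 3, 1), date(1997, 4, 2)),
]

before_correction = lookup(history, date(1997, 3, 15), date(1997, 3, 20))
after_correction = lookup(history, date(1997, 3, 15), date(1997, 5, 1))
before_move_recorded = lookup(history, date(1997, 3, 5), date(1997, 3, 5))
```

The same “As Of” date gives different answers depending on the “As Viewed From” date, which is the capability a purely Historical Database lacks.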

The Case Management System Application is implementing a limited version of a
Historical Database. We are capturing time history on the tables representing
Relationship-sets in the data model. These are the Assignment tables in the Application.
To accomplish this, we include a Begin Date and an End Date field in each Assignment
table row. The idea is that when a relationship is established between two entities (what we
generally refer to as “an assignment”), the Begin Date is set to the effective date of the
“assignment.” Later, when the relationship is ended, we set the End Date to the
termination date. We do not delete the row from the Assignment table.

Retrieval of data for reports must always be sure to limit the selected Assignment rows to
those that were effective within the desired time-period.
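The life cycle of an Assignment row can be sketched as follows. This is only an illustration of the Begin Date / End Date convention described above; the class name, the field names, and the key values are assumptions, not the Application’s actual table design.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Assignment:
    """Hypothetical Assignment row relating a Case to a Known Party."""
    case_id: int
    party_id: int
    begin_date: datetime
    end_date: Optional[datetime] = None  # None stands in for a Null End Date


def end_assignment(row: Assignment, termination: datetime) -> None:
    """End the relationship by stamping the End Date; the row itself is kept."""
    if termination < row.begin_date:
        raise ValueError("termination date precedes the Begin Date")
    row.end_date = termination


# Establish an assignment, then end it later. Nothing is deleted.
assignments: List[Assignment] = [
    Assignment(case_id=1, party_id=7, begin_date=datetime(1997, 2, 5)),
]
end_assignment(assignments[0], datetime(1997, 2, 21))
```

Because the row survives with both dates filled in, a later report can still see that the relationship existed between those dates.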

Some thoughts about “Time”: Temporal Granularity, Points-in-time, and Time-
Period (Duration)

About here, readers must be asking themselves whether all this is necessary; after all,
we’re just talking about a little desktop application, right? The truth is that we all use
these ideas on a day to day basis. We insist on clarity when discussing time-periods -
when filling in a time sheet for an hourly contractor, is the paid period through or until
the end time point? An hour’s pay rides on the distinction. What is the unit of time to be
used - the temporal granularity? Is the work shift from 8 A.M. until 4 P.M. (through?)? Is
it from 8:00 A.M. until 4:00 P.M. (3:59 P.M.)? Do we pay for 8 hours or 8 hours and one
minute?

People are great at contextual reasoning! In case of doubt, a person would just say, “You
know, 8 hours pay” and the fuzziness would be resolved. Computers absolutely fail with
fuzzy situations, as do payroll clerks. So, we must be clear, or the reports retrieved from
different points of view may not “fit together” well. Anyone who has ever tried to
balance the dollars between a weekly payroll system and a monthly general ledger system
understands the consequences of incommensurate temporal granularities.

In the Case Management System Application, time is recorded to the limits of the
Microsoft Date data type - some millionths of a second. Time reporting requests will be
(unless otherwise specified - such as when an Action occurred) in days.



Some reports are logically point-in-time oriented. An example would be a report to see
all “Currently Open Cases.” Others are inherently for a time-period, such as a “Cases
Opened During the Month.” In order to simplify and make the query process more
uniform, all queries will be stated as a duration with a Begin Date and an End Date. This
approach will work because we will assume all queries to be inclusive of the Begin Date
and the End Date. Therefore, a single point-in-time, a day, is represented when the Begin
Date and the End Date are equal.

An important comment about unspecified End Dates. When an assignment begins, the
End Date is usually not yet known. Therefore, rather than “kludging” the End Date field
with some arbitrarily high date value (for example, January 1, 2500) so that date
comparison logic will “work,” it has been decided to be correct and leave currently
“open” assignments as “Null.” There is no value assigned to the End Date field. This
means that retrieval logic must assume that a “Null” End Date is greater than any
specified Reporting End Date for comparison purposes.
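A minimal sketch of that comparison rule, assuming the Null End Date is represented by None (the function name is hypothetical): the missing value is tested for explicitly, so no artificial high date is ever stored.

```python
from datetime import datetime
from typing import Optional


def end_is_on_or_after(data_end: Optional[datetime], report_date: datetime) -> bool:
    """Compare an End Date to a report date, treating a missing End Date as open-ended."""
    # An open assignment has no End Date yet, so it counts as later than any report date.
    return data_end is None or data_end >= report_date


open_assignment = end_is_on_or_after(None, datetime(1997, 2, 28, 23, 59, 59))
closed_earlier = end_is_on_or_after(datetime(1997, 1, 20), datetime(1997, 2, 1))
```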

Some practical consequences of the difference in the stored temporal granularity
(microsecond) and the retrieval temporal granularity (day).

The Report Begin Date must be converted to midnight at the start of the first day of the
report period (e.g., 2/23/97 becomes 2/23/97 @ 00:00:00). The Report End Date must be
converted to one second before midnight at the end of the last day (e.g., 2/27/97 becomes
2/27/97 @ 23:59:59). This convention also supports the “point-in-time” convention of making the
Begin and End Dates the same to reference a single day. If the report were trying to
capture all the Investigative Actions taken on a given day, the correct (and complete) set
of Actions would be retrieved. “Now” type reports will use the Report End Date as the
“Now” so that reports “as of a day” will be inclusive of anything that happens during
that day.
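The conversion can be sketched in a few lines. The function name is an assumption for illustration; the boundary times are the ones given in the text.

```python
from datetime import date, datetime, time
from typing import Tuple


def report_bounds(begin_day: date, end_day: date) -> Tuple[datetime, datetime]:
    """Widen day-granularity report dates to the stored time granularity."""
    if end_day < begin_day:
        raise ValueError("Report End Date precedes Report Begin Date")
    report_begin = datetime.combine(begin_day, time(0, 0, 0))     # midnight at start of day
    report_end = datetime.combine(end_day, time(23, 59, 59))      # one second before midnight
    return report_begin, report_end


# The example from the text: 2/23/97 through 2/27/97.
begin, end = report_bounds(date(1997, 2, 23), date(1997, 2, 27))

# A single-day ("point-in-time") report uses the same day for both dates.
day_begin, day_end = report_bounds(date(1997, 2, 23), date(1997, 2, 23))
action_time = datetime(1997, 2, 23, 14, 30)
action_in_day = day_begin <= action_time <= day_end
```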

What Assignments Are “Valid” During a Time-Period?

If there are a number of assignment records being tested to see if they are to be selected
as part of the reporting time-period - a duration - what constitutes “being valid” during
the time period? For example, let us say we are selecting for the month of February.
Does the assignment have to be valid for the entire month to be accepted? What if the
assignment becomes valid half way through the month? Is it accepted? What if it is valid
at the beginning of the month and expires half way through? Is it accepted? What if it
becomes valid on the fifth and expires on the 21st?

In the Case Management System Application the standard assumption will be that any
assignment that was valid for any portion of a time period will be considered valid
and selected. If we only want assignments valid on the last day of the month, select for
only the last day, not the entire month. If we only want assignments valid on the first day
and still valid on the last day of the month, the selection must be explicitly defined in that
way through a custom query.

Standard Begin / End Date Query Logic

The issue will always come down to the precise structure of the Assignment Begin Date
and Assignment End Date comparisons with the Report Begin Date and Report End Date.

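Given our definition of “included in the time period” above, there are several cases to be
considered. These alternatives are described in the table below. Each Data Validity Period
(Begin B, End E) is placed relative to the Report Period, which runs from the Reporting
Begin Date to the Reporting End Date.

Case               Data Begin (B)            Data End (E)
1. Included        B1 during the period      E1 during the period
2. Includes        B2 before the period      E2 after the period, or NULL
3. Ends During     B3 before the period      E3 during the period
4. Begins During   B4 during the period      E4 after the period, or NULL
5. Ends Before     B5 before the period      E5 before the period
6. Begins After    B6 after the period       E6 after the period, or NULL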

As the reader can see, the inclusion rules get quite complex and compounded. However,
the exclusion rules are much simpler. They reduce to just two cases:

#6 (Begins After) - Case A: If the Data Begin Date is greater than the Report End Date.
#5 (Ends Before) - Case B: If the Data End Date is less than the Report Begin Date.

If either of these conditions is true, the data is NOT selected. In all other cases, select the
data.

Logically, this rule nicely takes care of active, open Assignment records having a
Null Data End Date. Treating Null as a high-value date forces a false in Case B, which is the
correct result. Unfortunately, there is a data type issue: a comparison against Null does
not behave like a comparison against a high-value date. It is necessary to test separately
for the Data End Date being Null. This is annoying, but not an insurmountable obstacle.
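The two exclusion cases, together with the separate Null test, can be sketched as a single predicate. The function name and the sample dates are illustrative assumptions; None stands in for a Null Data End Date, and the report period is February 1997 widened to day boundaries.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple


def is_selected(data_begin: datetime, data_end: Optional[datetime],
                report_begin: datetime, report_end: datetime) -> bool:
    """Select a record unless one of the two exclusion cases applies."""
    case_a = data_begin > report_end                             # begins after the report period
    case_b = data_end is not None and data_end < report_begin    # ends before the report period
    return not (case_a or case_b)


FEB_BEGIN = datetime(1997, 2, 1, 0, 0, 0)
FEB_END = datetime(1997, 2, 28, 23, 59, 59)

# One example per case in the table; None marks a still-open assignment.
examples: Dict[str, Tuple[datetime, Optional[datetime]]] = {
    "1. Included": (datetime(1997, 2, 5), datetime(1997, 2, 21)),
    "2. Includes": (datetime(1997, 1, 10), None),
    "3. Ends During": (datetime(1997, 1, 10), datetime(1997, 2, 15)),
    "4. Begins During": (datetime(1997, 2, 15), None),
    "5. Ends Before": (datetime(1997, 1, 1), datetime(1997, 1, 20)),
    "6. Begins After": (datetime(1997, 3, 5), None),
}

results = {
    name: is_selected(begin, end, FEB_BEGIN, FEB_END)
    for name, (begin, end) in examples.items()
}
```

Only the “Ends Before” and “Begins After” examples are rejected; the four overlapping cases are all selected without being tested individually.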



Implementing the Date Comparison Logic in SQL

In order to implement this in the Case Management System Application’s database, the
propositional logic above must be translated into the appropriate relational algebra
expressions, and from there into an SQL “WHERE” clause. As we are implementing this
application in Access, we will target this conversion in terms of the syntax of the Access
Query Design Grid.

The Access Query Design Grid expresses the components of the generated SQL
“WHERE” clauses as successive rows in the "Criteria" section. Each row in the criteria
section represents a set of conditions that are logically “AND’ed” together. For a data
record to be selected it must pass all of the criteria conditions on a row. Multiple criteria
rows are inclusively “OR’ed” together. This means that if a data record passes any row, it
is selected. It does not need to pass more than one row to be selected.

This syntax means that we must manipulate our acceptance criteria into the form of one
or more sets of “AND’ed” conditions. That is the goal.

So, where do we start? We know from the previous section that we have two logical
tests, Case A and Case B. Furthermore, we know that if a data record passes either test, it
is excluded. What we need is a transformation to restructure the test in the form of
“(condition-1) AND (condition-2) AND (condition-3) AND ...” to fit in with the syntax
of the Access Query Design Grid.

Let us start by translating our exclusion criteria into inclusion criteria. Let us simplify
our notation to be that Case A is known simply as A and Case B is known simply as B.
The exclusion rule is:

If A OR B = TRUE then exclude the record.

The opposite of this rule, the inclusion rule, is obtained by logically negating this
expression. In other words the inclusion rule is:

If NOT(A OR B) = TRUE then include the record.

We now have an inclusion rule, but it is not in the “right” form for the Access Query
Design Grid. Negated compounds do not fit naturally in the grid, where each criteria cell
holds a condition on a single column. Fortunately there is a logical equivalence theorem,
De Morgan’s law - known to programming types as the “propagation of ‘nots’” - that states
the following equivalence:

NOT (A OR B) = (NOT A) AND (NOT B)

We replace this equivalent expression in our “inclusion” rule and it now reads:

If (NOT A) AND (NOT B) = TRUE then include the record.



Note that this is now in the “right” syntax to be expressed in the Access Query Design
Grid. Let us recall where we are. Our compound rule fully expanded is:

If (NOT Case A) AND (NOT Case B) = TRUE then include the record.
Case A = Data Begin Date is greater than the Report End Date
Case B = Data End Date is less than the Report Begin Date

Inequality logic comparisons are a little tricky. We must remember that:

NOT (X greater than Y) = (X less than or equal Y)
NOT (X less than Y) = (X greater than or equal Y)

Therefore, applying this rule to Case A and Case B gives:

NOT Case A = Data Begin Date is less than or equal the Report End Date
NOT Case B = Data End Date is greater than or equal the Report Begin Date

Now, substituting these definitions of NOT Case A and NOT Case B into the expanded
“inclusion” rule above we get:

If (Data Begin Date is less than or equal the Report End Date)
AND
(Data End Date is greater than or equal the Report Begin Date)
then include the record.
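The transformation can be checked mechanically. The sketch below is only a sanity check under simplifying assumptions: dates are represented as small integer day numbers, Null End Dates are left out, and the function names are hypothetical. It verifies the “propagation of nots” for every truth assignment, then confirms that the exclusion form and the derived inclusion form select exactly the same records.

```python
from itertools import product

# NOT (A OR B) == (NOT A) AND (NOT B), checked for all four truth assignments.
de_morgan_holds = all(
    (not (a or b)) == ((not a) and (not b))
    for a, b in product([False, True], repeat=2)
)


def selected_by_exclusion(data_begin: int, data_end: int,
                          report_begin: int, report_end: int) -> bool:
    """Original rule: exclude when Case A or Case B is true."""
    case_a = data_begin > report_end
    case_b = data_end < report_begin
    return not (case_a or case_b)


def selected_by_inclusion(data_begin: int, data_end: int,
                          report_begin: int, report_end: int) -> bool:
    """Derived rule: two conditions AND'ed together."""
    return data_begin <= report_end and data_end >= report_begin


# Compare the two rules over every well-formed combination of day numbers 1..6.
days = range(1, 7)
mismatches = [
    (b, e, rb, re_)
    for b, e, rb, re_ in product(days, repeat=4)
    if b <= e and rb <= re_
    and selected_by_exclusion(b, e, rb, re_) != selected_by_inclusion(b, e, rb, re_)
]
```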

Now, the Access Query Design Grid has columns of each data element in the record. In
particular, it has columns for Data Begin Date and Data End Date. Therefore, the entry
in the Access Query Design Grid looks like the following:

            Data Begin Date           Data End Date

criteria    <= [Report End Date]      >= [Report Begin Date]

Because of the somewhat inelegant way that Access handles Nulls in data type specific
comparisons, we must explicitly include a second criteria row to account for the possibility
of the Data End Date being Null. The Access Query Design Grid then looks like this:

            Data Begin Date           Data End Date

criteria    <= [Report End Date]      >= [Report Begin Date]
criteria    <= [Report End Date]      Is Null

The design technique used in the FDID Application is to replace the “Report Begin Date”
and “Report End Date” statements in the Access Query Design Grid with references to
the values of Controls on the Report Date Range Input form held open, but not visible,
during report creation. As long as all underlying queries for reports include this logic,
only the correct assignment records will be selected by the queries and submitted to the
report subsystem for report creation.
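The two criteria rows correspond to a “WHERE” clause of two AND’ed groups joined by OR. The sketch below runs that clause against an in-memory SQLite database rather than Access, so it shows only the logic, not the Application’s actual query. The table name, the column names, and the sample rows are assumptions. Dates are stored as ISO-format text, which compares correctly as strings, and the named parameters play the role of the form Controls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assignment ("
    " id INTEGER PRIMARY KEY,"
    " begin_date TEXT NOT NULL,"
    " end_date TEXT)"            # NULL while the assignment is still open
)
conn.executemany(
    "INSERT INTO assignment (id, begin_date, end_date) VALUES (?, ?, ?)",
    [
        (1, "1997-02-05 00:00:00", "1997-02-21 00:00:00"),  # included
        (2, "1997-01-10 00:00:00", None),                   # includes
        (3, "1997-01-10 00:00:00", "1997-02-15 12:00:00"),  # ends during
        (4, "1997-02-15 00:00:00", None),                   # begins during
        (5, "1997-01-01 00:00:00", "1997-01-20 00:00:00"),  # ends before
        (6, "1997-03-05 00:00:00", None),                   # begins after
    ],
)

# Each parenthesized group is one criteria row; the rows are OR'ed together.
QUERY = """
SELECT id
FROM assignment
WHERE (begin_date <= :report_end AND end_date >= :report_begin)
   OR (begin_date <= :report_end AND end_date IS NULL)
ORDER BY id
"""

params = {
    "report_begin": "1997-02-01 00:00:00",
    "report_end": "1997-02-28 23:59:59",
}
selected = [row[0] for row in conn.execute(QUERY, params)]
conn.close()
```

Without the second group, rows 2 and 4 would be dropped, because a comparison against a NULL End Date never evaluates to true.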



Endnotes
[1] Chen, P.P.; "The Entity-Relationship Model: A Basis for the Enterprise View of Data"; Proceedings of
the 1977 National Computer Conference; Dallas, Texas; AFIPS Conference Proceedings, Volume 46; and
Chen, P.P.; "The Entity-Relationship Model: Toward a Unified View of Data"; ACM Transactions on
Database Systems, Volume 1, Number 1; March 1976.

[2] Codd, E.F.; "Extending the Database Relational Model to Capture More Meaning"; SIGMOD
Proceedings; Boston, Mass.; June, 1979.

[3] http://www.cs.arizona.edu/people/rts/ and
Snodgrass, Richard T.; Developing Time-Oriented Database Applications in SQL; Morgan Kaufmann,
San Francisco, California; 2000.

[4] “Thing” Entities versus “Event” Entities. Entity-sets can represent either “things” or “events.” Events
differ from Things in that they exist only at a point in time; Things have duration. An Event is what it is
because it represents the recording of an action against some Thing that changes the state of the Thing. For
example, a Transaction is an action against the current balance of an Account. There are several important
points: 1) the Transaction Event occurs at a single point of time from the point of view of the acted upon
Thing, the Account. 2) Without the Thing, Account, the Transaction Event has no meaning - more
formally, the Transaction has both existence and identity dependence upon the Account. The first point
means that an Event Entity has no duration, no “history,” as it exists at only a single point of time. The
second point, the Event Entity having both existence and identity dependence upon the Thing Entity, means
that it can be related to one and only one of the “Things” that gives it identity. This combination makes a
“time varying relationship” between the Event Entity and the Thing Entity impossible; if the relationship
were to vary over time, the identity of the Event Entity would also change. For example, we cannot
meaningfully talk about “changing” the Account a Transaction is related to. We can only “correct” a
mistaken assignment of a Transaction to the wrong Account. This is why, when we “transfer” funds from
one Account to another, there are two Transactions - a debit to one Account and a credit to the other
Account. This inability to have a time varying relationship between an Event and its Thing allows us to
forego having a separate Assignment table representing the relationship between a Transaction and its
Account; we “bury” the Account’s key in the Transaction as a required foreign key attribute. This also,
incidentally, further demonstrates the Transaction’s identity dependency upon the Account. The same
reasoning applies to the other Event Entity-set in the Application, Action.

