The effect of number of concepts on readability of schemas 2

The effect of the number
of concepts on the
readability
of schemas


1.
Introduction
An Empirical Study with Data Models

A conceptual model is a representation of a system, composed of concepts that
help people know, understand, or simulate the subject the model represents.
Conceptual model schemas are used as:
(a) A method of either informally or formally documenting end-user
requirements.
(b) A method of optimally designing the subsequent information system (IS).
A commonly used example of both (a) and (b) is the use of the Entity-Relationship
Model (ERM) to capture end-user requirements for constructing
a relational database application.

Dependent variables in earlier empirical work
A survey of the literature on the evaluation of modeling methods reveals several desirable
attributes for conceptual modeling methods, which have been used as dependent
variables in past empirical studies. These include:
a) The adequacy or completeness of the modeling method in being able to represent
the underlying reality.
b) The readability of the modeling method’s schemas.
c) The ease of using the modeling method to represent requirements.

Independent variables in earlier empirical work
a) The first independent variable is the level of experience and familiarity of the subjects
with the conceptual model used.
b) The second is the subjects' familiarity with the domain. Past studies have attempted
to control for it by utilizing domains that are reasonably familiar to all subjects, and
further by randomly allocating subjects across treatment levels.
c) A third variable is the underlying complexity of the requirements for a particular
situation, where a more complex set of requirements is harder to reconstruct than a
simpler set.

Illustrative past work on the readability of conceptual models
Study | Independent Variables | Measures | Results
--- | --- | --- | ---
Brosey and Schneiderman [9] | (a) Hierarchical vs. relational models and (b) User experience | Questions on domain | Hierarchical schemas were easier to read by novice users
Juhn and Naumann [17] | Semantic vs. non-semantic models | Questions on domain | Semantic models subjects identified relationships and cardinalities better
Palvia et al. [27] | O–O vs. non O–O | Questions on domain | O–O subjects performed better
Shoval and Frummerman [35] | EER vs. OO | True/false questions on domain | EER subjects interpreted ternary relationships more correctly
Hardgrave and Dalal [16] | EER vs. OMT | Ability to understand and time to understand | OO subjects were significantly faster at answering questions than EER subjects
Peleg and Dori [29] | OPM/T vs. OMT/T | True/false questions on domain | OPM/T subjects better at comprehension

2.
Operationalization of
variables and research
model

Operationalization of Variables
The independent variable in this work is the
number of concepts (NOC) in a model, which we
define as a count of the distinct syntactic
constructs in the model. Increasing the NOC
makes it harder for an analyst to create
model schemas, but enables the model to
capture more elements of the underlying domain.
NOC: Number of Concepts
REF: Readability Effectiveness
REN: Readability Efficiency
RLN(REF): Slope of Effectiveness
RLN(REN): Slope of Efficiency

Dimensions of Readability
Readability Effectiveness (REF)
The percentage of correct
answers given when questions
are asked about the domain.
$$\mathrm{REF} = \frac{\text{No. of Correct Answers}}{\text{Total No. of Questions}} \times 100$$
Readability Efficiency (REN)
The inverse of the time it takes
to answer questions regarding
schemas.
$$\mathrm{REN} = \frac{1}{t}$$
Readability Learnability (RLN)
The improvement in the
dimensions of effectiveness
and efficiency of readability
over successive tasks.
$$\mathrm{RLN(REF)} = \frac{d(\mathrm{REF})}{dx}, \qquad \mathrm{RLN(REN)} = \frac{d(\mathrm{REN})}{dx}$$
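
The three readability dimensions above can be sketched in code. This is only an illustration under stated assumptions: the function names, the per-task data, and the use of a least-squares slope over the task sequence (x = 1, 2, 3, ...) to stand in for the derivative are all hypothetical and are not taken from the study.

```python
from statistics import mean


def ref(correct_answers, total_questions):
    """Readability effectiveness: percentage of correct answers."""
    return 100.0 * correct_answers / total_questions


def ren(time_taken):
    """Readability efficiency: inverse of the time taken to answer."""
    return 1.0 / time_taken


def slope(values):
    """Least-squares slope of values against task order 1..n.

    Assumption: the slope across successive tasks is used here as a
    discrete stand-in for d(REF)/dx and d(REN)/dx.
    """
    xs = list(range(1, len(values) + 1))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical subject: (correct answers, total questions, seconds) per task.
tasks = [(6, 10, 300.0), (7, 10, 250.0), (8, 10, 200.0), (9, 10, 150.0)]

ref_scores = [ref(correct, total) for correct, total, _ in tasks]
ren_scores = [ren(seconds) for _, _, seconds in tasks]

rln_ref = slope(ref_scores)  # learnability of effectiveness
rln_ren = slope(ren_scores)  # learnability of efficiency
```

A positive `rln_ref` or `rln_ren` would mean that the subject became more accurate or faster over successive tasks.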

Hypotheses
1) Hypothesis 1 (H1):
A higher NOC will lead to a higher REF.
2) Hypothesis 2 (H2):
A higher NOC will lead to a lower REN.
3) Hypothesis 3 (H3):
H3(a):
A higher NOC will lead to a lower RLN(REF).
H3(b):
A higher NOC will lead to a lower RLN(REN).

Subject Selection
The subjects for this study were master's-level MIS students at a university in the
northeastern USA. All the subjects were in the age range 22–30 and had one year of
experience (two courses) in using conceptual data models, with no previous use of
conceptual data models in the workplace. As such, the subjects in this study represent
beginner professional-level systems analysts.

Independent & dependent variables
in the experiment
In order to maintain internal validity, two versions of the ERM were selected. The sets of
concepts in each version are shown below.
✘ ERM base-level-NOC = {entity sets, relationship sets, attributes of entity sets, attributes of
relationship sets, primary keys of entity sets}
✘ ERM higher-level-NOC = {entity sets, relationship sets, attributes of entity sets, attributes of
relationship sets, primary keys of entity sets, cardinalities of relationship sets,
inheritance, optionality of relationship sets, weak entity sets}
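
Since the NOC is defined as a count of distinct syntactic constructs, the two treatment levels can be written as sets and counted directly. The sketch below assumes that both versions share the same first five concepts; the names `ERM_BASE`, `ERM_HIGHER`, and `noc` are hypothetical and chosen only for illustration.

```python
# Concepts in the base-level version of the ERM, as listed above.
ERM_BASE = frozenset({
    "entity sets",
    "relationship sets",
    "attributes of entity sets",
    "attributes of relationship sets",
    "primary keys of entity sets",
})

# The higher-level version adds four concepts to the base set
# (assumption: the first five concepts are identical in both versions).
ERM_HIGHER = ERM_BASE | {
    "cardinalities of relationship sets",
    "inheritance",
    "optionality of relationship sets",
    "weak entity sets",
}


def noc(model):
    """Number of concepts: count of distinct syntactic constructs."""
    return len(model)


added_concepts = sorted(ERM_HIGHER - ERM_BASE)
print(noc(ERM_BASE), noc(ERM_HIGHER), added_concepts)
```

Under that assumption, the base-level version has an NOC of 5 and the higher-level version an NOC of 9.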

Baseline NOC Model Library Schema

Higher Level NOC Model Library Schema

Experimental Process
✘ Subjects first received refresher instruction on the concepts behind the ERM version
they would be using. For each model, the instruction was stopped after all the subjects
indicated that they were comfortable with the model.
✘ After the instruction, the subjects were told they would be given four schemas, one
after the other. As an incentive, the subjects were offered $20 to participate in the
study, an additional $8 if more than 90% of their answers were correct, and $2 if they
finished the study in less than 60% of the maximum time that was allocated.
✘ The entire protocol was pilot tested and found to work adequately.

Data Analysis
✘ Table 2 shows the raw REF and REN scores for each
experimental group, across the four domains.

Data Analysis
✘ A 2-tailed t-test analysis of the difference between
the mean REF and REN scores of the two groups
across the four domains is shown in Table 3.
✘ Table 4 shows the raw slope scores for the REF and
REN for the two models.
✘ A 2-tailed t-test analysis of the difference between
the mean REF and REN slopes of the two groups is
shown in Table 5.
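
As a rough illustration of the group comparison described above, the following sketch computes a two-sample t statistic using only the standard library. It assumes a pooled-variance (Student's) test, which the slides do not specify, and the scores are made-up placeholders rather than values from Tables 2 to 5. The function name `pooled_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumption: equal variances in
    both groups; the study may have used a different variant.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical REF scores (percent correct) for the two groups.
higher_noc_ref = [80.0, 85.0, 90.0, 95.0]
base_noc_ref = [70.0, 75.0, 80.0, 85.0]

t_stat, df = pooled_t(higher_noc_ref, base_noc_ref)
```

For a two-tailed test, the absolute value of `t_stat` would then be compared against the t distribution with `df` degrees of freedom to obtain a p-value; that step is left out here because it needs a statistics library.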

Conclusions
1
The support for H1 implies that models
with more concepts will lead to more
accurate conceptualizations of the
underlying domain.
2
The significant support for H2 implies
that adding more concepts to a model
increases the time it takes to map the
schema back to the problem.
3
The support for the converse of H3(a) at
the 10% level indicates that models with
a larger NOC are learned faster than
models with a smaller NOC.
4
The lack of support for H3(b) indicates
that the reduction in the time required
is not affected by the NOC in the model;
no significant difference was found.
5
This study takes a step towards
empirical validation of the theory
regarding the readability of conceptual
models.
6
The study extends the
operationalization of the readability
construct to incorporate learnability,
effectiveness, and efficiency.

thanks!
Any questions?
