
Solving problems on the example of
the use of pharmacy pharmacy
database “Pharmacy”.Software to
create presentations and office basics
of programming.

This page intentionally left blank

chapter
7
Data Modeling Using the
Entity-Relationship (ER) Model
onceptual modeling is a very important phase in
C
designing a successful database application.
Generally, the term database application refers to a particular database and the
associated programs that implement the database queries and updates. For exam-
ple, a BANK database application that keeps track of customer accounts would
include programs that implement database updates corresponding to customer
deposits and withdrawals. These programs provide user-friendly graphical user
interfaces (GUIs) utilizing forms and menus for the end users of the application—
the bank tellers, in this example. Hence, a major part of the database application will
require the design, implementation, and testing of these application programs.
Traditionally, the design and testing of application programs has been considered
to be part of software engineering rather than database design. In many software
design tools, the database design methodologies and software engineering method-
ologies are intertwined since these activities are strongly related.
In this chapter, we follow the traditional approach of concentrating on the database
structures and constraints during conceptual database design. The design of appli-
cation programs is typically covered in software engineering courses. We present the
modeling concepts of the Entity-Relationship (ER) model, which is a popular
high-level conceptual data model. This model and its variations are frequently used
for the conceptual design of database applications, and many database design tools
employ its concepts. We describe the basic data-structuring concepts and con-
straints of the ER model and discuss their use in the design of conceptual schemas
for database applications. We also present the diagrammatic notation associated
with the ER model, known as ER diagrams.
199

200
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
Object modeling methodologies such as the Unified Modeling Language (UML)
are becoming increasingly popular in both database and software design. These
methodologies go beyond database design to specify detailed design of software
modules and their interactions using various types of diagrams. An important part
of these methodologies—namely, class diagrams1—are similar in many ways to the
ER diagrams. In class diagrams, operations on objects are specified, in addition to
specifying the database schema structure. Operations can be used to specify the
functional requirements during database design, as we will discuss in Section 7.1. We
present some of the UML notation and concepts for class diagrams that are partic-
ularly relevant to database design in Section 7.8, and briefly compare these to ER
notation and concepts. Additional UML notation and concepts are presented in
Section 8.6 and in Chapter 10.
This chapter is organized as follows: Section 7.1 discusses the role of high-level con-
ceptual data models in database design. We introduce the requirements for a sample
database application in Section 7.2 to illustrate the use of concepts from the ER
model. This sample database is also used throughout the book. In Section 7.3 we
present the concepts of entities and attributes, and we gradually introduce the dia-
grammatic technique for displaying an ER schema. In Section 7.4 we introduce the
concepts of binary relationships and their roles and structural constraints. Section
7.5 introduces weak entity types. Section 7.6 shows how a schema design is refined
to include relationships. Section 7.7 reviews the notation for ER diagrams, summa-
rizes the issues and common pitfalls that occur in schema design, and discusses how
to choose the names for database schema constructs. Section 7.8 introduces some
UML class diagram concepts, compares them to ER model concepts, and applies
them to the same database example. Section 7.9 discusses more complex types of
relationships. Section 7.10 summarizes the chapter.
The material in Sections 7.8 and 7.9 may be excluded from an introductory course. If
a more thorough coverage of data modeling concepts and conceptual database design
is desired, the reader should continue to Chapter 8, where we describe extensions to
the ER model that lead to the Enhanced-ER (EER) model, which includes concepts
such as specialization, generalization, inheritance, and union types (categories). We
also introduce some additional UML concepts and notation in Chapter 8.
7.1 Using High-Level Conceptual Data Models
for Database Design
Figure 7.1 shows a simplified overview of the database design process. The first step
shown is requirements collection and analysis. During this step, the database
designers interview prospective database users to understand and document their
data requirements. The result of this step is a concisely written set of users’ require-
ments. These requirements should be specified in as detailed and complete a form
as possible. In parallel with specifying the data requirements, it is useful to specify
1A class is similar to an entity type in many ways.

7.1
Using High-Level Conceptual Data Models for Database Design
201
Miniworld
REQUIREMENTS
COLLECTION AND
ANALYSIS
Functional Requirements
Data Requirements
FUNCTIONAL ANALYSIS
CONCEPTUAL DESIGN
High-Level Transaction
Conceptual Schema
Specification
(In a high-level data model)
DBMS-independent
LOGICAL DESIGN
DBMS-specific
(DATA MODEL MAPPING)
Logical (Conceptual) Schema
APPLICATION PROGRAM
(In the data model of a specific DBMS)
DESIGN
PHYSICAL DESIGN
TRANSACTION
Internal Schema
IMPLEMENTATION
Figure 7.1
A simplified diagram to illustrate the
Application Programs
main phases of database design.
the known functional requirements of the application. These consist of the user-
defined operations (or transactions) that will be applied to the database, including
both retrievals and updates. In software design, it is common to use data flow dia-
grams, sequence diagrams, scenarios, and other techniques to specify functional
requirements. We will not discuss any of these techniques here; they are usually
described in detail in software engineering texts. We give an overview of some of
these techniques in Chapter 10.
Once the requirements have been collected and analyzed, the next step is to create a
conceptual schema for the database, using a high-level conceptual data model. This
step is called conceptual design. The conceptual schema is a concise description of

202
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
the data requirements of the users and includes detailed descriptions of the entity
types, relationships, and constraints; these are expressed using the concepts pro-
vided by the high-level data model. Because these concepts do not include imple-
mentation details, they are usually easier to understand and can be used to
communicate with nontechnical users. The high-level conceptual schema can also
be used as a reference to ensure that all users’ data requirements are met and that the
requirements do not conflict. This approach enables database designers to concen-
trate on specifying the properties of the data, without being concerned with storage
and implementation details. This makes it is easier to create a good conceptual data-
base design.
During or after the conceptual schema design, the basic data model operations can
be used to specify the high-level user queries and operations identified during func-
tional analysis. This also serves to confirm that the conceptual schema meets all the
identified functional requirements. Modifications to the conceptual schema can be
introduced if some functional requirements cannot be specified using the initial
schema.
The next step in database design is the actual implementation of the database, using
a commercial DBMS. Most current commercial DBMSs use an implementation
data model—such as the relational or the object-relational database model—so the
conceptual schema is transformed from the high-level data model into the imple-
mentation data model. This step is called logical design or data model mapping; its
result is a database schema in the implementation data model of the DBMS. Data
model mapping is often automated or semiautomated within the database design
tools.
The last step is the physical design phase, during which the internal storage struc-
tures, file organizations, indexes, access paths, and physical design parameters for
the database files are specified. In parallel with these activities, application programs
are designed and implemented as database transactions corresponding to the high-
level transaction specifications. We discuss the database design process in more
detail in Chapter 10.
We present only the basic ER model concepts for conceptual schema design in this
chapter. Additional modeling concepts are discussed in Chapter 8, when we intro-
duce the EER model.
7.2 A Sample Database Application
In this section we describe a sample database application, called COMPANY, which
serves to illustrate the basic ER model concepts and their use in schema design. We
list the data requirements for the database here, and then create its conceptual
schema step-by-step as we introduce the modeling concepts of the ER model. The
COMPANY database keeps track of a company’s employees, departments, and proj-
ects. Suppose that after the requirements collection and analysis phase, the database
designers provide the following description of the miniworld—the part of the com-
pany that will be represented in the database.

7.3
Entity Types, Entity Sets, Attributes, and Keys
203
■ The company is organized into departments. Each department has a unique
name, a unique number, and a particular employee who manages the
department. We keep track of the start date when that employee began man-
aging the department. A department may have several locations.
■ A department controls a number of projects, each of which has a unique
name, a unique number, and a single location.
■ We store each employee’s name, Social Security number,2 address, salary, sex
(gender), and birth date. An employee is assigned to one department, but
may work on several projects, which are not necessarily controlled by the
same department. We keep track of the current number of hours per week
that an employee works on each project. We also keep track of the direct
supervisor of each employee (who is another employee).
■ We want to keep track of the dependents of each employee for insurance
purposes. We keep each dependent’s first name, sex, birth date, and relation-
ship to the employee.
Figure 7.2 shows how the schema for this database application can be displayed by
means of the graphical notation known as ER diagrams. This figure will be
explained gradually as the ER model concepts are presented. We describe the step-
by-step process of deriving this schema from the stated requirements—and explain
the ER diagrammatic notation—as we introduce the ER model concepts.
7.3 Entity Types, Entity Sets, Attributes,
and Keys
The ER model describes data as entities, relationships, and attributes. In Section 7.3.1
we introduce the concepts of entities and their attributes. We discuss entity types
and key attributes in Section 7.3.2. Then, in Section 7.3.3, we specify the initial con-
ceptual design of the entity types for the COMPANY database. Relationships are
described in Section 7.4.
7.3.1
Entities and Attributes
Entities and Their Attributes. The basic object that the ER model represents is
an entity, which is a thing in the real world with an independent existence. An entity
may be an object with a physical existence (for example, a particular person, car,
house, or employee) or it may be an object with a conceptual existence (for instance,
a company, a job, or a university course). Each entity has attributes—the particular
properties that describe it. For example, an EMPLOYEE entity may be described by
the employee’s name, age, address, salary, and job. A particular entity will have a
2The Social Security number, or SSN, is a unique nine-digit identifier assigned to each individual in the
United States to keep track of his or her employment, benefits, and taxes. Other countries may have
similar identification schemes, such as personal identification card numbers.

204
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
Fname
Minit
Lname
Bdate
Name
Address
Salary
Ssn
Sex
Locations
N
1
WORKS_FOR
Name
Number
EMPLOYEE
Start_date
Number_of_employees
DEPARTMENT
1
1
1
MANAGES
CONTROLS
Hours
N
M
N
WORKS_ON
PROJECT
Supervisor
Supervisee
1
1
Name
SUPERVISION
N
Location
Number
DEPENDENTS_OF
N
DEPENDENT
Name
Sex
Birth_date
Relationship
Figure 7.2
An ER schema diagram for the COMPANY database. The diagrammatic notation
is introduced gradually throughout this chapter and is summarized in Figure 7.14.
value for each of its attributes. The attribute values that describe each entity become
a major part of the data stored in the database.
Figure 7.3 shows two entities and the values of their attributes. The EMPLOYEE
entity e1 has four attributes: Name, Address, Age, and Home_phone; their values are
‘John Smith,’ ‘2311 Kirby, Houston, Texas 77001’, ‘55’, and ‘713-749-2630’, respec-
tively. The COMPANY entity c1 has three attributes: Name, Headquarters, and
President; their values are ‘Sunco Oil’, ‘Houston’, and ‘John Smith’, respectively.
Several types of attributes occur in the ER model: simple versus composite, single-
valued versus multivalued, and stored versus derived. First we define these attribute

7.3
Entity Types, Entity Sets, Attributes, and Keys
205
Name = Sunco Oil
Name = John Smith
Address = 2311 Kirby
Houston, Texas 77001
e1
c1
Headquarters = Houston
Figure 7.3
Age = 55
Two entities,
EMPLOYEE e1, and
COMPANY c1, and
Home_phone = 713-749-2630
President = John Smith
their attributes.
types and illustrate their use via examples. Then we discuss the concept of a NULL
value for an attribute.
Composite versus Simple (Atomic) Attributes. Composite attributes can be
divided into smaller subparts, which represent more basic attributes with indepen-
dent meanings. For example, the Address attribute of the EMPLOYEE entity shown
in Figure 7.3 can be subdivided into Street_address, City, State, and Zip,3 with the
values ‘2311 Kirby’, ‘Houston’, ‘Texas’, and ‘77001.’ Attributes that are not divisible
are called simple or atomic attributes. Composite attributes can form a hierarchy;
for example, Street_address can be further subdivided into three simple component
attributes: Number, Street, and Apartment_number, as shown in Figure 7.4. The value
of a composite attribute is the concatenation of the values of its component simple
attributes.
Composite attributes are useful to model situations in which a user sometimes
refers to the composite attribute as a unit but at other times refers specifically to its
components. If the composite attribute is referenced only as a whole, there is no
Address
Figure 7.4
A hierarchy of composite
attributes.
Street_address
City
State
Zip
Number
Street
Apartment_number
3Zip Code is the name used in the United States for a five-digit postal code, such as 76019, which can
be extended to nine digits, such as 76019-0015. We use the five-digit Zip in our examples.

206
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
need to subdivide it into component attributes. For example, if there is no need to
refer to the individual components of an address (Zip Code, street, and so on), then
the whole address can be designated as a simple attribute.
Single-Valued versus Multivalued Attributes. Most attributes have a single
value for a particular entity; such attributes are called single-valued. For example,
Age is a single-valued attribute of a person. In some cases an attribute can have a set
of values for the same entity—for instance, a Colors attribute for a car, or a
College_degrees attribute for a person. Cars with one color have a single value,
whereas two-tone cars have two color values. Similarly, one person may not have a
college degree, another person may have one, and a third person may have two or
more degrees; therefore, different people can have different numbers of values for
the College_degrees attribute. Such attributes are called multivalued. A multivalued
attribute may have lower and upper bounds to constrain the number of values
allowed for each individual entity. For example, the Colors attribute of a car may be
restricted to have between one and three values, if we assume that a car can have
three colors at most.
Stored versus Derived Attributes. In some cases, two (or more) attribute val-
ues are related—for example, the Age and Birth_date attributes of a person. For a
particular person entity, the value of Age can be determined from the current
(today’s) date and the value of that person’s Birth_date. The Age attribute is hence
called a derived attribute and is said to be derivable from the Birth_date attribute,
which is called a stored attribute. Some attribute values can be derived from
related entities; for example, an attribute Number_of_employees of a DEPARTMENT
entity can be derived by counting the number of employees related to (working
for) that department.
NULL Values. In some cases, a particular entity may not have an applicable value
for an attribute. For example, the Apartment_number attribute of an address applies
only to addresses that are in apartment buildings and not to other types of resi-
dences, such as single-family homes. Similarly, a College_degrees attribute applies
only to people with college degrees. For such situations, a special value called NULL
is created. An address of a single-family home would have NULL for its
Apartment_number attribute, and a person with no college degree would have NULL
for College_degrees. NULL can also be used if we do not know the value of an attrib-
ute for a particular entity—for example, if we do not know the home phone num-
ber of ‘John Smith’ in Figure 7.3. The meaning of the former type of NULL is not
applicable, whereas the meaning of the latter is unknown. The unknown category of
NULL can be further classified into two cases. The first case arises when it is known
that the attribute value exists but is missing—for instance, if the Height attribute of a
person is listed as NULL. The second case arises when it is not known whether the
attribute value exists—for example, if the Home_phone attribute of a person is NULL.
Complex Attributes. Notice that, in general, composite and multivalued attrib-
utes can be nested arbitrarily. We can represent arbitrary nesting by grouping com-

7.3
Entity Types, Entity Sets, Attributes, and Keys
207
ponents of a composite attribute between parentheses () and separating the compo-
nents with commas, and by displaying multivalued attributes between braces { }.
Such attributes are called complex attributes. For example, if a person can have
more than one residence and each residence can have a single address and multiple
phones, an attribute Address_phone for a person can be specified as shown in Figure
7.5.4 Both Phone and Address are themselves composite attributes.
7.3.2
Entity Types, Entity Sets, Keys, and Value Sets
Entity Types and Entity Sets. A database usually contains groups of entities that
are similar. For example, a company employing hundreds of employees may want to
store similar information concerning each of the employees. These employee entities
share the same attributes, but each entity has its own value(s) for each attribute. An
entity type defines a collection (or set) of entities that have the same attributes. Each
entity type in the database is described by its name and attributes. Figure 7.6 shows
two entity types: EMPLOYEE and COMPANY, and a list of some of the attributes for
{Address_phone( {Phone(Area_code,Phone_number)},Address(Street_address
Figure 7.5
(Number,Street,Apartment_number),City,State,Zip) )}
A complex attribute:
Address_phone.
Entity Type Name:
EMPLOYEE
COMPANY
Figure 7.6
Two entity types,
Name, Age, Salary
Name, Headquarters, President
EMPLOYEE and
COMPANY, and some
e1
c1
member entities of
each.
(John Smith, 55, 80k)
(Sunco Oil, Houston, John Smith)
e2
c2
Entity Set:
(Fred Brown, 40, 30K)
(Fast Computer, Dallas, Bob King)
(Extension)
e3
(Judy Clark, 25, 20K)
4For those familiar with XML, we should note that complex attributes are similar to complex elements in
XML (see Chapter 12).

208
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
each. A few individual entities of each type are also illustrated, along with the values
of their attributes. The collection of all entities of a particular entity type in the data-
base at any point in time is called an entity set; the entity set is usually referred to
using the same name as the entity type. For example, EMPLOYEE refers to both a type
of entity as well as the current set of all employee entities in the database.
An entity type is represented in ER diagrams5 (see Figure 7.2) as a rectangular box
enclosing the entity type name. Attribute names are enclosed in ovals and are
attached to their entity type by straight lines. Composite attributes are attached to
their component attributes by straight lines. Multivalued attributes are displayed in
double ovals. Figure 7.7(a) shows a CAR entity type in this notation.
An entity type describes the schema or intension for a set of entities that share the
same structure. The collection of entities of a particular entity type is grouped into
an entity set, which is also called the extension of the entity type.
Key Attributes of an Entity Type. An important constraint on the entities of an
entity type is the key or uniqueness constraint on attributes. An entity type usually
Figure 7.7
(a)
State
Number
The CAR entity type
with two key attributes,
Registration
Vehicle_id
Registration and
Vehicle_id. (a) ER
diagram notation. (b)
Year
CAR
Model
Entity set with three
entities.
Color
Make
(b)
CAR
Registration (Number, State), Vehicle_id, Make, Model, Year, {Color}
CAR1
((ABC 123, TEXAS), TK629, Ford Mustang, convertible, 2004 {red, black})
CAR2
((ABC 123, NEW YORK), WP9872, Nissan Maxima, 4-door, 2005, {blue})
CAR3
((VSY 720, TEXAS), TD729, Chrysler LeBaron, 4-door, 2002, {white, blue})
5We use a notation for ER diagrams that is close to the original proposed notation (Chen 1976). Many
other notations are in use; we illustrate some of them later in this chapter when we present UML class
diagrams and in Appendix A.

7.3
Entity Types, Entity Sets, Attributes, and Keys
209
has one or more attributes whose values are distinct for each individual entity in the
entity set. Such an attribute is called a key attribute, and its values can be used to
identify each entity uniquely. For example, the Name attribute is a key of the
COMPANY entity type in Figure 7.6 because no two companies are allowed to have
the same name. For the PERSON entity type, a typical key attribute is Ssn (Social
Security number). Sometimes several attributes together form a key, meaning that
the combination of the attribute values must be distinct for each entity. If a set of
attributes possesses this property, the proper way to represent this in the ER model
that we describe here is to define a composite attribute and designate it as a key
attribute of the entity type. Notice that such a composite key must be minimal; that
is, all component attributes must be included in the composite attribute to have the
uniqueness property. Superfluous attributes must not be included in a key. In ER
diagrammatic notation, each key attribute has its name underlined inside the oval,
as illustrated in Figure 7.7(a).
Specifying that an attribute is a key of an entity type means that the preceding
uniqueness property must hold for every entity set of the entity type. Hence, it is a
constraint that prohibits any two entities from having the same value for the key
attribute at the same time. It is not the property of a particular entity set; rather, it is
a constraint on any entity set of the entity type at any point in time. This key con-
straint (and other constraints we discuss later) is derived from the constraints of the
miniworld that the database represents.
Some entity types have more than one key attribute. For example, each of the
Vehicle_id and Registration attributes of the entity type CAR (Figure 7.7) is a key in its
own right. The Registration attribute is an example of a composite key formed from
two simple component attributes, State and Number, neither of which is a key on its
own. An entity type may also have no key, in which case it is called a weak entity type
(see Section 7.5).
In our diagrammatic notation, if two attributes are underlined separately, then each
is a key on its own. Unlike the relational model (see Section 3.2.2), there is no con-
cept of primary key in the ER model that we present here; the primary key will be
chosen during mapping to a relational schema (see Chapter 9).
Value Sets (Domains) of Attributes. Each simple attribute of an entity type is
associated with a value set (or domain of values), which specifies the set of values
that may be assigned to that attribute for each individual entity. In Figure 7.6, if the
range of ages allowed for employees is between 16 and 70, we can specify the value
set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16
and 70. Similarly, we can specify the value set for the Name attribute to be the set of
strings of alphabetic characters separated by blank characters, and so on. Value sets
are not displayed in ER diagrams, and are typically specified using the basic data
types available in most programming languages, such as integer, string, Boolean,
float, enumerated type, subrange, and so on. Additional data types to represent
common database types such as date, time, and other concepts are also employed.

210
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
Mathematically, an attribute A of entity set E whose value set is V can be defined as
a function from E to the power set6 P(V ) of V:
A : E → P(V)
We refer to the value of attribute A for entity e as A(e). The previous definition cov-
ers both single-valued and multivalued attributes, as well as NULLs. A NULL value is
represented by the empty set. For single-valued attributes, A(e) is restricted to being
a singleton set for each entity e in E, whereas there is no restriction on multivalued
attributes.7 For a composite attribute A, the value set V is the power set of the
Cartesian product of P(V1), P(V2), …, P(Vn), where V1, V2, …, Vn are the value sets
of the simple component attributes that form A:
V = P (P(V1) × P(V2) × … × P(Vn))
The value set provides all possible values. Usually only a small number of these val-
ues exist in the database at a particular time. Those values represent the data from
the current state of the miniworld. They correspond to the data as it actually exists
in the miniworld.
7.3.3
Initial Conceptual Design of the COMPANY Database
We caow define the entity types for the COMPANY database, based on the
requirements described in Section 7.2. After defining several entity types and their
attributes here, we refine our design in Section 7.4 after we introduce the concept of
a relationship. According to the requirements listed in Section 7.2, we can identify
four entity types—one corresponding to each of the four items in the specification
(see Figure 7.8):
1. An entity type DEPARTMENT with attributes Name, Number, Locations,
Manager, and Manager_start_date. Locations is the only multivalued attribute.
We can specify that both Name and Number are (separate) key attributes
because each was specified to be unique.
2. An entity type PROJECT with attributes Name, Number, Location, and
Controlling_department. Both Name and Number are (separate) key attributes.
3. An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address, Salary,
Birth_date, Department, and Supervisor. Both Name and Address may be com-
posite attributes; however, this was not specified in the requirements. We
must go back to the users to see if any of them will refer to the individual
components of Name—First_name, Middle_initial, Last_name—or of Address.
4. An entity type DEPENDENT with attributes Employee, Dependent_name, Sex,
Birth_date, and Relationship (to the employee).
6The power set P (V ) of a set V is the set of all subsets of V.
7A singleton set is a set with only one element (value).

7.3
Entity Types, Entity Sets, Attributes, and Keys
211
Name
Number
Locations
DEPARTMENT
Manager
Manager_start_date
Name
Number
Location
PROJECT
Controlling_department
Fname
Minit
Lname
Project
Hours
Sex
Name
Works_on
Ssn
Salary
Department
Supervisor
EMPLOYEE
Birth_date
Address
Figure 7.8
Birth_date
Sex
Employee
Preliminary design of entity types
for the COMPANY database.
Relationship
Dependent_name
Some of the shown attributes will
DEPENDENT
be refined into relationships.
So far, we have not represented the fact that an employee can work on several proj-
ects, nor have we represented the number of hours per week an employee works on
each project. This characteristic is listed as part of the third requirement in Section
7.2, and it can be represented by a multivalued composite attribute of EMPLOYEE
called Works_on with the simple components (Project, Hours). Alternatively, it can be
represented as a multivalued composite attribute of PROJECT called Workers with
the simple components (Employee, Hours). We choose the first alternative in Figure
7.8, which shows each of the entity types just described. The Name attribute of
EMPLOYEE is shown as a composite attribute, presumably after consultation with
the users.

212
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
7.4 Relationship Types, Relationship Sets,
Roles, and Structural Constraints
In Figure 7.8 there are several implicit relationships among the various entity types.
In fact, whenever an attribute of one entity type refers to another entity type, some
relationship exists. For example, the attribute Manager of DEPARTMENT refers to an
employee who manages the department; the attribute Controlling_department of
PROJECT refers to the department that controls the project; the attribute Supervisor
of EMPLOYEE refers to another employee (the one who supervises this employee);
the attribute Department of EMPLOYEE refers to the department for which the
employee works; and so on. In the ER model, these references should not be repre-
sented as attributes but as relationships, which are discussed in this section. The
COMPANY database schema will be refined in Section 7.6 to represent relationships
explicitly. In the initial design of entity types, relationships are typically captured in
the form of attributes. As the design is refined, these attributes get converted into
relationships between entity types.
This section is organized as follows: Section 7.4.1 introduces the concepts of rela-
tionship types, relationship sets, and relationship instances. We define the concepts
of relationship degree, role names, and recursive relationships in Section 7.4.2, and
then we discuss structural constraints on relationships—such as cardinality ratios
and existence dependencies—in Section 7.4.3. Section 7.4.4 shows how relationship
types can also have attributes.
7.4.1
Relationship Types, Sets, and Instances
A relationship type R among n entity types E1, E2, …, En defines a set of associa-
tions—or a relationship set—among entities from these entity types. As for the
case of entity types and entity sets, a relationship type and its corresponding rela-
tionship set are customarily referred to by the same name, R. Mathematically, the
relationship set R is a set of relationship instances ri, where each ri associates n
individual entities (e1, e2, …, en), and each entity ej in ri is a member of entity set Ej,
1 f j f n. Hence, a relationship set is a mathematical relation on E1, E2, …, En; alter-
natively, it can be defined as a subset of the Cartesian product of the entity sets E1 ×
E2 × … × En. Each of the entity types E1, E2, …, En is said to participate in the rela-
tionship type R; similarly, each of the individual entities e1, e2, …, en is said to
participate in the relationship instance ri = (e1, e2, …, en).
Informally, each relationship instance ri in R is an association of entities, where the
association includes exactly one entity from each participating entity type. Each
such relationship instance ri represents the fact that the entities participating in ri
are related in some way in the corresponding miniworld situation. For example,
consider a relationship type WORKS_FOR between the two entity types EMPLOYEE
and DEPARTMENT, which associates each employee with the department for which
the employee works in the corresponding entity set. Each relationship instance in
the relationship set WORKS_FOR associates one EMPLOYEE entity and one
DEPARTMENT entity. Figure 7.9 illustrates this example, where each relationship

7.4
Relationship Types, Relationship Sets, Roles, and Structural Constraints
213
EMPLOYEE
WORKS_FOR
DEPARTMENT
r1
e1
d1
r2
e2
d2
e3
r3
d3
e4
r4
e5
r5
e6
e7
Figure 7.9
r6
Some instances in the
WORKS_FOR relationship
r7
set, which represents a
relationship type
WORKS_FOR between
EMPLOYEE and
DEPARTMENT.
instance ri is shown connected to the EMPLOYEE and DEPARTMENT entities that
participate in ri. In the miniworld represented by Figure 7.9, employees e1, e3, and e6
work for department d1; employees e2 and e4 work for department d2; and employ-
ees e5 and e7 work for department d3.
In ER diagrams, relationship types are displayed as diamond-shaped boxes, which
are connected by straight lines to the rectangular boxes representing the participat-
ing entity types. The relationship name is displayed in the diamond-shaped box (see
Figure 7.2).
7.4.2
Relationship Degree, Role Names,
and Recursive Relationships
Degree of a Relationship Type. The degree of a relationship type is the number
of participating entity types. Hence, the WORKS_FOR relationship is of degree two.
A relationship type of degree two is called binary, and one of degree three is called
ternary. An example of a ternary relationship is SUPPLY, shown in Figure 7.10,
where each relationship instance ri associates three entities—a supplier s, a part p,
and a project j—whenever s supplies part p to project j. Relationships can generally
be of any degree, but the ones most common are binary relationships. Higher-
degree relationships are generally more complex than binary relationships; we char-
acterize them further in Section 7.9.

214
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
SUPPLIER
SUPPLY
PROJECT
s1
r1
j1
s2
r2
j2
r3
j3
r4
PART
r5
p1
p2
r6
p3
r7
Figure 7.10
Some relationship instances in
the SUPPLY ternary relationship
set.
Relationships as Attributes. It is sometimes convenient to think of a binary
relationship type in terms of attributes, as we discussed in Section 7.3.3. Consider
the WORKS_FOR relationship type in Figure 7.9. One can think of an attribute
called Department of the EMPLOYEE entity type, where the value of Department for
each EMPLOYEE entity is (a reference to) the DEPARTMENT entity for which that
employee works. Hence, the value set for this Department attribute is the set of all
DEPARTMENT entities, which is the DEPARTMENT entity set. This is what we did in
Figure 7.8 when we specified the initial design of the entity type EMPLOYEE for the
COMPANY database. However, when we think of a binary relationship as an attrib-
ute, we always have two options. In this example, the alternative is to think of a mul-
tivalued attribute Employee of the entity type DEPARTMENT whose values for each
DEPARTMENT entity is the set of EMPLOYEE entities who work for that department.
The value set of this Employee attribute is the power set of the EMPLOYEE entity set.
Either of these two attributes—Department of EMPLOYEE or Employee of
DEPARTMENT—can represent the WORKS_FOR relationship type. If both are repre-
sented, they are constrained to be inverses of each other.8
8This concept of representing relationship types as attributes is used in a class of data models called
functional data models. In object databases (see Chapter 11), relationships can be represented by ref-
erence attributes, either in one direction or in both directions as inverses. In relational databases (see
Chapter 3), foreign keys are a type of reference attribute used to represent relationships.

7.4
Relationship Types, Relationship Sets, Roles, and Structural Constraints
215
Role Names and Recursive Relationships. Each entity type that participates
in a relationship type plays a particular role in the relationship. The role name sig-
nifies the role that a participating entity from the entity type plays in each relation-
ship instance, and helps to explain what the relationship means. For example, in the
WORKS_FOR relationship type, EMPLOYEE plays the role of employee or worker and
DEPARTMENT plays the role of department or employer.
Role names are not technically necessary in relationship types where all the partici-
pating entity types are distinct, since each participating entity type name can be
used as the role name. However, in some cases the same entity type participates
more than once in a relationship type in different roles. In such cases the role name
becomes essential for distinguishing the meaning of the role that each participating
entity plays. Such relationship types are called recursive relationships. Figure 7.11
shows an example. The SUPERVISION relationship type relates an employee to a
supervisor, where both employee and supervisor entities are members of the same
EMPLOYEE entity set. Hence, the EMPLOYEE entity type participates twice in
SUPERVISION: once in the role of supervisor (or boss), and once in the role of
supervisee (or subordinate). Each relationship instance ri in SUPERVISION associates
two employee entities ej and ek, one of which plays the role of supervisor and the
other the role of supervisee. In Figure 7.11, the lines marked ‘1’ represent the super-
visor role, and those marked ‘2’ represent the supervisee role; hence, e1 supervises e2
and e3, e4 supervises e6 and e7, and e5 supervises e1 and e4. In this example, each rela-
tionship instance must be connected with two lines, one marked with ‘1’ (supervi-
sor) and the other with ‘2’ (supervisee).
EMPLOYEE
SUPERVISION
Figure 7.11
A recursive relationship
SUPERVISION between
r1
EMPLOYEE in the
e1
2
supervisor role (1) and
1
r2
EMPLOYEE in the
e2
2
subordinate role (2).
e3
1
r3
2
1
e4
2
r4
e5
1
e6
1
r5
2
e7
1
2
r6

216
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
7.4.3
Constraints on Binary Relationship Types
Relationship types usually have certain constraints that limit the possible combina-
tions of entities that may participate in the corresponding relationship set. These
constraints are determined from the miniworld situation that the relationships rep-
resent. For example, in Figure 7.9, if the company has a rule that each employee
must work for exactly one department, then we would like to describe this con-
straint in the schema. We can distinguish two main types of binary relationship
constraints: cardinality ratio and participation.
Cardinality Ratios for Binary Relationships. The cardinality ratio for a binary
relationship specifies the maximum number of relationship instances that an entity
can participate in. For example, in the WORKS_FOR binary relationship type,
DEPARTMENT:EMPLOYEE is of cardinality ratio 1:N, meaning that each department
can be related to (that is, employs) any number of employees,9 but an employee can
be related to (work for) only one department. This means that for this particular
relationship WORKS_FOR, a particular department entity can be related to any
number of employees (N indicates there is no maximum number). On the other
hand, an employee can be related to a maximum of one department. The possible
cardinality ratios for binary relationship types are 1:1, 1:N, N:1, and M:N.
An example of a 1:1 binary relationship is MANAGES (Figure 7.12), which relates a
department entity to the employee who manages that department. This represents
the miniworld constraints that—at any point in time—an employee can manage
one department only and a department can have one manager only. The relation-
ship type WORKS_ON (Figure 7.13) is of cardinality ratio M:N, because the mini-
Figure 7.12
EMPLOYEE
MANAGES
DEPARTMENT
A 1:1 relationship,
MANAGES.
e1
e2
r1
d
1
e3
r2
e4
d2
e5
r3
d
3
e6
e7
9N stands for any number of related entities (zero or more).

7.4
Relationship Types, Relationship Sets, Roles, and Structural Constraints
217
EMPLOYEE
WORKS_ON
PROJECT
e1
r1
p1
e2
r2
p2
e3
e4
r3
p3
r4
p4
r5
r6
r7
Figure 7.13
An M:N relationship,
WORKS_ON.
world rule is that an employee can work on several projects and a project can have
several employees.
Cardinality ratios for binary relationships are represented on ER diagrams by dis-
playing 1, M, and N on the diamonds as shown in Figure 7.2. Notice that in this
notation, we can either specify no maximum (N) or a maximum of one (1) on par-
ticipation. An alternative notation (see Section 7.7.4) allows the designer to specify
a specific maximum number on participation, such as 4 or 5.
Participation Constraints and Existence Dependencies. The participation
constraint specifies whether the existence of an entity depends on its being related
to another entity via the relationship type. This constraint specifies the minimum
number of relationship instances that each entity can participate in, and is some-
times called the minimum cardinality constraint. There are two types of participa-
tion constraints—total and partial—that we illustrate by example. If a company
policy states that every employee must work for a department, then an employee
entity can exist only if it participates in at least one WORKS_FOR relationship
instance (Figure 7.9). Thus, the participation of EMPLOYEE in WORKS_FOR is
called total participation, meaning that every entity in the total set of employee
entities must be related to a department entity via WORKS_FOR. Total participation
is also called existence dependency. In Figure 7.12 we do not expect every employee
to manage a department, so the participation of EMPLOYEE in the MANAGES rela-
tionship type is partial, meaning that some or part of the set of employee entities are
related to some department entity via MANAGES, but not necessarily all. We will

218
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
refer to the cardinality ratio and participation constraints, taken together, as the
structural constraints of a relationship type.
In ER diagrams, total participation (or existence dependency) is displayed as a
double line connecting the participating entity type to the relationship, whereas par-
tial participation is represented by a single line (see Figure 7.2). Notice that in this
notation, we can either specify no minimum (partial participation) or a minimum
of one (total participation). The alternative notation (see Section 7.7.4) allows the
designer to specify a specific minimum number on participation in the relationship,
such as 4 or 5.
We will discuss constraints on higher-degree relationships in Section 7.9.
7.4.4
Attributes of Relationship Types
Relationship types can also have attributes, similar to those of entity types. For
example, to record the number of hours per week that an employee works on a par-
ticular project, we can include an attribute Hours for the WORKS_ON relationship
type in Figure 7.13. Another example is to include the date on which a manager
started managing a department via an attribute Start_date for the MANAGES rela-
tionship type in Figure 7.12.
Notice that attributes of 1:1 or 1:N relationship types can be migrated to one of the
participating entity types. For example, the Start_date attribute for the MANAGES
relationship can be an attribute of either EMPLOYEE or DEPARTMENT, although
conceptually it belongs to MANAGES. This is because MANAGES is a 1:1 relation-
ship, so every department or employee entity participates in at most one relationship
instance. Hence, the value of the Start_date attribute can be determined separately,
either by the participating department entity or by the participating employee
(manager) entity.
For a 1:N relationship type, a relationship attribute can be migrated only to the
entity type on the N-side of the relationship. For example, in Figure 7.9, if the
WORKS_FOR relationship also has an attribute Start_date that indicates when an
employee started working for a department, this attribute can be included as an
attribute of EMPLOYEE. This is because each employee works for only one depart-
ment, and hence participates in at most one relationship instance in WORKS_FOR.
In both 1:1 and 1:N relationship types, the decision where to place a relationship
attribute—as a relationship type attribute or as an attribute of a participating entity
type—is determined subjectively by the schema designer.
For M:N relationship types, some attributes may be determined by the combination
of participating entities in a relationship instance, not by any single entity. Such
attributes must be specified as relationship attributes. An example is the Hours attrib-
ute of the M:N relationship WORKS_ON (Figure 7.13); the number of hours per
week an employee currently works on a project is determined by an employee-
project combination and not separately by either entity.

7.5
Weak Entity Types
219
7.5 Weak Entity Types
Entity types that do not have key attributes of their own are called weak entity
types. In contrast, regular entity types that do have a key attribute—which include
all the examples discussed so far—are called strong entity types. Entities belonging
to a weak entity type are identified by being related to specific entities from another
entity type in combination with one of their attribute values. We call this other
entity type the identifying or owner entity type,10 and we call the relationship type
that relates a weak entity type to its owner the identifying relationship of the weak
entity type.11 A weak entity type always has a total participation constraint (existence
dependency) with respect to its identifying relationship because a weak entity can-
not be identified without an owner entity. However, not every existence dependency
results in a weak entity type. For example, a DRIVER_LICENSE entity cannot exist
unless it is related to a PERSON entity, even though it has its own key
(License_number) and hence is not a weak entity.
Consider the entity type DEPENDENT, related to EMPLOYEE, which is used to keep
track of the dependents of each employee via a 1:N relationship (Figure 7.2). In our
example, the attributes of DEPENDENT are Name (the first name of the dependent),
Birth_date, Sex, and Relationship (to the employee). Two dependents of two distinct
employees may, by chance, have the same values for Name, Birth_date, Sex, and
Relationship, but they are still distinct entities. They are identified as distinct entities
only after determining the particular employee entity to which each dependent is
related. Each employee entity is said to own the dependent entities that are related
to it.
A weak entity type normally has a partial key, which is the attribute that can
uniquely identify weak entities that are related to the same owner entity.12 In our
example, if we assume that no two dependents of the same employee ever have the
same first name, the attribute Name of DEPENDENT is the partial key. In the worst
case, a composite attribute of all the weak entity’s attributes will be the partial key.
In ER diagrams, both a weak entity type and its identifying relationship are distin-
guished by surrounding their boxes and diamonds with double lines (see Figure
7.2). The partial key attribute is underlined with a dashed or dotted line.
Weak entity types can sometimes be represented as complex (composite, multival-
ued) attributes. In the preceding example, we could specify a multivalued attribute
Dependents for EMPLOYEE, which is a composite attribute with component attrib-
utes Name, Birth_date, Sex, and Relationship. The choice of which representation to
use is made by the database designer. One criterion that may be used is to choose the
10The identifying entity type is also sometimes called the parent entity type or the dominant entity
type.
11The weak entity type is also sometimes called the child entity type or the subordinate entity type.
12The partial key is sometimes called the discriminator.

220
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
weak entity type representation if there are many attributes. If the weak entity par-
ticipates independently in relationship types other than its identifying relationship
type, then it should not be modeled as a complex attribute.
In general, any number of levels of weak entity types can be defined; an owner
entity type may itself be a weak entity type. In addition, a weak entity type may have
more than one identifying entity type and an identifying relationship type of degree
higher than two, as we illustrate in Section 7.9.
7.6 Refining the ER Design for the COMPANY
Database
We caow refine the database design in Figure 7.8 by changing the attributes that
represent relationships into relationship types. The cardinality ratio and participa-
tion constraint of each relationship type are determined from the requirements
listed in Section 7.2. If some cardinality ratio or dependency cannot be determined
from the requirements, the users must be questioned further to determine these
structural constraints.
In our example, we specify the following relationship types:
■
MANAGES, a 1:1 relationship type between EMPLOYEE and DEPARTMENT.
EMPLOYEE participation is partial. DEPARTMENT participation is not clear
from the requirements. We question the users, who say that a department
must have a manager at all times, which implies total participation.13 The
attribute Start_date is assigned to this relationship type.
■
WORKS_FOR, a
1:N relationship type between DEPARTMENT and
EMPLOYEE. Both participations are total.
■
CONTROLS, a 1:N relationship type between DEPARTMENT and PROJECT.
The participation of PROJECT is total, whereas that of DEPARTMENT is
determined to be partial, after consultation with the users indicates that
some departments may control no projects.
■
SUPERVISION, a 1:N relationship type between EMPLOYEE (in the supervi-
sor role) and EMPLOYEE (in the supervisee role). Both participations are
determined to be partial, after the users indicate that not every employee is a
supervisor and not every employee has a supervisor.
■
WORKS_ON, determined to be an M:N relationship type with attribute
Hours, after the users indicate that a project can have several employees
working on it. Both participations are determined to be total.
■
DEPENDENTS_OF, a 1:N relationship type between EMPLOYEE and
DEPENDENT, which is also the identifying relationship for the weak entity
13The rules in the miniworld that determine the constraints are sometimes called the business rules,
since they are determined by the business or organization that will utilize the database.

7.7
ER Diagrams, Naming Conventions, and Design Issues
221
type DEPENDENT. The participation of EMPLOYEE is partial, whereas that of
DEPENDENT is total.
After specifying the above six relationship types, we remove from the entity types in
Figure 7.8 all attributes that have been refined into relationships. These include
Manager and Manager_start_date from DEPARTMENT; Controlling_department from
PROJECT; Department, Supervisor, and Works_on from EMPLOYEE; and Employee
from DEPENDENT. It is important to have the least possible redundancy when we
design the conceptual schema of a database. If some redundancy is desired at the
storage level or at the user view level, it can be introduced later, as discussed in
Section 1.6.1.
7.7 ER Diagrams, Naming Conventions,
and Design Issues
7.7.1
Summary of Notation for ER Diagrams
Figures 7.9 through 7.13 illustrate examples of the participation of entity types in
relationship types by displaying their sets or extensions—the individual entity
instances in an entity set and the individual relationship instances in a relationship
set. In ER diagrams the emphasis is on representing the schemas rather than the
instances. This is more useful in database design because a database schema changes
rarely, whereas the contents of the entity sets change frequently. In addition, the
schema is obviously easier to display, because it is much smaller.
Figure 7.2 displays the COMPANY ER database schema as an ER diagram. We now
review the full ER diagram notation. Entity types such as EMPLOYEE,
DEPARTMENT, and PROJECT are shown in rectangular boxes. Relationship types
such as WORKS_FOR, MANAGES, CONTROLS, and WORKS_ON are shown in
diamond-shaped boxes attached to the participating entity types with straight lines.
Attributes are shown in ovals, and each attribute is attached by a straight line to its
entity type or relationship type. Component attributes of a composite attribute are
attached to the oval representing the composite attribute, as illustrated by the Name
attribute of EMPLOYEE. Multivalued attributes are shown in double ovals, as illus-
trated by the Locations attribute of DEPARTMENT. Key attributes have their names
underlined. Derived attributes are shown in dotted ovals, as illustrated by the
Number_of_employees attribute of DEPARTMENT.
Weak entity types are distinguished by being placed in double rectangles and by
having their identifying relationship placed in double diamonds, as illustrated by
the DEPENDENT entity type and the DEPENDENTS_OF identifying relationship
type. The partial key of the weak entity type is underlined with a dotted line.
In Figure 7.2 the cardinality ratio of each binary relationship type is specified by
attaching a 1, M, or N on each participating edge. The cardinality ratio of
DEPARTMENT:EMPLOYEE in MANAGES is 1:1, whereas it is 1:N for DEPARTMENT:
EMPLOYEE in WORKS_FOR, and M:N for WORKS_ON. The participation

222
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
constraint is specified by a single line for partial participation and by double lines
for total participation (existence dependency).
In Figure 7.2 we show the role names for the SUPERVISION relationship type
because the same EMPLOYEE entity type plays two distinct roles in that relationship.
Notice that the cardinality ratio is 1:N from supervisor to supervisee because each
employee in the role of supervisee has at most one direct supervisor, whereas an
employee in the role of supervisor can supervise zero or more employees.
Figure 7.14 summarizes the conventions for ER diagrams. It is important to note
that there are many other alternative diagrammatic notations (see Section 7.7.4 and
Appendix A).
7.7.2
Proper Naming of Schema Constructs
When designing a database schema, the choice of names for entity types, attributes,
relationship types, and (particularly) roles is not always straightforward. One
should choose names that convey, as much as possible, the meanings attached to the
different constructs in the schema. We choose to use singular names for entity types,
rather than plural ones, because the entity type name applies to each individual
entity belonging to that entity type. In our ER diagrams, we will use the convention
that entity type and relationship type names are uppercase letters, attribute names
have their initial letter capitalized, and role names are lowercase letters. We have
used this convention in Figure 7.2.
As a general practice, given a narrative description of the database requirements, the
nouns appearing in the narrative tend to give rise to entity type names, and the verbs
tend to indicate names of relationship types. Attribute names generally arise from
additional nouns that describe the nouns corresponding to entity types.
Another naming consideration involves choosing binary relationship names to
make the ER diagram of the schema readable from left to right and from top to bot-
tom. We have generally followed this guideline in Figure 7.2. To explain this naming
convention further, we have one exception to the convention in Figure 7.2—the
DEPENDENTS_OF relationship type, which reads from bottom to top. When we
describe this relationship, we can say that the DEPENDENT entities (bottom entity
type) are DEPENDENTS_OF (relationship name) an EMPLOYEE (top entity type).
To change this to read from top to bottom, we could rename the relationship type to
HAS_DEPENDENTS, which would then read as follows: An EMPLOYEE entity (top
entity type) HAS_DEPENDENTS (relationship name) of type DEPENDENT (bottom
entity type). Notice that this issue arises because each binary relationship can be
described starting from either of the two participating entity types, as discussed in
the beginning of Section 7.4.
7.7.3
Design Choices for ER Conceptual Design
It is occasionally difficult to decide whether a particular concept in the miniworld
should be modeled as an entity type, an attribute, or a relationship type. In this

7.7
ER Diagrams, Naming Conventions, and Design Issues
223
Symbol
Meaning
Figure 7.14
Summary of the notation
for ER diagrams.
Entity
Weak Entity
Relationship
Indentifying Relationship
Attribute
Key Attribute
Multivalued Attribute
Composite Attribute
Derived Attribute
E1
R
E2
Total Participation of E2 in R
1
N
E1
R
E2
Cardinality Ratio 1: N for E1:E2 in R
(min, max)
Structural Constraint (min, max)
R
E
on Participation of E in R

224
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
section, we give some brief guidelines as to which construct should be chosen in
particular situations.
In general, the schema design process should be considered an iterative refinement
process, where an initial design is created and then iteratively refined until the most
suitable design is reached. Some of the refinements that are often used include the
following:
■
A concept may be first modeled as an attribute and then refined into a rela-
tionship because it is determined that the attribute is a reference to another
entity type. It is often the case that a pair of such attributes that are inverses
of one another are refined into a binary relationship. We discussed this type
of refinement in detail in Section 7.6. It is important to note that in
our notation, once an attribute is replaced by a relationship, the attribute
itself should be removed from the entity type to avoid duplication and
redundancy.
■
Similarly, an attribute that exists in several entity types may be elevated or
promoted to an independent entity type. For example, suppose that several
entity types in a UNIVERSITY database, such as STUDENT, INSTRUCTOR, and
COURSE, each has an attribute Department in the initial design; the designer
may then choose to create an entity type DEPARTMENT with a single attrib-
ute Dept_name and relate it to the three entity types
(STUDENT,
INSTRUCTOR, and COURSE) via appropriate relationships. Other attrib-
utes/relationships of DEPARTMENT may be discovered later.
■
An inverse refinement to the previous case may be applied—for example, if
an entity type DEPARTMENT exists in the initial design with a single attribute
Dept_name and is related to only one other entity type, STUDENT. In this
case, DEPARTMENT may be reduced or demoted to an attribute of STUDENT.
■
Section 7.9 discusses choices concerning the degree of a relationship. In
Chapter 8, we discuss other refinements concerning specialization/general-
ization. Chapter 10 discusses additional top-down and bottom-up refine-
ments that are common in large-scale conceptual schema design.
7.7.4
Alternative Notations for ER Diagrams
There are many alternative diagrammatic notations for displaying ER diagrams.
Appendix A gives some of the more popular notations. In Section 7.8, we introduce
the Unified Modeling Language (UML) notation for class diagrams, which has been
proposed as a standard for conceptual object modeling.
In this section, we describe one alternative ER notation for specifying structural
constraints on relationships, which replaces the cardinality ratio (1:1, 1:N, M:N)
and single/double line notation for participation constraints. This notation involves
associating a pair of integer numbers (min, max) with each participation of an
entity type E in a relationship type R, where 0 ≤ min ≤ max and max ≥ 1. The num-
bers mean that for each entity e in E, e must participate in at least min and at most

7.7
ER Diagrams, Naming Conventions, and Design Issues
225
max relationship instances in R at any point in time. In this method, min = 0 implies partial participation,
whereas min > 0 implies total participation.
Figure 7.15 displays the COMPANY database schema using the (min, max) notation.14 Usually, one uses
either the cardinality ratio/single-line/double-line notation or the (min, max) notation. The (min, max)
Figure 7.15
Fname
Minit
Lname
ER diagrams for the company schema, with structural con-
straints specified using (min, max) notation and role names.
Bdate
Name
Address
Salary
Ssn
Sex
Locations
WORKS_FOR
(4,N)
(1,1)
Name
Number
Employee
Department
Start_date
Number_of_employees
EMPLOYEE
DEPARTMENT
(0,1)
Department
(0,N)
Controlling
Manager
Managed
(1,1)
Department
MANAGES
CONTROLS
Hours
(1,N)
Worker
Controlled
(0,N)
(0,1)
(1,1)
Project
Supervisor
Supervisee
Project
WORKS_ON
PROJECT
(1,N)
SUPERVISION
(0,N)
Employee
Name
Location
Number
DEPENDENTS_OF
(1,1)
Dependent
DEPENDENT
Name
Sex
Birth_date
Relationship
14In some notations, particularly those used in object modeling methodologies such as UML, the (min,
max) is placed on the opposite sides to the ones we have shown. For example, for the WORKS_FOR
relationship in Figure 7.15, the (1,1) would be on the DEPARTMENT side, and the (4,N) would be on the
EMPLOYEE side. Here we used the original notation from Abrial (1974).

226
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
notation is more precise, and we can use it to specify some structural constraints for
relationship types of higher degree. However, it is not sufficient for specifying some
key constraints on higher-degree relationships, as discussed in Section 7.9.
Figure 7.15 also displays all the role names for the COMPANY database schema.
7.8 Example of Other Notation:
UML Class Diagrams
The UML methodology is being used extensively in software design and has many
types of diagrams for various software design purposes. We only briefly present the
basics of UML class diagrams here, and compare them with ER diagrams. In some
ways, class diagrams can be considered as an alternative notation to ER diagrams.
Additional UML notation and concepts are presented in Section 8.6, and in Chapter
10. Figure 7.16 shows how the COMPANY ER database schema in Figure 7.15 can be
displayed using UML class diagram notation. The entity types in Figure 7.15 are
modeled as classes in Figure 7.16. An entity in ER corresponds to an object in UML.
Figure 7.16
The COMPANY conceptual schema
in UML class diagram notation.
EMPLOYEE
DEPARTMENT
Multiplicity
4..*
WORKS_FOR
1..1
Notation in OMT:
Name: Name_dom
Name
Fname
Number
1..1
1..1
0..1
Minit
0..*
add_employee
Lname
0..1
number_of_employees
Ssn
MANAGES
change_manager
0..*
Bdate: Date
Start_date
Sex: {M,F}
1..1
Address
1..*
Salary
1..*
age
change_department
CONTROLS
LOCATION
change_projects
supervisee
WORKS_ON
Name
Hours
1..1
Dependent_name
0..1
1..*
supervisor
PROJECT
0..*
Name
DEPENDENT
Number
Sex: {M,F}
add_employee
Aggregation
Birth_date: Date
add_project
Notation in UML:
Relationship
change_manager
Whole
Part

7.8
Example of Other Notation: UML Class Diagrams
227
In UML class diagrams, a class (similar to an entity type in ER) is displayed as a box
(see Figure 7.16) that includes three sections: The top section gives the class name
(similar to entity type name); the middle section includes the attributes; and the
last section includes operations that can be applied to individual objects (similar to
individual entities in an entity set) of the class. Operations are not specified in ER
diagrams. Consider the EMPLOYEE class in Figure 7.16. Its attributes are Name, Ssn,
Bdate, Sex, Address, and Salary. The designer can optionally specify the domain of
an attribute if desired, by placing a colon (:) followed by the domaiame or
description, as illustrated by the Name, Sex, and Bdate attributes of EMPLOYEE in
Figure 7.16. A composite attribute is modeled as a structured domain, as illustrated
by the Name attribute of EMPLOYEE. A multivalued attribute will generally be mod-
eled as a separate class, as illustrated by the LOCATION class in Figure 7.16.
Relationship types are called associations in UML terminology, and relationship
instances are called links. A binary association (binary relationship type) is repre-
sented as a line connecting the participating classes (entity types), and may option-
ally have a name. A relationship attribute, called a link attribute, is placed in a box
that is connected to the association’s line by a dashed line. The (min, max) notation
described in Section 7.7.4 is used to specify relationship constraints, which are
called multiplicities in UML terminology. Multiplicities are specified in the form
min..max, and an asterisk (*) indicates no maximum limit on participation.
However, the multiplicities are placed on the opposite ends of the relationship when
compared with the notation discussed in Section 7.7.4 (compare Figures 7.15 and
7.16). In UML, a single asterisk indicates a multiplicity of 0..*, and a single 1 indi-
cates a multiplicity of 1..1. A recursive relationship (see Section 7.4.2) is called a
reflexive association in UML, and the role names—like the multiplicities—are
placed at the opposite ends of an association when compared with the placing of
role names in Figure 7.15.
In UML, there are two types of relationships: association and aggregation.
Aggregation is meant to represent a relationship between a whole object and its
component parts, and it has a distinct diagrammatic notation. In Figure 7.16, we
modeled the locations of a department and the single location of a project as aggre-
gations. However, aggregation and association do not have different structural
properties, and the choice as to which type of relationship to use is somewhat sub-
jective. In the ER model, both are represented as relationships.
UML also distinguishes between unidirectional and bidirectional associations (or
aggregations). In the unidirectional case, the line connecting the classes is displayed
with an arrow to indicate that only one direction for accessing related objects is
needed. If no arrow is displayed, the bidirectional case is assumed, which is the
default. For example, if we always expect to access the manager of a department
starting from a DEPARTMENT object, we would draw the association line represent-
ing the MANAGES association with an arrow from DEPARTMENT to EMPLOYEE. In
addition, relationship instances may be specified to be ordered. For example, we
could specify that the employee objects related to each department through the
WORKS_FOR association (relationship) should be ordered by their Salary attribute

228
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
value. Association (relationship) names are optional in UML, and relationship
attributes are displayed in a box attached with a dashed line to the line representing
the association/aggregation (see Start_date and Hours in Figure 7.16).
The operations given in each class are derived from the functional requirements of
the application, as we discussed in Section 7.1. It is generally sufficient to specify the
operatioames initially for the logical operations that are expected to be applied
to individual objects of a class, as shown in Figure 7.16. As the design is refined,
more details are added, such as the exact argument types (parameters) for each
operation, plus a functional description of each operation. UML has function
descriptions and sequence diagrams to specify some of the operation details, but
these are beyond the scope of our discussion. Chapter 10 will introduce some of
these diagrams.
Weak entities can be modeled using the construct called qualified association (or
qualified aggregation) in UML; this can represent both the identifying relationship
and the partial key, which is placed in a box attached to the owner class. This is illus-
trated by the DEPENDENT class and its qualified aggregation to EMPLOYEE in
Figure 7.16. The partial key Dependent_name is called the discriminator in UML ter-
minology, since its value distinguishes the objects associated with (related to) the
same EMPLOYEE. Qualified associations are not restricted to modeling weak enti-
ties, and they can be used to model other situations in UML.
This section is not meant to be a complete description of UML class diagrams, but
rather to illustrate one popular type of alternative diagrammatic notation that can
be used for representing ER modeling concepts.
7.9 Relationship Types of Degree
Higher than Two
In Section 7.4.2 we defined the degree of a relationship type as the number of par-
ticipating entity types and called a relationship type of degree two binary and a rela-
tionship type of degree three ternary. In this section, we elaborate on the differences
between binary and higher-degree relationships, when to choose higher-degree ver-
sus binary relationships, and how to specify constraints on higher-degree relation-
ships.
7.9.1
Choosing between Binary and Ternary
(or Higher-Degree) Relationships
The ER diagram notation for a ternary relationship type is shown in Figure 7.17(a),
which displays the schema for the SUPPLY relationship type that was displayed at
the entity set/relationship set or instance level in Figure 7.10. Recall that the rela-
tionship set of SUPPLY is a set of relationship instances (s, j, p), where s is a
SUPPLIER who is currently supplying a PART p to a PROJECT j. In general, a rela-
tionship type R of degree n will have n edges in an ER diagram, one connecting R to
each participating entity type.

7.9
Relationship Types of Degree Higher than Two
229
Sname
Quantity
Proj_name
(a)
SUPPLIER
SUPPLY
PROJECT
Part_no
PART
Sname
Proj_name
M
N
(b)
SUPPLIER
SUPPLIES
PROJECT
M
M
CAN_SUPPLY
USES
Part_no
N
N
PART
(c)
Sname
Quantity
Proj_name
1
N
N
1
SUPPLIER
SS
SUPPLY
SPJ
PROJECT
N
Figure 7.17
SP
Ternary relationship types. (a) The SUPPLY
Part_no
relationship. (b) Three binary relationships
1
not equivalent to SUPPLY. (c) SUPPLY
PART
represented as a weak entity type.
Figure
7.17(b) shows an ER diagram for three binary relationship types
CAN_SUPPLY, USES, and SUPPLIES. In general, a ternary relationship type repre-
sents different information than do three binary relationship types. Consider the
three binary relationship types CAN_SUPPLY, USES, and SUPPLIES. Suppose that
CAN_SUPPLY, between SUPPLIER and PART, includes an instance (s, p) whenever
supplier s can supply part p (to any project); USES, between PROJECT and PART,
includes an instance ( j, p) whenever project j uses part p; and SUPPLIES, between
SUPPLIER and PROJECT, includes an instance (s, j) whenever supplier s supplies

230
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
some part to project j. The existence of three relationship instances (s, p), ( j, p), and
(s, j) in CAN_SUPPLY, USES, and SUPPLIES, respectively, does not necessarily imply
that an instance (s, j, p) exists in the ternary relationship SUPPLY, because the
meaning is different. It is often tricky to decide whether a particular relationship
should be represented as a relationship type of degree n or should be broken down
into several relationship types of smaller degrees. The designer must base this
decision on the semantics or meaning of the particular situation being represented.
The typical solution is to include the ternary relationship plus one or more of the
binary relationships, if they represent different meanings and if all are needed by the
application.
Some database design tools are based on variations of the ER model that permit
only binary relationships. In this case, a ternary relationship such as SUPPLY must
be represented as a weak entity type, with no partial key and with three identifying
relationships. The three participating entity types SUPPLIER, PART, and PROJECT
are together the owner entity types (see Figure 7.17(c)). Hence, an entity in the weak
entity type SUPPLY in Figure 7.17(c) is identified by the combination of its three
owner entities from SUPPLIER, PART, and PROJECT.
It is also possible to represent the ternary relationship as a regular entity type by
introducing an artificial or surrogate key. In this example, a key attribute Supply_id
could be used for the supply entity type, converting it into a regular entity type.
Three binary N:1 relationships relate SUPPLY to the three participating entity types.
Another example is shown in Figure 7.18. The ternary relationship type OFFERS
represents information on instructors offering courses during particular semesters;
hence it includes a relationship instance (i, s, c) whenever INSTRUCTOR i offers
COURSE c during SEMESTER s. The three binary relationship types shown in
Figure 7.18 have the following meanings: CAN_TEACH relates a course to the
instructors who can teach that course, TAUGHT_DURING relates a semester to the
instructors who taught some course during that semester, and OFFERED_DURING
Figure 7.18
Semester
Year
Another example of ternary versus
binary relationship types.
TAUGHT_DURING
Lname
Sem_year
INSTRUCTOR
OFFERS
SEMESTER
CAN_TEACH
OFFERED_DURING
Cnumber
COURSE

7.9
Relationship Types of Degree Higher than Two
231
relates a semester to the courses offered during that semester by any instructor.
These ternary and binary relationships represent different information, but certain
constraints should hold among the relationships. For example, a relationship
instance (i, s, c) should not exist in OFFERS unless an instance (i, s) exists in
TAUGHT_DURING, an instance (s, c) exists in OFFERED_DURING, and an instance (i,
c) exists in CAN_TEACH. However, the reverse is not always true; we may have
instances (i, s), (s, c), and (i, c) in the three binary relationship types with no corre-
sponding instance (i, s, c) in OFFERS. Note that in this example, based on the mean-
ings of the relationships, we can infer the instances of TAUGHT_DURING and
OFFERED_DURING from the instances in OFFERS, but we cannot infer the
instances of CAN_TEACH; therefore, TAUGHT_DURING and OFFERED_DURING are
redundant and can be left out.
Although in general three binary relationships cannot replace a ternary relationship,
they may do so under certain additional constraints. In our example, if the
CAN_TEACH relationship is 1:1 (an instructor can teach one course, and a course
can be taught by only one instructor), then the ternary relationship OFFERS can be
left out because it can be inferred from the three binary relationships CAN_TEACH,
TAUGHT_DURING, and OFFERED_DURING. The schema designer must analyze the
meaning of each specific situation to decide which of the binary and ternary rela-
tionship types are needed.
Notice that it is possible to have a weak entity type with a ternary (or n-ary) identi-
fying relationship type. In this case, the weak entity type can have several owner
entity types. An example is shown in Figure 7.19. This example shows part of a data-
base that keeps track of candidates interviewing for jobs at various companies, and
may be part of an employment agency database, for example. In the requirements, a
candidate can have multiple interviews with the same company (for example, with
different company departments or on separate dates), but a job offer is made based
on one of the interviews. Here, INTERVIEW is represented as a weak entity with two
owners CANDIDATE and COMPANY, and with the partial key Dept_date. An
INTERVIEW entity is uniquely identified by a candidate, a company, and the combi-
nation of the date and department of the interview.
Figure 7.19
Name
Cname
A weak entity type INTERVIEW
with a ternary identifying rela-
CANDIDATE
CCI
COMPANY
tionship type.
Department
Date
Dept_date
INTERVIEW
RESULTS_IN
JOB_OFFER

232
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
7.9.2
Constraints on Ternary (or Higher-Degree)
Relationships
There are two notations for specifying structural constraints on n-ary relationships,
and they specify different constraints. They should thus both be used if it is impor-
tant to fully specify the structural constraints on a ternary or higher-degree rela-
tionship. The first notation is based on the cardinality ratio notation of binary
relationships displayed in Figure 7.2. Here, a 1, M, or N is specified on each partici-
pation arc (both M and N symbols stand for many or any number).15 Let us illus-
trate this constraint using the SUPPLY relationship in Figure 7.17.
Recall that the relationship set of SUPPLY is a set of relationship instances (s, j, p),
where s is a SUPPLIER, j is a PROJECT, and p is a PART. Suppose that the constraint
exists that for a particular project-part combination, only one supplier will be used
(only one supplier supplies a particular part to a particular project). In this case, we
place 1 on the SUPPLIER participation, and M, N on the PROJECT, PART participa-
tions in Figure 7.17. This specifies the constraint that a particular ( j, p) combination
can appear at most once in the relationship set because each such (PROJECT, PART)
combination uniquely determines a single supplier. Hence, any relationship
instance (s, j, p) is uniquely identified in the relationship set by its ( j, p) combina-
tion, which makes ( j, p) a key for the relationship set. In this notation, the participa-
tions that have a 1 specified on them are not required to be part of the identifying
key for the relationship set.16 If all three cardinalities are M or N, then the key will
be the combination of all three participants.
The second notation is based on the (min, max) notation displayed in Figure 7.15
for binary relationships. A (min, max) on a participation here specifies that each
entity is related to at least min and at most max relationship instances in the relation-
ship set. These constraints have no bearing on determining the key of an n-ary rela-
tionship, where n > 2,17 but specify a different type of constraint that places
restrictions on how many relationship instances each entity can participate in.
7.10 Summary
In this chapter we presented the modeling concepts of a high-level conceptual data
model, the Entity-Relationship (ER) model. We started by discussing the role that a
high-level data model plays in the database design process, and then we presented a
sample set of database requirements for the COMPANY database, which is one of the
examples that is used throughout this book. We defined the basic ER model con-
cepts of entities and their attributes. Then we discussed NULL values and presented
15This notation allows us to determine the key of the relationship relation, as we discuss in Chapter 9.
16This is also true for cardinality ratios of binary relationships.
17The (min, max) constraints can determine the keys for binary relationships, though.

7.10 Summary
233
the various types of attributes, which can be nested arbitrarily to produce complex
attributes:
■ Simple or atomic
■ Composite
■ Multivalued
We also briefly discussed stored versus derived attributes. Then we discussed the ER
model concepts at the schema or “intension” level:
■ Entity types and their corresponding entity sets
■ Key attributes of entity types
■ Value sets (domains) of attributes
■ Relationship types and their corresponding relationship sets
■ Participation roles of entity types in relationship types
We presented two methods for specifying the structural constraints on relationship
types. The first method distinguished two types of structural constraints:
■ Cardinality ratios (1:1, 1:N, M:N for binary relationships)
■ Participation constraints (total, partial)
We noted that, alternatively, another method of specifying structural constraints is
to specify minimum and maximum numbers (min, max) on the participation of
each entity type in a relationship type. We discussed weak entity types and the
related concepts of owner entity types, identifying relationship types, and partial
key attributes.
Entity-Relationship schemas can be represented diagrammatically as ER diagrams.
We showed how to design an ER schema for the COMPANY database by first defin-
ing the entity types and their attributes and then refining the design to include rela-
tionship types. We displayed the ER diagram for the COMPANY database schema.
We discussed some of the basic concepts of UML class diagrams and how they relate
to ER modeling concepts. We also described ternary and higher-degree relationship
types in more detail, and discussed the circumstances under which they are distin-
guished from binary relationships.
The ER modeling concepts we have presented thus far—entity types, relationship
types, attributes, keys, and structural constraints—can model many database appli-
cations. However, more complex applications—such as engineering design, medical
information systems, and telecommunications—require additional concepts if we
want to model them with greater accuracy. We discuss some advanced modeling
concepts in Chapter 8 and revisit further advanced data modeling techniques in
Chapter 26.

234
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
Review Questions
7.1.
Discuss the role of a high-level data model in the database design process.
7.2.
List the various cases where use of a NULL value would be appropriate.
7.3.
Define the following terms: entity, attribute, attribute value, relationship
instance, composite attribute, multivalued attribute, derived attribute, complex
attribute, key attribute, and value set (domain).
7.4.
What is an entity type? What is an entity set? Explain the differences among
an entity, an entity type, and an entity set.
7.5.
Explain the difference between an attribute and a value set.
7.6.
What is a relationship type? Explain the differences among a relationship
instance, a relationship type, and a relationship set.
7.7.
What is a participation role? When is it necessary to use role names in the
description of relationship types?
7.8.
Describe the two alternatives for specifying structural constraints on rela-
tionship types. What are the advantages and disadvantages of each?
7.9.
Under what conditions can an attribute of a binary relationship type be
migrated to become an attribute of one of the participating entity types?
7.10.
When we think of relationships as attributes, what are the value sets of these
attributes? What class of data models is based on this concept?
7.11.
What is meant by a recursive relationship type? Give some examples of
recursive relationship types.
7.12.
When is the concept of a weak entity used in data modeling? Define the
terms owner entity type, weak entity type, identifying relationship type, and
partial key.
7.13.
Can an identifying relationship of a weak entity type be of a degree greater
than two? Give examples to illustrate your answer.
7.14.
Discuss the conventions for displaying an ER schema as an ER diagram.
7.15.
Discuss the naming conventions used for ER schema diagrams.
Exercises
7.16. Consider the following set of requirements for a UNIVERSITY database that is
used to keep track of students’ transcripts. This is similar but not identical to
the database shown in Figure 1.2:
a. The university keeps track of each student’s name, student number, Social
Security number, current address and phone number, permanent address
and phone number, birth date, sex, class (freshman, sophomore, …, grad-
uate), major department, minor department (if any), and degree program

Exercises
235
(B.A., B.S., …, Ph.D.). Some user applications need to refer to the city,
state, and ZIP Code of the student’s permanent address and to the stu-
dent’s last name. Both Social Security number and student number have
unique values for each student.
b. Each department is described by a name, department code, office num-
ber, office phone number, and college. Both name and code have unique
values for each department.
c. Each course has a course name, description, course number, number of
semester hours, level, and offering department. The value of the course
number is unique for each course.
d. Each section has an instructor, semester, year, course, and sectioum-
ber. The sectioumber distinguishes sections of the same course that are
taught during the same semester/year; its values are 1, 2, 3, …, up to the
number of sections taught during each semester.
e. A grade report has a student, section, letter grade, and numeric grade (0,
1, 2, 3, or 4).
Design an ER schema for this application, and draw an ER diagram for the
schema. Specify key attributes of each entity type, and structural constraints
on each relationship type. Note any unspecified requirements, and make
appropriate assumptions to make the specification complete.
7.17.
Composite and multivalued attributes can be nested to any number of levels.
Suppose we want to design an attribute for a STUDENT entity type to keep
track of previous college education. Such an attribute will have one entry for
each college previously attended, and each such entry will be composed of
college name, start and end dates, degree entries (degrees awarded at that
college, if any), and transcript entries (courses completed at that college, if
any). Each degree entry contains the degree name and the month and year
the degree was awarded, and each transcript entry contains a course name,
semester, year, and grade. Design an attribute to hold this information. Use
the conventions in Figure 7.5.
7.18.
Show an alternative design for the attribute described in Exercise 7.17 that
uses only entity types (including weak entity types, if needed) and relation-
ship types.
7.19.
Consider the ER diagram in Figure 7.20, which shows a simplified schema
for an airline reservations system. Extract from the ER diagram the require-
ments and constraints that produced this schema. Try to be as precise as pos-
sible in your requirements and constraints specification.
7.20.
In Chapters 1 and 2, we discussed the database environment and database
users. We can consider many entity types to describe such an environment,
such as DBMS, stored database, DBA, and catalog/data dictionary. Try to
specify all the entity types that can fully describe a database system and its
environment; then specify the relationship types among them, and draw an
ER diagram to describe such a general database environment.

236
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
Figure 7.20
An ER diagram for an AIRLINE database schema.
Airport_code
City
State
Name
DEPARTURE_
1
AIRPORT
N
Le g_no
AIRPORT
Scheduled_dep_time
Scheduled_arr_time
FLIGHT_LEG
N
M
1
N
ARRIVAL_
CAN_
AIRPORT
Instances
LEGS
LAND
1
N
Number
1
Type_name
Max_seats
INSTANCE_OF
Airline
FLIGHT
Company
1
N
AIRPLANE_
Arr_time
Weekdays
1
TYPE
DEPARTS
1
Restrictions
FARES
1
N
ARRIVES
Dep_time
Amount
N
TYPE
N
Code
FARE
N
Airplane_id
Total_no_of_seats
No_of_avail_seats
1
N
Date
AIRPLANE
ASSIGNED
LEG_INSTANCE
Customer_name
Cphone
Seat_no
RESERVATION
Notes:
N
1
A LEG (segment) is a nonstop portion of a flight
SEAT
A LEG_INSTANCE is a particular occurrence
of a LEG on a particular date.
7.21. Design an ER schema for keeping track of information about votes taken in
the U.S. House of Representatives during the current two-year congressional
session. The database needs to keep track of each U.S. STATE’s Name (e.g.,
‘Texas’, ‘New York’, ‘California’) and include the Region of the state (whose
domain is {‘Northeast’, ‘Midwest’, ‘Southeast’, ‘Southwest’, ‘West’}). Each

Exercises
237
CONGRESS_PERSON in the House of Representatives is described by his or
her Name, plus the District represented, the Start_date when the congressper-
son was first elected, and the political Party to which he or she belongs
(whose domain is {‘Republican’, ‘Democrat’, ‘Independent’, ‘Other’}). The
database keeps track of each BILL (i.e., proposed law), including the
Bill_name, the Date_of_vote on the bill, whether the bill Passed_or_failed
(whose domain is {‘Yes’, ‘No’}), and the Sponsor (the congressperson(s) who
sponsored—that is, proposed—the bill). The database also keeps track of
how each congressperson voted on each bill (domain of Vote attribute is
{‘Yes’, ‘No’, ‘Abstain’, ‘Absent’}). Draw an ER schema diagram for this applica-
tion. State clearly any assumptions you make.
7.22.
A database is being constructed to keep track of the teams and games of a
sports league. A team has a number of players, not all of whom participate in
each game. It is desired to keep track of the players participating in each
game for each team, the positions they played in that game, and the result of
the game. Design an ER schema diagram for this application, stating any
assumptions you make. Choose your favorite sport (e.g., soccer, baseball,
football).
7.23.
Consider the ER diagram shown in Figure 7.21 for part of a BANK database.
Each bank can have multiple branches, and each branch can have multiple
accounts and loans.
a. List the strong (nonweak) entity types in the ER diagram.
b. Is there a weak entity type? If so, give its name, partial key, and identifying
relationship.
c. What constraints do the partial key and the identifying relationship of the
weak entity type specify in this diagram?
d. List the names of all relationship types, and specify the (min, max) con-
straint on each participation of an entity type in a relationship type.
Justify your choices.
e. List concisely the user requirements that led to this ER schema design.
f. Suppose that every customer must have at least one account but is
restricted to at most two loans at a time, and that a bank branch cannot
have more than 1,000 loans. How does this show up on the (min, max)
constraints?
7.24.
Consider the ER diagram in Figure 7.22. Assume that an employee may work
in up to two departments or may not be assigned to any department. Assume
that each department must have one and may have up to three phone num-
bers. Supply (min, max) constraints on this diagram. State clearly any addi-
tional assumptions you make. Under what conditions would the relationship
HAS_PHONE be redundant in this example?
7.25.
Consider the ER diagram in Figure 7.23. Assume that a course may or may
not use a textbook, but that a text by definition is a book that is used in some
course. A course may not use more than five books. Instructors teach from

238
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
1
N
BANK_BRANCH
BANK
BRANCHES
Code
Name
Addr
Addr
Branch_no
1
1
ACCTS
LOANS
N
N
Acct_no
Balance
Loan_no
Amount
ACCOUNT
Type
LOAN
Type
M
M
A_C
L_C
N
N
Ssn
Name
Figure 7.21
An ER diagram for a BANK
Phone
CUSTOMER
Addr
database schema.
EMPLOYEE
WORKS_IN
DEPARTMENT
HAS_PHONE
CONTAINS
Figure 7.22
Part of an ER diagram
for a COMPANY data-
PHONE
base.

Exercises
239
INSTRUCTOR
TEACHES
COURSE
USES
Figure 7.23
Part of an ER diagram
for a COURSES data-
TEXT
base.
two to four courses. Supply (min, max) constraints on this diagram. State
clearly any additional assumptions you make. If we add the relationship
ADOPTS, to indicate the textbook(s) that an instructor uses for a course,
should it be a binary relationship between INSTRUCTOR and TEXT, or a ter-
nary relationship between all three entity types? What (min, max) con-
straints would you put on it? Why?
7.26.
Consider an entity type SECTION in a UNIVERSITY database, which describes
the section offerings of courses. The attributes of SECTION are
Section_number, Semester, Year, Course_number, Instructor, Room_no (where
section is taught), Building (where section is taught), Weekdays (domain is
the possible combinations of weekdays in which a section can be offered
{‘MWF’, ‘MW’, ‘TT’, and so on}), and Hours (domain is all possible time peri-
ods during which sections are offered {‘9-9:50 A.M.’, ‘10-10:50 A.M.’, …,
‘3:30-4:50 P.M.’, ‘5:30-6:20 P.M.’, and so on}). Assume that Section_number is
unique for each course within a particular semester/year combination (that
is, if a course is offered multiple times during a particular semester, its sec-
tion offerings are numbered 1, 2, 3, and so on). There are several composite
keys for section, and some attributes are components of more than one key.
Identify three composite keys, and show how they can be represented in an
ER schema diagram.
7.27.
Cardinality ratios often dictate the detailed design of a database. The cardi-
nality ratio depends on the real-world meaning of the entity types involved
and is defined by the specific application. For the following binary relation-
ships, suggest cardinality ratios based on the common-sense meaning of the
entity types. Clearly state any assumptions you make.
Entity 1
Cardinality Ratio
Entity 2
1. STUDENT
______________
SOCIAL_SECURITY_CARD
2. STUDENT
______________
TEACHER
3. CLASSROOM
______________
WALL

240
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
4. COUNTRY
______________
CURRENT_PRESIDENT
5. COURSE
______________
TEXTBOOK
6. ITEM (that can
be found in an
order)
______________
ORDER
7. STUDENT
______________
CLASS
8. CLASS
______________
INSTRUCTOR
9. INSTRUCTOR
______________
OFFICE
10. EBAY_AUCTION
_ITEM
______________
EBAY_BID
7.28. Consider the ER schema for the MOVIES database in Figure 7.24.
Assume that MOVIES is a populated database. ACTOR is used as a generic
term and includes actresses. Given the constraints shown in the ER schema,
respond to the following statements with True, False, or Maybe. Assign a
response of Maybe to statements that, while not explicitly shown to be True,
cannot be proven False based on the schema as shown. Justify each answer.
Figure 7.24
An ER diagram for a MOVIES
database schema.
M
N
PERFORMS_IN
MOVIE
ACTOR
LEAD_ROLE
1
2
N
1
ALSO_A_
DIRECTOR
ACTOR_
N
PRODUCER
1
1
DIRECTOR
DIRECTS
1
N
M
PRODUCER
PRODUCES

Laboratory Exercises
241
a. There are no actors in this database that have been io movies.
b. There are some actors who have acted in more than ten movies.
c. Some actors have done a lead role in multiple movies.
d. A movie can have only a maximum of two lead actors.
e. Every director has been an actor in some movie.
f. No producer has ever been an actor.
g. A producer cannot be an actor in some other movie.
h. There are movies with more than a dozen actors.
i. Some producers have been a director as well.
j. Most movies have one director and one producer.
k. Some movies have one director but several producers.
l. There are some actors who have done a lead role, directed a movie, and
produced some movie.
m. No movie has a director who also acted in that movie.
7.29.
Given the ER schema for the MOVIES database in Figure 7.24, draw an
instance diagram using three movies that have been released recently. Draw
instances of each entity type: MOVIES, ACTORS, PRODUCERS, DIRECTORS
involved; make up instances of the relationships as they exist in reality for
those movies.
7.30.
Illustrate the UML Diagram for Exercise 7.16. Your UML design should
observe the following requirements:
a. A student should have the ability to compute his/her GPA and add or
drop majors and minors.
b. Each department should be to able add or delete courses and hire or ter-
minate faculty.
c. Each instructor should be able to assign or change a student’s grade for a
course.
Note: Some of these functions may be spread over multiple classes.
Laboratory Exercises
7.31. Consider the UNIVERSITY database described in Exercise 7.16. Build the ER
schema for this database using a data modeling tool such as ERwin or
Rational Rose.
7.32. Consider a MAIL_ORDER database in which employees take orders for parts
from customers. The data requirements are summarized as follows:
■ The mail order company has employees, each identified by a unique
employee number, first and last name, and Zip Code.
■ Each customer of the company is identified by a unique customer num-
ber, first and last name, and Zip Code.

242
Chapter 7 Data Modeling Using the Entity-Relationship (ER) Model
■ Each part sold by the company is identified by a unique part number, a
part name, price, and quantity in stock.
■ Each order placed by a customer is taken by an employee and is given a
unique order number. Each order contains specified quantities of one or
more parts. Each order has a date of receipt as well as an expected ship
date. The actual ship date is also recorded.
Design an Entity-Relationship diagram for the mail order database and
build the design using a data modeling tool such as ERwin or Rational Rose.
7.33.
Consider a MOVIE database in which data is recorded about the movie indus-
try. The data requirements are summarized as follows:
■ Each movie is identified by title and year of release. Each movie has a
length in minutes. Each has a production company, and each is classified
under one or more genres (such as horror, action, drama, and so forth).
Each movie has one or more directors and one or more actors appear in
it. Each movie also has a plot outline. Finally, each movie has zero or more
quotable quotes, each of which is spoken by a particular actor appearing
in the movie.
■ Actors are identified by name and date of birth and appear in one or more
movies. Each actor has a role in the movie.
■ Directors are also identified by name and date of birth and direct one or
more movies. It is possible for a director to act in a movie (including one
that he or she may also direct).
■ Production companies are identified by name and each has an address. A
production company produces one or more movies.
Design an Entity-Relationship diagram for the movie database and enter the
design using a data modeling tool such as ERwin or Rational Rose.
7.34.
Consider a CONFERENCE_REVIEW database in which researchers submit
their research papers for consideration. Reviews by reviewers are recorded
for use in the paper selection process. The database system caters primarily
to reviewers who record answers to evaluation questions for each paper they
review and make recommendations regarding whether to accept or reject the
paper. The data requirements are summarized as follows:
■ Authors of papers are uniquely identified by e-mail id. First and last
names are also recorded.
■ Each paper is assigned a unique identifier by the system and is described
by a title, abstract, and the name of the electronic file containing the
paper.
■ A paper may have multiple authors, but one of the authors is designated
as the contact author.
■ Reviewers of papers are uniquely identified by e-mail address. Each
reviewer’s first name, last name, phone number, affiliation, and topics of
interest are also recorded.

Selected Bibliography
243
■ Each paper is assigned between two and four reviewers. A reviewer rates
each paper assigned to him or her on a scale of 1 to 10 in four categories:
technical merit, readability, originality, and relevance to the conference.
Finally, each reviewer provides an overall recommendation regarding
each paper.
■ Each review contains two types of written comments: one to be seen by
the review committee only and the other as feedback to the author(s).
Design an Entity-Relationship diagram for the CONFERENCE_REVIEW
database and build the design using a data modeling tool such as ERwin or
Rational Rose.
7.35. Consider the ER diagram for the AIRLINE database shown in Figure 7.20.
Build this design using a data modeling tool such as ERwin or Rational Rose.
Selected Bibliography
The Entity-Relationship model was introduced by Chen (1976), and related work
appears in Schmidt and Swenson (1975), Wiederhold and Elmasri (1979), and
Senko (1975). Since then, numerous modifications to the ER model have been sug-
gested. We have incorporated some of these in our presentation. Structural con-
straints on relationships are discussed in Abrial (1974), Elmasri and Wiederhold
(1980), and Lenzerini and Santucci (1983). Multivalued and composite attributes
are incorporated in the ER model in Elmasri et al. (1985). Although we did not dis-
cuss languages for the ER model and its extensions, there have been several propos-
als for such languages. Elmasri and Wiederhold (1981) proposed the GORDAS
query language for the ER model. Another ER query language was proposed by
Markowitz and Raz (1983). Senko (1980) presented a query language for Senko’s
DIAM model. A formal set of operations called the ER algebra was presented by
Parent and Spaccapietra (1985). Gogolla and Hohenstein (1991) presented another
formal language for the ER model. Campbell et al. (1985) presented a set of ER
operations and showed that they are relationally complete. A conference for the dis-
semination of research results related to the ER model has been held regularly since
1979. The conference, now known as the International Conference on Conceptual
Modeling, has been held in Los Angeles (ER 1979, ER 1983, ER 1997), Washington,
D.C. (ER 1981), Chicago (ER 1985), Dijon, France (ER 1986), New York City (ER
1987), Rome (ER 1988), Toronto (ER 1989), Lausanne, Switzerland (ER 1990), San
Mateo, California (ER 1991), Karlsruhe, Germany (ER 1992), Arlington, Texas (ER
1993), Manchester, England (ER 1994), Brisbane, Australia (ER 1995), Cottbus,
Germany (ER 1996), Singapore (ER 1998), Paris, France (ER 1999), Salt Lake City,
Utah (ER 2000), Yokohama, Japan (ER 2001), Tampere, Finland (ER 2002),
Chicago, Illinois (ER 2003), Shanghai, China (ER 2004), Klagenfurt, Austria (ER
2005), Tucson, Arizona (ER 2006), Auckland, New Zealand (ER 2007), Barcelona,
Catalonia, Spain (ER 2008), and Gramado, RS, Brazil (ER 2009). The 2010 confer-
ence is to be held in Vancouver, BC, Canada.

This page intentionally left blank

chapter
8
The Enhanced Entity-Relationship
(EER) Model
he ER modeling concepts discussed in Chapter 7
T
are sufficient for representing many database
schemas for traditional database applications, which include many data-processing
applications in business and industry. Since the late 1970s, however, designers of
database applications have tried to design more accurate database schemas that
reflect the data properties and constraints more precisely. This was particularly
important for newer applications of database technology, such as databases for
engineering design and manufacturing (CAD/CAM),1 telecommunications, com-
plex software systems, and Geographic Information Systems (GIS), among many
other applications. These types of databases have more complex requirements than
do the more traditional applications. This led to the development of additional
semantic data modeling concepts that were incorporated into conceptual data mod-
els such as the ER model. Various semantic data models have been proposed in the
literature. Many of these concepts were also developed independently in related
areas of computer science, such as the knowledge representation area of artificial
intelligence and the object modeling area in software engineering.
In this chapter, we describe features that have been proposed for semantic data mod-
els, and show how the ER model can be enhanced to include these concepts, leading
to the Enhanced ER (EER) model.2 We start in Section 8.1 by incorporating the con-
cepts of class/subclass relationships and type inheritance into the ER model. Then, in
Section 8.2, we add the concepts of specialization and generalization. Section 8.3
1CAD/CAM stands for computer-aided design/computer-aided manufacturing.
2EER has also been used to stand for Extended ER model.
245

246
Chapter 8 The Enhanced Entity-Relationship (EER) Model
discusses the various types of constraints on specialization/generalization, and
Section 8.4 shows how the UNION construct can be modeled by including the con-
cept of category in the EER model. Section 8.5 gives a sample UNIVERSITY database
schema in the EER model and summarizes the EER model concepts by giving formal
definitions. We will use the terms object and entity interchangeably in this chapter,
because many of these concepts are commonly used in object-oriented models.
We present the UML class diagram notation for representing specialization and gen-
eralization in Section 8.6, and briefly compare these with EER notation and con-
cepts. This serves as an example of alternative notation, and is a continuation of
Section 7.8, which presented basic UML class diagram notation that corresponds to
the basic ER model. In Section 8.7, we discuss the fundamental abstractions that are
used as the basis of many semantic data models. Section 8.8 summarizes the chapter.
For a detailed introduction to conceptual modeling, Chapter 8 should be consid-
ered a continuation of Chapter 7. However, if only a basic introduction to ER mod-
eling is desired, this chapter may be omitted. Alternatively, the reader may choose to
skip some or all of the later sections of this chapter (Sections 8.4 through 8.8).
8.1 Subclasses, Superclasses, and Inheritance
The EER model includes all the modeling concepts of the ER model that were pre-
sented in Chapter 7. In addition, it includes the concepts of subclass and
superclass and the related concepts of specialization and generalization (see
Sections 8.2 and 8.3). Another concept included in the EER model is that of a
category or union type (see Section 8.4), which is used to represent a collection of
objects (entities) that is the union of objects of different entity types. Associated
with these concepts is the important mechanism of attribute and relationship
inheritance. Unfortunately, no standard terminology exists for these concepts, so
we use the most common terminology. Alternative terminology is given in foot-
notes. We also describe a diagrammatic technique for displaying these concepts
when they arise in an EER schema. We call the resulting schema diagrams
enhanced ER or EER diagrams.
The first Enhanced ER (EER) model concept we take up is that of a subtype or
subclass of an entity type. As we discussed in Chapter 7, an entity type is used to
represent both a type of entity and the entity set or collection of entities of that type
that exist in the database. For example, the entity type EMPLOYEE describes the type
(that is, the attributes and relationships) of each employee entity, and also refers to
the current set of EMPLOYEE entities in the COMPANY database. In many cases an
entity type has numerous subgroupings or subtypes of its entities that are meaning-
ful and need to be represented explicitly because of their significance to the database
application. For example, the entities that are members of the EMPLOYEE entity
type may be distinguished further into SECRETARY, ENGINEER, MANAGER,
TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, and so on. The set of
entities in each of the latter groupings is a subset of the entities that belong to the
EMPLOYEE entity set, meaning that every entity that is a member of one of these

8.1
Subclasses, Superclasses, and Inheritance
247
subgroupings is also an employee. We call each of these subgroupings a subclass or
subtype of the EMPLOYEE entity type, and the EMPLOYEE entity type is called the
superclass or supertype for each of these subclasses. Figure 8.1 shows how to repre-
sent these concepts diagramatically in EER diagrams. (The circle notation in Figure
8.1 will be explained in Section 8.2.)
We call the relationship between a superclass and any one of its subclasses a
superclass/subclass or supertype/subtype or simply class/subclass relationship.3
In our previous example, EMPLOYEE/SECRETARY and EMPLOYEE/TECHNICIAN are
two class/subclass relationships. Notice that a member entity of the subclass repre-
sents the same real-world entity as some member of the superclass; for example, a
SECRETARY entity ‘Joan Logano’ is also the EMPLOYEE ‘Joan Logano.’ Hence, the
subclass member is the same as the entity in the superclass, but in a distinct specific
role. When we implement a superclass/subclass relationship in the database system,
however, we may represent a member of the subclass as a distinct database object—
say, a distinct record that is related via the key attribute to its superclass entity. In
Section 9.2, we discuss various options for representing superclass/subclass rela-
tionships in relational databases.
Figure 8.1
Fname
Minit
Lname
EER diagram
notation to represent
Name
Ssn
Birth_date
Add ress
subclasses and
specialization.
EMPLOYEE
d
d
Typing_speed
Tgrade
Eng_type
Pay_scale
SECRETARY
TECHNICIAN
ENGINEER
MANAGER
Salary
HOURLY_EMPLOYEE
SALARIED_EMPLOYEE
Three specializations of EMPLOYEE:
{SECRETARY, TECHNICIAN, ENGINEER}
MANAGES
BELONGS_TO
{MANAGER}
{HOURLY_EMPLOYEE, SALARIED_EMPLOYEE}
PROJECT
TRADE_UNION
3A class/subclass relationship is often called an IS-A (or IS-AN) relationship because of the way we
refer to the concept. We say a SECRETARY is an EMPLOYEE, a TECHNICIAN is an EMPLOYEE, and
so on.

248
Chapter 8 The Enhanced Entity-Relationship (EER) Model
An entity cannot exist in the database merely by being a member of a subclass; it
must also be a member of the superclass. Such an entity can be included optionally
as a member of any number of subclasses. For example, a salaried employee who is
also an engineer belongs to the two subclasses ENGINEER and
SALARIED_EMPLOYEE of the EMPLOYEE entity type. However, it is not necessary
that every entity in a superclass is a member of some subclass.
An important concept associated with subclasses (subtypes) is that of type inheri-
tance. Recall that the type of an entity is defined by the attributes it possesses and
the relationship types in which it participates. Because an entity in the subclass rep-
resents the same real-world entity from the superclass, it should possess values for
its specific attributes as well as values of its attributes as a member of the superclass.
We say that an entity that is a member of a subclass inherits all the attributes of the
entity as a member of the superclass. The entity also inherits all the relationships in
which the superclass participates. Notice that a subclass, with its own specific (or
local) attributes and relationships together with all the attributes and relationships
it inherits from the superclass, can be considered an entity type in its own right.4
8.2 Specialization and Generalization
8.2.1
Specialization
Specialization is the process of defining a set of subclasses of an entity type; this
entity type is called the superclass of the specialization. The set of subclasses that
forms a specialization is defined on the basis of some distinguishing characteristic
of the entities in the superclass. For example, the set of subclasses {SECRETARY,
ENGINEER, TECHNICIAN} is a specialization of the superclass EMPLOYEE that dis-
tinguishes among employee entities based on the job type of each employee entity.
We may have several specializations of the same entity type based on different dis-
tinguishing characteristics. For example, another specialization of the EMPLOYEE
entity type may yield the set of subclasses
{SALARIED_EMPLOYEE,
HOURLY_EMPLOYEE}; this specialization distinguishes among employees based on
the method of pay.
Figure 8.1 shows how we represent a specialization diagrammatically in an EER dia-
gram. The subclasses that define a specialization are attached by lines to a circle that
represents the specialization, which is connected in turn to the superclass. The
subset symbol on each line connecting a subclass to the circle indicates the direction
of the superclass/subclass relationship.5 Attributes that apply only to entities of a
particular subclass—such as TypingSpeed of SECRETARY—are attached to the rec-
tangle representing that subclass. These are called specific attributes (or local
4In some object-oriented programming languages, a common restriction is that an entity (or object) has
only one type. This is generally too restrictive for conceptual database modeling.
5There are many alternative notations for specialization; we present the UML notation in Section 8.6 and
other proposed notations in Appendix A.

8.2
Specialization and Generalization
249
attributes) of the subclass. Similarly, a subclass can participate in specific relation-
ship types, such as the HOURLY_EMPLOYEE subclass participating in the
BELONGS_TO relationship in Figure 8.1. We will explain the d symbol in the circles
in Figure 8.1 and additional EER diagram notation shortly.
Figure
8.2
shows a few entity instances that belong to subclasses of the
{SECRETARY, ENGINEER, TECHNICIAN} specialization. Again, notice that an entity
that belongs to a subclass represents the same real-world entity as the entity con-
nected to it in the EMPLOYEE superclass, even though the same entity is shown
twice; for example, e1 is shown in both EMPLOYEE and SECRETARY in Figure 8.2. As
the figure suggests, a superclass/subclass relationship such as EMPLOYEE/
SECRETARY somewhat resembles a 1:1 relationship at the instance level (see Figure
7.12). The main difference is that in a 1:1 relationship two distinct entities are
related, whereas in a superclass/subclass relationship the entity in the subclass is the
same real-world entity as the entity in the superclass but is playing a specialized
role—for example, an EMPLOYEE specialized in the role of SECRETARY, or an
EMPLOYEE specialized in the role of TECHNICIAN.
SECRETARY
Figure 8.2
Instances of a specialization.
e1
e4
e5
EMPLOYEE
e1
e2
ENGINEER
e3
e4
e2
e7
e5
e6
e7
e8
TECHNICIAN
e3
e8

250
Chapter 8 The Enhanced Entity-Relationship (EER) Model
There are two main reasons for including class/subclass relationships and specializa-
tions in a data model. The first is that certain attributes may apply to some but not all
entities of the superclass. A subclass is defined in order to group the entities to which
these attributes apply. The members of the subclass may still share the majority of
their attributes with the other members of the superclass. For example, in Figure 8.1
the SECRETARY subclass has the specific attribute Typing_speed, whereas the
ENGINEER subclass has the specific attribute Eng_type, but SECRETARY and
ENGINEER share their other inherited attributes from the EMPLOYEE entity type.
The second reason for using subclasses is that some relationship types may be par-
ticipated in only by entities that are members of the subclass. For example, if only
HOURLY_EMPLOYEES can belong to a trade union, we can represent that fact by
creating the subclass HOURLY_EMPLOYEE of EMPLOYEE and relating the subclass
to an entity type TRADE_UNION via the BELONGS_TO relationship type, as illus-
trated in Figure 8.1.
In summary, the specialization process allows us to do the following:
■ Define a set of subclasses of an entity type
■ Establish additional specific attributes with each subclass
■ Establish additional specific relationship types between each subclass and
other entity types or other subclasses
8.2.2
Generalization
We can think of a reverse process of abstraction in which we suppress the differences
among several entity types, identify their common features, and generalize them
into a single superclass of which the original entity types are special subclasses. For
example, consider the entity types CAR and TRUCK shown in Figure 8.3(a). Because
they have several common attributes, they can be generalized into the entity type
VEHICLE, as shown in Figure 8.3(b). Both CAR and TRUCK are now subclasses of the
generalized superclass VEHICLE. We use the term generalization to refer to the
process of defining a generalized entity type from the given entity types.
Notice that the generalization process can be viewed as being functionally the inverse
of the specialization process. Hence, in Figure 8.3 we can view {CAR, TRUCK} as a
specialization of VEHICLE, rather than viewing VEHICLE as a generalization of CAR
and TRUCK. Similarly, in Figure 8.1 we can view EMPLOYEE as a generalization of
SECRETARY, TECHNICIAN, and ENGINEER. A diagrammatic notation to distinguish
between generalization and specialization is used in some design methodologies. An
arrow pointing to the generalized superclass represents a generalization, whereas
arrows pointing to the specialized subclasses represent a specialization. We will not
use this notation because the decision as to which process is followed in a particular
situation is often subjective. Appendix A gives some of the suggested alternative dia-
grammatic notations for schema diagrams and class diagrams.
So far we have introduced the concepts of subclasses and superclass/subclass rela-
tionships, as well as the specialization and generalization processes. In general, a

8.3
Constraints and Characteristics of Specialization and Generalization Hierarchies
251
(a)
No_of_passengers
No_of_axles
Max_speed
Tonnage
CAR
Price
Price
TRUCK
Vehicle_id
Vehicle_id
License_plate_no
License_plate_no
(b)
Vehicle_id
Price
License_plate_no
VEHICLE
d
No_of_passengers
No_of_axles
Max_speed
Tonnage
CAR
TRUCK
Figure 8.3
Generalization. (a) Two entity types, CAR and TRUCK. (b)
Generalizing CAR and TRUCK into the superclass VEHICLE.
superclass or subclass represents a collection of entities of the same type and hence
also describes an entity type; that is why superclasses and subclasses are all shown in
rectangles in EER diagrams, like entity types. Next, we discuss the properties of spe-
cializations and generalizations in more detail.
8.3 Constraints and Characteristics
of Specialization and Generalization
Hierarchies
First, we discuss constraints that apply to a single specialization or a single general-
ization. For brevity, our discussion refers only to specialization even though it
applies to both specialization and generalization. Then, we discuss differences
between specialization/generalization lattices (multiple inheritance) and hierarchies
(single inheritance), and elaborate on the differences between the specialization and
generalization processes during conceptual database schema design.
8.3.1
Constraints on Specialization and Generalization
In general, we may have several specializations defined on the same entity type (or
superclass), as shown in Figure 8.1. In such a case, entities may belong to subclasses

252
Chapter 8 The Enhanced Entity-Relationship (EER) Model
in each of the specializations. However, a specialization may also consist of a single
subclass only, such as the {MANAGER} specialization in Figure 8.1; in such a case, we
do not use the circle notation.
In some specializations we can determine exactly the entities that will become
members of each subclass by placing a condition on the value of some attribute of
the superclass. Such subclasses are called predicate-defined (or condition-defined)
subclasses. For example, if the EMPLOYEE entity type has an attribute Job_type, as
shown in Figure 8.4, we can specify the condition of membership in the
SECRETARY subclass by the condition (Job_type = ‘Secretary’), which we call the
defining predicate of the subclass. This condition is a constraint specifying that
exactly those entities of the EMPLOYEE entity type whose attribute value for
Job_type is ‘Secretary’ belong to the subclass. We display a predicate-defined subclass
by writing the predicate conditioext to the line that connects the subclass to the
specialization circle.
If all subclasses in a specialization have their membership condition on the same
attribute of the superclass, the specialization itself is called an attribute-defined spe-
cialization, and the attribute is called the defining attribute of the specialization.6 In
this case, all the entities with the same value for the attribute belong to the same sub-
class. We display an attribute-defined specialization by placing the defining attribute
name next to the arc from the circle to the superclass, as shown in Figure 8.4.
When we do not have a condition for determining membership in a subclass, the
subclass is called user-defined. Membership in such a subclass is determined by the
database users when they apply the operation to add an entity to the subclass; hence,
membership is specified individually for each entity by the user, not by any condition
that may be evaluated automatically.
Figure 8.4
Fname
Minit
Lname
EER diagram notation
for an attribute-defined
Name
Ssn
Birth_date
Address
Job_type
specialization on
Job_type.
EMPLOYEE
Job_type
d
‘Secretary’
‘Engineer’
Typing_speed
Tgrade
Eng_type
‘Technician’
SECRETARY
TECHNICIAN
ENGINEER
6Such an attribute is called a discriminator in UML terminology.

8.3
Constraints and Characteristics of Specialization and Generalization Hierarchies
253
Two other constraints may apply to a specialization. The first is the disjointness (or
disjointedness) constraint, which specifies that the subclasses of the specialization
must be disjoint. This means that an entity can be a member of at most one of the
subclasses of the specialization. A specialization that is attribute-defined implies the
disjointness constraint (if the attribute used to define the membership predicate is
single-valued). Figure 8.4 illustrates this case, where the d in the circle stands for
disjoint. The d notation also applies to user-defined subclasses of a specialization
that must be disjoint, as illustrated by the specialization {HOURLY_EMPLOYEE,
SALARIED_EMPLOYEE} in Figure 8.1. If the subclasses are not constrained to be dis-
joint, their sets of entities may be overlapping; that is, the same (real-world) entity
may be a member of more than one subclass of the specialization. This case, which
is the default, is displayed by placing an o in the circle, as shown in Figure 8.5.
The second constraint on specialization is called the completeness (or totalness)
constraint, which may be total or partial. A total specialization constraint specifies
that every entity in the superclass must be a member of at least one subclass in the
specialization. For example, if every EMPLOYEE must be either an
HOURLY_EMPLOYEE or a SALARIED_EMPLOYEE, then the specialization
{HOURLY_EMPLOYEE, SALARIED_EMPLOYEE} in Figure 8.1 is a total specialization
of EMPLOYEE. This is shown in EER diagrams by using a double line to connect the
superclass to the circle. A single line is used to display a partial specialization,
which allows an entity not to belong to any of the subclasses. For example, if some
EMPLOYEE entities do not belong to any of the subclasses
{SECRETARY,
ENGINEER, TECHNICIAN} in Figures 8.1 and 8.4, then that specialization is partial.7
Notice that the disjointness and completeness constraints are independent. Hence,
we have the following four possible constraints on specialization:
■ Disjoint, total
■ Disjoint, partial
■ Overlapping, total
■ Overlapping, partial
Figure 8.5
Part_no
Description
EER diagram notation
for an overlapping
Manufacture_date
PART
(nondisjoint)
specialization.
Batch_no
o
Supplier_name
Drawing_no
List_price
MANUFACTURED_PART
PURCHASED_PART
7The notation of using single or double lines is similar to that for partial or total participation of an entity
type in a relationship type, as described in Chapter 7.

254
Chapter 8 The Enhanced Entity-Relationship (EER) Model
Of course, the correct constraint is determined from the real-world meaning that
applies to each specialization. In general, a superclass that was identified through
the generalization process usually is total, because the superclass is derived from the
subclasses and hence contains only the entities that are in the subclasses.
Certain insertion and deletion rules apply to specialization (and generalization) as a
consequence of the constraints specified earlier. Some of these rules are as follows:
■ Deleting an entity from a superclass implies that it is automatically deleted
from all the subclasses to which it belongs.
■ Inserting an entity in a superclass implies that the entity is mandatorily
inserted in all predicate-defined (or attribute-defined) subclasses for which
the entity satisfies the defining predicate.
■ Inserting an entity in a superclass of a total specialization implies that the
entity is mandatorily inserted in at least one of the subclasses of the special-
ization.
The reader is encouraged to make a complete list of rules for insertions and dele-
tions for the various types of specializations.
8.3.2
Specialization and Generalization Hierarchies
and Lattices
A subclass itself may have further subclasses specified on it, forming a hierarchy or a
lattice of specializations. For example, in Figure 8.6 ENGINEER is a subclass of
EMPLOYEE and is also a superclass of ENGINEERING_MANAGER; this represents
the real-world constraint that every engineering manager is required to be an engi-
neer. A specialization hierarchy has the constraint that every subclass participates
as a subclass in only one class/subclass relationship; that is, each subclass has only
Figure 8.6
A specialization lattice with shared subclass
ENGINEERING_MANAGER.
EMPLOYEE
d
d
SECRETARY
TECHNICIAN
ENGINEER
MANAGER
HOURLY_EMPLOYEE
SALARIED_EMPLOYEE
ENGINEERING_MANAGER

8.3
Constraints and Characteristics of Specialization and Generalization Hierarchies
255
one parent, which results in a tree structure or strict hierarchy. In contrast, for a
specialization lattice, a subclass can be a subclass in more than one class/subclass
relationship. Hence, Figure 8.6 is a lattice.
Figure 8.7 shows another specialization lattice of more than one level. This may be
part of a conceptual schema for a UNIVERSITY database. Notice that this arrange-
ment would have been a hierarchy except for the STUDENT_ASSISTANT subclass,
which is a subclass in two distinct class/subclass relationships.
The requirements for the part of the UNIVERSITY database shown in Figure 8.7 are
the following:
1. The database keeps track of three types of persons: employees, alumni, and
students. A person can belong to one, two, or all three of these types. Each
person has a name, SSN, sex, address, and birth date.
2. Every employee has a salary, and there are three types of employees: faculty,
staff, and student assistants. Each employee belongs to exactly one of these
types. For each alumnus, a record of the degree or degrees that he or she
Figure 8.7
Name
Sex
Address
A specialization lattice
with multiple inheritance
Ssn
PERSON
Birth_date
for a UNIVERSITY
database.
o
Salary
Major_dept
EMPLOYEE
ALUMNUS
STUDENT
Degrees
Y
ear
Degree
Major
d
d
Percent_time
STAFF
FACULTY
STUDENT_
GRADUATE_
UNDERGRADUATE_
ASSISTANT
STUDENT
STUDENT
Position
Rank
Degree_program
Class
d
Project
Course
RESEARCH_ASSISTANT
TEACHING_ASSISTANT

256
Chapter 8 The Enhanced Entity-Relationship (EER) Model
earned at the university is kept, including the name of the degree, the year
granted, and the major department. Each student has a major department.
3. Each faculty has a rank, whereas each staff member has a staff position.
Student assistants are classified further as either research assistants or teach-
ing assistants, and the percent of time that they work is recorded in the data-
base. Research assistants have their research project stored, whereas teaching
assistants have the current course they work on.
4. Students are further classified as either graduate or undergraduate, with the
specific attributes degree program (M.S., Ph.D., M.B.A., and so on) for
graduate students and class (freshman, sophomore, and so on) for under-
graduates.
In Figure 8.7, all person entities represented in the database are members of the
PERSON entity type, which is specialized into the subclasses
{EMPLOYEE,
ALUMNUS, STUDENT}. This specialization is overlapping; for example, an alumnus
may also be an employee and may also be a student pursuing an advanced degree.
The subclass STUDENT is the superclass for the specialization
{GRADUATE_STUDENT, UNDERGRADUATE_STUDENT}, while EMPLOYEE is the
superclass for the specialization {STUDENT_ASSISTANT, FACULTY, STAFF}. Notice
that STUDENT_ASSISTANT is also a subclass of STUDENT. Finally,
STUDENT_ASSISTANT is the superclass for the specialization into
{RESEARCH_ASSISTANT, TEACHING_ASSISTANT}.
In such a specialization lattice or hierarchy, a subclass inherits the attributes not
only of its direct superclass, but also of all its predecessor superclasses all the way to
the root of the hierarchy or lattice if necessary. For example, an entity in
GRADUATE_STUDENT inherits all the attributes of that entity as a STUDENT and as
a PERSON. Notice that an entity may exist in several leaf nodes of the hierarchy,
where a leaf node is a class that has no subclasses of its own. For example, a member
of GRADUATE_STUDENT may also be a member of RESEARCH_ASSISTANT.
A subclass with more than one superclass is called a shared subclass, such as
ENGINEERING_MANAGER in Figure 8.6. This leads to the concept known as
multiple inheritance, where the shared subclass ENGINEERING_MANAGER directly
inherits attributes and relationships from multiple classes. Notice that the existence
of at least one shared subclass leads to a lattice (and hence to multiple inheritance);
if no shared subclasses existed, we would have a hierarchy rather than a lattice and
only single inheritance would exist. An important rule related to multiple inheri-
tance can be illustrated by the example of the shared subclass STUDENT_ASSISTANT
in Figure 8.7, which inherits attributes from both EMPLOYEE and STUDENT. Here,
both EMPLOYEE and STUDENT inherit the same attributes from PERSON. The rule
states that if an attribute (or relationship) originating in the same superclass
(PERSON) is inherited more than once via different paths (EMPLOYEE and
STUDENT) in the lattice, then it should be included only once in the shared subclass
(STUDENT_ASSISTANT). Hence, the attributes of PERSON are inherited only once in
the STUDENT_ASSISTANT subclass in Figure 8.7.

8.3
Constraints and Characteristics of Specialization and Generalization Hierarchies
257
It is important to note here that some models and languages are limited to single
inheritance and do not allow multiple inheritance (shared subclasses). It is also
important to note that some models do not allow an entity to have multiple types,
and hence an entity can be a member of only one leaf class.8 In such a model, it is
necessary to create additional subclasses as leaf nodes to cover all possible combina-
tions of classes that may have some entity that belongs to all these classes simultane-
ously. For example, in the overlapping specialization of PERSON into {EMPLOYEE,
ALUMNUS, STUDENT} (or {E, A, S} for short), it would be necessary to create seven
subclasses of PERSON in order to cover all possible types of entities: E, A, S, E_A,
E_S, A_S, and E_A_S. Obviously, this can lead to extra complexity.
Although we have used specialization to illustrate our discussion, similar concepts
apply equally to generalization, as we mentioned at the beginning of this section.
Hence, we can also speak of generalization hierarchies and generalization lattices.
8.3.3
Utilizing Specialization and Generalization
in Refining Conceptual Schemas
Now we elaborate on the differences between the specialization and generalization
processes, and how they are used to refine conceptual schemas during conceptual
database design. In the specialization process, we typically start with an entity type
and then define subclasses of the entity type by successive specialization; that is, we
repeatedly define more specific groupings of the entity type. For example, when
designing the specialization lattice in Figure 8.7, we may first specify an entity type
PERSON for a university database. Then we discover that three types of persons will
be represented in the database: university employees, alumni, and students. We cre-
ate the specialization {EMPLOYEE, ALUMNUS, STUDENT} for this purpose and
choose the overlapping constraint, because a person may belong to more than one
of the subclasses. We specialize EMPLOYEE further into
{STAFF, FACULTY,
STUDENT_ASSISTANT}, and specialize STUDENT into {GRADUATE_STUDENT,
UNDERGRADUATE_STUDENT}. Finally, we specialize STUDENT_ASSISTANT into
{RESEARCH_ASSISTANT, TEACHING_ASSISTANT}. This successive specialization
corresponds to a top-down conceptual refinement process during conceptual
schema design. So far, we have a hierarchy; then we realize that
STUDENT_ASSISTANT is a shared subclass, since it is also a subclass of STUDENT,
leading to the lattice.
It is possible to arrive at the same hierarchy or lattice from the other direction. In
such a case, the process involves generalization rather than specialization and corre-
sponds to a bottom-up conceptual synthesis. For example, the database designers
may first discover entity types such as STAFF, FACULTY, ALUMNUS,
GRADUATE_STUDENT, UNDERGRADUATE_STUDENT, RESEARCH_ASSISTANT,
TEACHING_ASSISTANT, and so on; then they generalize {GRADUATE_STUDENT,
8In some models, the class is further restricted to be a leaf node in the hierarchy or lattice.

258
Chapter 8 The Enhanced Entity-Relationship (EER) Model
UNDERGRADUATE_STUDENT} into STUDENT; then they generalize
{RESEARCH_ASSISTANT, TEACHING_ASSISTANT} into STUDENT_ASSISTANT; then
they generalize {STAFF, FACULTY, STUDENT_ASSISTANT} into EMPLOYEE; and
finally they generalize {EMPLOYEE, ALUMNUS, STUDENT} into PERSON.
In structural terms, hierarchies or lattices resulting from either process may be iden-
tical; the only difference relates to the manner or order in which the schema super-
classes and subclasses were created during the design process. In practice, it is likely
that neither the generalization process nor the specialization process is followed
strictly, but that a combination of the two processes is employed. New classes are
continually incorporated into a hierarchy or lattice as they become apparent to users
and designers. Notice that the notion of representing data and knowledge by using
superclass/subclass hierarchies and lattices is quite common in knowledge-based sys-
tems and expert systems, which combine database technology with artificial intelli-
gence techniques. For example, frame-based knowledge representation schemes
closely resemble class hierarchies. Specialization is also common in software engi-
neering design methodologies that are based on the object-oriented paradigm.
8.4 Modeling of UNION Types Using Categories
All of the superclass/subclass relationships we have seen thus far have a single super-
class. A shared subclass such as ENGINEERING_MANAGER in the lattice in Figure
8.6 is the subclass in three distinct superclass/subclass relationships, where each of
the three relationships has a single superclass. However, it is sometimes necessary to
represent a single superclass/subclass relationship with more than one superclass,
where the superclasses represent different entity types. In this case, the subclass will
represent a collection of objects that is a subset of the UNION of distinct entity types;
we call such a subclass a union type or a category.9
For example, suppose that we have three entity types: PERSON, BANK, and
COMPANY.In a database for motor vehicle registration, an owner of a vehicle can be
a person, a bank (holding a lien on a vehicle), or a company. We need to create a
class (collection of entities) that includes entities of all three types to play the role of
vehicle owner. A category (union type) OWNER that is a subclass of the UNION of the
three entity sets of COMPANY, BANK, and PERSON can be created for this purpose.
We display categories in an EER diagram as shown in Figure 8.8. The superclasses
COMPANY, BANK, and PERSON are connected to the circle with the ∪ symbol,
which stands for the set union operation. An arc with the subset symbol connects the
circle to the (subclass) OWNER category. If a defining predicate is needed, it is dis-
played next to the line from the superclass to which the predicate applies. In Figure
8.8 we have two categories: OWNER, which is a subclass of the union of PERSON,
BANK, and COMPANY; and REGISTERED_VEHICLE, which is a subclass of the union
of CAR and TRUCK.
9Our use of the term category is based on the ECR (Entity-Category-Relationship) model (Elmasri et al.
1985).

8.4
Modeling of UNION Types Using Categories
259
Bname
Baddress
BANK
Driver_license_no
Name
Address
Cname
Caddress
Ssn
PERSON
COMPANY
U
OWNER
Lien_or_regular
M
OWNS
Purchase_date
N
License_plate_no
REGISTERED_VEHICLE
U
Vehicle_id
Vehicle_id
Cstyle
Tonnage
Cmake
CAR
TRUCK
Tmake
Figure 8.8
Cyear
Tyear
Two categories (union
types): OWNER and
Cmodel
Tmodel
REGISTERED_VEHICLE.
A category has two or more superclasses that may represent distinct entity types,
whereas other superclass/subclass relationships always have a single superclass. To
better understand the difference, we can compare a category, such as OWNER in
Figure 8.8, with the ENGINEERING_MANAGER shared subclass in Figure 8.6. The
latter is a subclass of each of the three superclasses ENGINEER, MANAGER, and
SALARIED_EMPLOYEE, so an entity that is a member of ENGINEERING_MANAGER
must exist in all three. This represents the constraint that an engineering manager
must be an ENGINEER, a MANAGER, and a SALARIED_EMPLOYEE; that is,
ENGINEERING_MANAGER is a subset of the intersection of the three classes (sets of
entities). On the other hand, a category is a subset of the union of its superclasses.
Hence, an entity that is a member of OWNER must exist in only one of the super-

260
Chapter 8 The Enhanced Entity-Relationship (EER) Model
classes. This represents the constraint that an OWNER may be a COMPANY, a BANK,
or a PERSON in Figure 8.8.
Attribute inheritance works more selectively in the case of categories. For example,
in Figure 8.8 each OWNER entity inherits the attributes of a COMPANY, a PERSON,
or a BANK, depending on the superclass to which the entity belongs. On the other
hand, a shared subclass such as ENGINEERING_MANAGER (Figure 8.6) inherits all
the attributes of its superclasses SALARIED_EMPLOYEE, ENGINEER, and
MANAGER.
It is interesting to note the difference between the category REGISTERED_VEHICLE
(Figure 8.8) and the generalized superclass VEHICLE (Figure 8.3(b)). In Figure
8.3(b), every car and every truck is a VEHICLE; but in Figure
8.8, the
REGISTERED_VEHICLE category includes some cars and some trucks but not neces-
sarily all of them (for example, some cars or trucks may not be registered). In gen-
eral, a specialization or generalization such as that in Figure 8.3(b), if it were partial,
would not preclude VEHICLE from containing other types of entities, such as
motorcycles. However, a category such as REGISTERED_VEHICLE in Figure 8.8
implies that only cars and trucks, but not other types of entities, can be members of
REGISTERED_VEHICLE.
A category can be total or partial. A total category holds the union of all entities in
its superclasses, whereas a partial category can hold a subset of the union. A total cat-
egory is represented diagrammatically by a double line connecting the category and
the circle, whereas a partial category is indicated by a single line.
The superclasses of a category may have different key attributes, as demonstrated by
the OWNER category in Figure 8.8, or they may have the same key attribute, as
demonstrated by the REGISTERED_VEHICLE category. Notice that if a category is
total (not partial), it may be represented alternatively as a total specialization (or a
total generalization). In this case, the choice of which representation to use is sub-
jective. If the two classes represent the same type of entities and share numerous
attributes, including the same key attributes, specialization/generalization is pre-
ferred; otherwise, categorization (union type) is more appropriate.
It is important to note that some modeling methodologies do not have union types.
In these models, a union type must be represented in a roundabout way (see Section
9.2).
8.5 A Sample UNIVERSITY EER Schema,
Design Choices, and Formal Definitions
In this section, we first give an example of a database schema in the EER model to
illustrate the use of the various concepts discussed here and in Chapter 7. Then, we
discuss design choices for conceptual schemas, and finally we summarize the EER
model concepts and define them formally in the same manner in which we formally
defined the concepts of the basic ER model in Chapter 7.

8.5
A Sample UNIVERSITY EER Schema, Design Choices, and Formal Definitions
261
8.5.1
The UNIVERSITY Database Example
For our sample database application, consider a UNIVERSITY database that keeps
track of students and their majors, transcripts, and registration as well as of the uni-
versity’s course offerings. The database also keeps track of the sponsored research
projects of faculty and graduate students. This schema is shown in Figure 8.9. A dis-
cussion of the requirements that led to this schema follows.
For each person, the database maintains information on the person’s Name [Name],
Social Security number [Ssn], address [Address], sex [Sex], and birth date [Bdate].
Two subclasses of the PERSON entity type are identified: FACULTY and STUDENT.
Specific attributes of FACULTY are rank [Rank] (assistant, associate, adjunct,
research, visiting, and so on), office [Foffice], office phone [Fphone], and salary
[Salary]. All faculty members are related to the academic department(s) with which
they are affiliated [BELONGS] (a faculty member can be associated with several
departments, so the relationship is M:N). A specific attribute of STUDENT is [Class]
(freshman=1, sophomore=2, …, graduate student=5). Each STUDENT is also related
to his or her major and minor departments (if known) [MAJOR] and [MINOR], to
the course sections he or she is currently attending [REGISTERED], and to the
courses completed [TRANSCRIPT]. Each TRANSCRIPT instance includes the grade
the student received [Grade] in a section of a course.
GRAD_STUDENT is a subclass of STUDENT, with the defining predicate Class = 5.
For each graduate student, we keep a list of previous degrees in a composite, multi-
valued attribute [Degrees]. We also relate the graduate student to a faculty advisor
[ADVISOR] and to a thesis committee [COMMITTEE], if one exists.
An academic department has the attributes name [Dname], telephone [Dphone], and
office number [Office] and is related to the faculty member who is its chairperson
[CHAIRS] and to the college to which it belongs [CD]. Each college has attributes
college name [Cname], office number [Coffice], and the name of its dean [Dean].
A course has attributes course number [C#], course name [Cname], and course
description [Cdesc]. Several sections of each course are offered, with each section
having the attributes sectioumber [Sec#] and the year and quarter in which the
section was offered ([Year] and [Qtr]).10 Sectioumbers uniquely identify each sec-
tion. The sections being offered during the current quarter are in a subclass
CURRENT_SECTION of SECTION, with the defining predicate Qtr = Current_qtr and
Year = Current_year. Each section is related to the instructor who taught or is teach-
ing it ([TEACH]), if that instructor is in the database.
The category INSTRUCTOR_RESEARCHER is a subset of the union of FACULTY and
GRAD_STUDENT and includes all faculty, as well as graduate students who are sup-
ported by teaching or research. Finally, the entity type GRANT keeps track of
research grants and contracts awarded to the university. Each grant has attributes
grant title [Title], grant number [No], the awarding agency [Agency], and the starting
10We assume that the quarter system rather than the semester system is used in this university.

262
Chapter 8 The Enhanced Entity-Relationship (EER) Model
Fname Minit
Lname
Ssn
Bdate
Sex
No Street Apt_no City State
Zip
Name
PERSON
Address
d
Fphone Salary
Class
Foffice
College
Degree Year
Rank
FACULTY
ADVISOR
STUDENT
1
N
Degrees
1
Class=5
M
N
COMMITTEE
GRAD_STUDENT
PI
N
Title
No
U
GRANT
Agency
N
St_date
MINOR
N
N
Start
1
MAJOR
SUPPORT
Time
M
M
End
M
1
Grade
1
BELONGS
INSTRUCTOR_RESEARCHER
REGISTERED
M
1
N
CHAIRS
N
TRANSCRIPT
TEACH
1
N
N
CURRENT_SECTION
Qtr = Current_qtr and
Year = Current_year
SECTION
DEPARTMENT
Qtr
N
Sec# Year
Dname
Office
CS
Dphone
N
1
1
1
N
CD
COLLEGE
DC
COURSE
Figure 8.9
Cname
Coffice
C#
Cdesc
An EER conceptual schema
Dean
Cname
for a UNIVERSITY database.

8.5
A Sample UNIVERSITY EER Schema, Design Choices, and Formal Definitions
263
date [St_date]. A grant is related to one principal investigator [PI] and to all
researchers it supports [SUPPORT]. Each instance of support has as attributes the
starting date of support [Start], the ending date of the support (if known) [End],
and the percentage of time being spent on the project [Time] by the researcher being
supported.
8.5.2
Design Choices for Specialization/Generalization
It is not always easy to choose the most appropriate conceptual design for a database
application. In Section 7.7.3, we presented some of the typical issues that confront a
database designer when choosing among the concepts of entity types, relationship
types, and attributes to represent a particular miniworld situation as an ER schema.
In this section, we discuss design guidelines and choices for the EER concepts of
specialization/generalization and categories (union types).
As we mentioned in Section 7.7.3, conceptual database design should be considered
as an iterative refinement process until the most suitable design is reached. The fol-
lowing guidelines can help to guide the design process for EER concepts:
■
In general, many specializations and subclasses can be defined to make the
conceptual model accurate. However, the drawback is that the design
becomes quite cluttered. It is important to represent only those subclasses
that are deemed necessary to avoid extreme cluttering of the conceptual
schema.
■
If a subclass has few specific (local) attributes and no specific relationships, it
can be merged into the superclass. The specific attributes would hold NULL
values for entities that are not members of the subclass. A type attribute
could specify whether an entity is a member of the subclass.
■
Similarly, if all the subclasses of a specialization/generalization have few spe-
cific attributes and no specific relationships, they can be merged into the
superclass and replaced with one or more type attributes that specify the sub-
class or subclasses that each entity belongs to (see Section 9.2 for how this
criterion applies to relational databases).
■
Union types and categories should generally be avoided unless the situation
definitely warrants this type of construct, which does occur in some practi-
cal situations. If possible, we try to model using specialization/generalization
as discussed at the end of Section 8.4.
■
The choice of disjoint/overlapping and total/partial constraints on special-
ization/generalization is driven by the rules in the miniworld being modeled.
If the requirements do not indicate any particular constraints, the default
would generally be overlapping and partial, since this does not specify any
restrictions on subclass membership.

264
Chapter 8 The Enhanced Entity-Relationship (EER) Model
As an example of applying these guidelines, consider Figure 8.6, where no specific
(local) attributes are shown. We could merge all the subclasses into the EMPLOYEE
entity type, and add the following attributes to EMPLOYEE:
■ An attribute Job_type whose value set {‘Secretary’, ‘Engineer’, ‘Technician’}
would indicate which subclass in the first specialization each employee
belongs to.
■ An attribute Pay_method whose value set {‘Salaried’, ‘Hourly’} would indicate
which subclass in the second specialization each employee belongs to.
■ An attribute Is_a_manager whose value set {‘Yes’, ‘No’} would indicate
whether an individual employee entity is a manager or not.
8.5.3
Formal Definitions for the EER Model Concepts
We now summarize the EER model concepts and give formal definitions. A class11
is a set or collection of entities; this includes any of the EER schema constructs of
group entities, such as entity types, subclasses, superclasses, and categories. A
subclass S is a class whose entities must always be a subset of the entities in another
class, called the superclass C of the superclass/subclass (or IS-A) relationship. We
denote such a relationship by C/S. For such a superclass/subclass relationship, we
must always have
S⊆C
A specialization Z = {S1, S2, …, Sn} is a set of subclasses that have the same super-
class G; that is, G/Si is a superclass/subclass relationship for i = 1, 2, …, n. G is called
a generalized entity type
(or the superclass of the specialization, or a
generalization of the subclasses {S1, S2, …, Sn} ). Z is said to be total if we always (at
any point in time) have
n
∪Si =G
i=1
Otherwise, Z is said to be partial. Z is said to be disjoint if we always have
Si ∩ Sj = ∅ (empty set) for i ≠ j
Otherwise, Z is said to be overlapping.
A subclass S of C is said to be predicate-defined if a predicate p on the attributes of
C is used to specify which entities in C are members of S; that is, S = C[p], where
C[p] is the set of entities in C that satisfy p. A subclass that is not defined by a pred-
icate is called user-defined.
A specialization Z (or generalization G) is said to be attribute-defined if a predicate
(A = ci), where A is an attribute of G and ci is a constant value from the domain of A,
11The use of the word class here differs from its more common use in object-oriented programming lan-
guages such as C++. In C++, a class is a structured type definition along with its applicable functions
(operations).

8.6
Example of Other Notation: Representing Specialization and Generalization in UML Class Diagrams
265
is used to specify membership in each subclass Si in Z. Notice that if ci ≠ cj for i ≠ j,
and A is a single-valued attribute, then the specialization will be disjoint.
A category T is a class that is a subset of the union of n defining superclasses D1, D2,
…, Dn, n > 1, and is formally specified as follows:
T ⊆ (D1 ∪ D2 … ∪ Dn)
A predicate pi on the attributes of Di can be used to specify the members of each Di
that are members of T. If a predicate is specified on every Di, we get
T = (D1[p1] ∪ D2[p2] … ∪ Dn[pn])
We should now extend the definition of relationship type given in Chapter 7 by
allowing any class—not only any entity type—to participate in a relationship.
Hence, we should replace the words entity type with class in that definition. The
graphical notation of EER is consistent with ER because all classes are represented
by rectangles.
8.6 Example of Other Notation: Representing
Specialization and Generalization in UML
Class Diagrams
We now discuss the UML notation for generalization/specialization and inheri-
tance. We already presented basic UML class diagram notation and terminology in
Section 7.8. Figure 8.10 illustrates a possible UML class diagram corresponding to
the EER diagram in Figure 8.7. The basic notation for specialization/generalization
(see Figure 8.10) is to connect the subclasses by vertical lines to a horizontal line,
which has a triangle connecting the horizontal line through another vertical line to
the superclass. A blank triangle indicates a specialization/generalization with the
disjoint constraint, and a filled triangle indicates an overlapping constraint. The root
superclass is called the base class, and the subclasses (leaf nodes) are called leaf
classes.
The above discussion and example in Figure 8.10, and the presentation in Section
7.8 gave a brief overview of UML class diagrams and terminology. We focused on
the concepts that are relevant to ER and EER database modeling, rather than those
concepts that are more relevant to software engineering. In UML, there are many
details that we have not discussed because they are outside the scope of this book
and are mainly relevant to software engineering. For example, classes can be of var-
ious types:
■ Abstract classes define attributes and operations but do not have objects cor-
responding to those classes. These are mainly used to specify a set of attrib-
utes and operations that can be inherited.
■ Concrete classes can have objects (entities) instantiated to belong to the
class.
■ Template classes specify a template that can be further used to define other
classes.

266
Chapter 8 The Enhanced Entity-Relationship (EER) Model
PERSON
Name
Ssn
Birth_date
Sex
Address
age
EMPLOYEE
ALUMNUS
DEGREE
STUDENT
Salary
Year
Major_dept
1
*
Degree
hire_emp
new_alumnus
change_major
Major
STAFF
FACULTY
STUDENT_ASSISTANT
Position
Rank
Percent_time
hire_staff
promote
hire_student
RESEARCH_
TEACHING_
GRADUATE_
UNDERGRADUATE_
ASSISTANT
ASSISTANT
STUDENT
STUDENT
Project
Course
Degree_program
Class
change_project
assign_to_course
change_degree_program
change_classification
Figure 8.10
A UML class diagram corresponding to the EER diagram in Figure 8.7,
illustrating UML notation for specialization/generalization.
In database design, we are mainly concerned with specifying concrete classes whose
collections of objects are permanently (or persistently) stored in the database. The
bibliographic notes at the end of this chapter give some references to books that
describe complete details of UML. Additional material related to UML is covered in
Chapter 10.

8.7
Data Abstraction, Knowledge Representation, and Ontology Concepts
267
8.7 Data Abstraction, Knowledge
Representation, and Ontology Concepts
In this section we discuss in general terms some of the modeling concepts that we
described quite specifically in our presentation of the ER and EER models in
Chapter 7 and earlier in this chapter. This terminology is not only used in concep-
tual data modeling but also in artificial intelligence literature when discussing
knowledge representation (KR). This section discusses the similarities and differ-
ences between conceptual modeling and knowledge representation, and introduces
some of the alternative terminology and a few additional concepts.
The goal of KR techniques is to develop concepts for accurately modeling some
domain of knowledge by creating an ontology12 that describes the concepts of the
domain and how these concepts are interrelated. Such an ontology is used to store
and manipulate knowledge for drawing inferences, making decisions, or answering
questions. The goals of KR are similar to those of semantic data models, but there
are some important similarities and differences between the two disciplines:
■
Both disciplines use an abstraction process to identify common properties
and important aspects of objects in the miniworld (also known as domain of
discourse in KR) while suppressing insignificant differences and unimpor-
tant details.
■
Both disciplines provide concepts, relationships, constraints, operations, and
languages for defining data and representing knowledge.
■
KR is generally broader in scope than semantic data models. Different forms
of knowledge, such as rules (used in inference, deduction, and search),
incomplete and default knowledge, and temporal and spatial knowledge, are
represented in KR schemes. Database models are being expanded to include
some of these concepts (see Chapter 26).
■
KR schemes include reasoning mechanisms that deduce additional facts
from the facts stored in a database. Hence, whereas most current database
systems are limited to answering direct queries, knowledge-based systems
using KR schemes can answer queries that involve inferences over the stored
data. Database technology is being extended with inference mechanisms (see
Section 26.5).
■
Whereas most data models concentrate on the representation of database
schemas, or meta-knowledge, KR schemes often mix up the schemas with
the instances themselves in order to provide flexibility in representing excep-
tions. This often results in inefficiencies when these KR schemes are imple-
mented, especially when compared with databases and when a large amount
of data (facts) needs to be stored.
12An ontology is somewhat similar to a conceptual schema, but with more knowledge, rules, and excep-
tions.

268
Chapter 8 The Enhanced Entity-Relationship (EER) Model
We now discuss four abstraction concepts that are used in semantic data models,
such as the EER model as well as in KR schemes: (1) classification and instantiation,
(2) identification, (3) specialization and generalization, and (4) aggregation and
association. The paired concepts of classification and instantiation are inverses of
one another, as are generalization and specialization. The concepts of aggregation
and association are also related. We discuss these abstract concepts and their rela-
tion to the concrete representations used in the EER model to clarify the data
abstraction process and to improve our understanding of the related process of con-
ceptual schema design. We close the section with a brief discussion of ontology,
which is being used widely in recent knowledge representation research.
8.7.1
Classification and Instantiation
The process of classification involves systematically assigning similar objects/enti-
ties to object classes/entity types. We caow describe (in DB) or reason about (in
KR) the classes rather than the individual objects. Collections of objects that share
the same types of attributes, relationships, and constraints are classified into classes
in order to simplify the process of discovering their properties. Instantiation is the
inverse of classification and refers to the generation and specific examination of dis-
tinct objects of a class. An object instance is related to its object class by the IS-AN-
INSTANCE-OF or IS-A-MEMBER-OF relationship. Although EER diagrams do
not display instances, the UML diagrams allow a form of instantiation by permit-
ting the display of individual objects. We did not describe this feature in our intro-
duction to UML class diagrams.
In general, the objects of a class should have a similar type structure. However, some
objects may display properties that differ in some respects from the other objects of
the class; these exception objects also need to be modeled, and KR schemes allow
more varied exceptions than do database models. In addition, certain properties
apply to the class as a whole and not to the individual objects; KR schemes allow
such class properties. UML diagrams also allow specification of class properties.
In the EER model, entities are classified into entity types according to their basic
attributes and relationships. Entities are further classified into subclasses and cate-
gories based on additional similarities and differences (exceptions) among them.
Relationship instances are classified into relationship types. Hence, entity types,
subclasses, categories, and relationship types are the different concepts that are used
for classification in the EER model. The EER model does not provide explicitly for
class properties, but it may be extended to do so. In UML, objects are classified into
classes, and it is possible to display both class properties and individual objects.
Knowledge representation models allow multiple classification schemes in which
one class is an instance of another class (called a meta-class). Notice that this cannot
be represented directly in the EER model, because we have only two levels—classes
and instances. The only relationship among classes in the EER model is a super-
class/subclass relationship, whereas in some KR schemes an additional
class/instance relationship can be represented directly in a class hierarchy. An
instance may itself be another class, allowing multiple-level classification schemes.

8.7
Data Abstraction, Knowledge Representation, and Ontology Concepts
269
8.7.2
Identification
Identification is the abstraction process whereby classes and objects are made
uniquely identifiable by means of some identifier. For example, a class name
uniquely identifies a whole class within a schema. An additional mechanism is nec-
essary for telling distinct object instances apart by means of object identifiers.
Moreover, it is necessary to identify multiple manifestations in the database of the
same real-world object. For example, we may have a tuple <‘Matthew Clarke’,
‘610618’, ‘376-9821’> in a PERSON relation and another tuple <‘301-54-0836’, ‘CS’,
3.8> in a STUDENT relation that happen to represent the same real-world entity.
There is no way to identify the fact that these two database objects (tuples) represent
the same real-world entity unless we make a provision at design time for appropriate
cross-referencing to supply this identification. Hence, identification is needed at
two levels:
■ To distinguish among database objects and classes
■ To identify database objects and to relate them to their real-world counter-
parts
In the EER model, identification of schema constructs is based on a system of
unique names for the constructs in a schema. For example, every class in an EER
schema—whether it is an entity type, a subclass, a category, or a relationship type—
must have a distinct name. The names of attributes of a particular class must also be
distinct. Rules for unambiguously identifying attribute name references in a special-
ization or generalization lattice or hierarchy are needed as well.
At the object level, the values of key attributes are used to distinguish among entities
of a particular entity type. For weak entity types, entities are identified by a combi-
nation of their own partial key values and the entities they are related to in the
owner entity type(s). Relationship instances are identified by some combination of
the entities that they relate to, depending on the cardinality ratio specified.
8.7.3
Specialization and Generalization
Specialization is the process of classifying a class of objects into more specialized
subclasses. Generalization is the inverse process of generalizing several classes into
a higher-level abstract class that includes the objects in all these classes.
Specialization is conceptual refinement, whereas generalization is conceptual syn-
thesis. Subclasses are used in the EER model to represent specialization and general-
ization. We call the relationship between a subclass and its superclass an
IS-A-SUBCLASS-OF relationship, or simply an IS-A relationship. This is the same
as the IS-A relationship discussed earlier in Section 8.5.3.
8.7.4
Aggregation and Association
Aggregation is an abstraction concept for building composite objects from their
component objects. There are three cases where this concept can be related to the
EER model. The first case is the situation in which we aggregate attribute values of

270
Chapter 8 The Enhanced Entity-Relationship (EER) Model
an object to form the whole object. The second case is when we represent an aggre-
gation relationship as an ordinary relationship. The third case, which the EER
model does not provide for explicitly, involves the possibility of combining objects
that are related by a particular relationship instance into a higher-level aggregate
object. This is sometimes useful when the higher-level aggregate object is itself to be
related to another object. We call the relationship between the primitive objects and
their aggregate object IS-A-PART-OF; the inverse is called IS-A-COMPONENT-
OF. UML provides for all three types of aggregation.
The abstraction of association is used to associate objects from several independent
classes. Hence, it is somewhat similar to the second use of aggregation. It is repre-
sented in the EER model by relationship types, and in UML by associations. This
abstract relationship is called IS-ASSOCIATED-WITH.
In order to understand the different uses of aggregation better, consider the ER
schema shown in Figure 8.11(a), which stores information about interviews by job
applicants to various companies. The class COMPANY is an aggregation of the
attributes (or component objects) Cname (company name) and Caddress (company
address), whereas JOB_APPLICANT is an aggregate of Ssn, Name, Address, and Phone.
The relationship attributes Contact_name and Contact_phone represent the name
and phone number of the person in the company who is responsible for the inter-
view. Suppose that some interviews result in job offers, whereas others do not. We
would like to treat INTERVIEW as a class to associate it with JOB_OFFER. The
schema shown in Figure 8.11(b) is incorrect because it requires each interview rela-
tionship instance to have a job offer. The schema shown in Figure 8.11(c) is not
allowed because the ER model does not allow relationships among relationships.
One way to represent this situation is to create a higher-level aggregate class com-
posed of COMPANY, JOB_APPLICANT, and INTERVIEW and to relate this class to
JOB_OFFER, as shown in Figure 8.11(d). Although the EER model as described in
this book does not have this facility, some semantic data models do allow it and call
the resulting object a composite or molecular object. Other models treat entity
types and relationship types uniformly and hence permit relationships among rela-
tionships, as illustrated in Figure 8.11(c).
To represent this situation correctly in the ER model as described here, we need to
create a new weak entity type INTERVIEW, as shown in Figure 8.11(e), and relate it to
JOB_OFFER. Hence, we can always represent these situations correctly in the ER
model by creating additional entity types, although it may be conceptually more
desirable to allow direct representation of aggregation, as in Figure 8.11(d), or to
allow relationships among relationships, as in Figure 8.11(c).
The main structural distinction between aggregation and association is that when
an association instance is deleted, the participating objects may continue to exist.
However, if we support the notion of an aggregate object—for example, a CAR that
is made up of objects ENGINE, CHASSIS, and TIRES—then deleting the aggregate
CAR object amounts to deleting all its component objects.

8.7
Data Abstraction, Knowledge Representation, and Ontology Concepts
271
Figure 8.11
(a)
Contact_name Contact_phone
Aggregation. (a) The relation-
ship type INTERVIEW. (b)
Date
Cname Caddress
Name Ssn Phone Address
Including JOB_OFFER in a
ternary relationship type
(incorrect). (c) Having the
COMPANY
INTERVIEW
JOB_APPLICANT
RESULTS_IN relationship par-
ticipate in other relationships
(not allowed in ER). (d) Using
aggregation and a composite
(b)
(molecular) object (generally
COMPANY
INTERVIEW
JOB_APPLICANT
not allowed in ER but allowed
by some modeling tools). (e)
Correct representation in ER.
JOB_OFFER
(c)
COMPANY
INTERVIEW
JOB_APPLICANT
RESULTS_IN
JOB_OFFER
(d)
COMPANY
INTERVIEW
JOB_APPLICANT
RESULTS_IN
JOB_OFFER
(e)
Cname Caddress
Name Ssn Phone Address
COMPANY
CJI
JOB_APPLICANT
Contact_phone
Contact_name
Date
INTERVIEW
RESULTS_IN
JOB_OFFER

272
Chapter 8 The Enhanced Entity-Relationship (EER) Model
8.7.5
Ontologies and the Semantic Web
In recent years, the amount of computerized data and information available on the
Web has spiraled out of control. Many different models and formats are used. In
addition to the database models that we present in this book, much information is
stored in the form of documents, which have considerably less structure than data-
base information does. One ongoing project that is attempting to allow information
exchange among computers on the Web is called the Semantic Web, which attempts
to create knowledge representation models that are quite general in order to allow
meaningful information exchange and search among machines. The concept of
ontology is considered to be the most promising basis for achieving the goals of the
Semantic Web and is closely related to knowledge representation. In this section, we
give a brief introduction to what ontology is and how it can be used as a basis to
automate information understanding, search, and exchange.
The study of ontologies attempts to describe the structures and relationships that
are possible in reality through some common vocabulary; therefore, it can be con-
sidered as a way to describe the knowledge of a certain community about reality.
Ontology originated in the fields of philosophy and metaphysics. One commonly
used definition of ontology is a specification of a conceptualization.13
In this definition, a conceptualization is the set of concepts that are used to repre-
sent the part of reality or knowledge that is of interest to a community of users.
Specification refers to the language and vocabulary terms that are used to specify
the conceptualization. The ontology includes both specification and
conceptualization. For example, the same conceptualization may be specified in two
different languages, giving two separate ontologies. Based on this quite general def-
inition, there is no consensus on what an ontology is exactly. Some possible ways to
describe ontologies are as follows:
■ A thesaurus (or even a dictionary or a glossary of terms) describes the rela-
tionships between words (vocabulary) that represent various concepts.
■ A taxonomy describes how concepts of a particular area of knowledge are
related using structures similar to those used in a specialization or general-
ization.
■ A detailed database schema is considered by some to be an ontology that
describes the concepts (entities and attributes) and relationships of a mini-
world from reality.
■ A logical theory uses concepts from mathematical logic to try to define con-
cepts and their interrelationships.
Usually the concepts used to describe ontologies are quite similar to the concepts we
discussed in conceptual modeling, such as entities, attributes, relationships, special-
izations, and so on. The main difference between an ontology and, say, a database
schema, is that the schema is usually limited to describing a small subset of a mini-
13This definition is given in Gruber (1995).

Review Questions
273
world from reality in order to store and manage data. An ontology is usually consid-
ered to be more general in that it attempts to describe a part of reality or a domain
of interest (for example, medical terms, electronic-commerce applications, sports,
and so on) as completely as possible.
8.8 Summary
In this chapter we discussed extensions to the ER model that improve its representa-
tional capabilities. We called the resulting model the enhanced ER or EER model.
We presented the concept of a subclass and its superclass and the related mechanism
of attribute/relationship inheritance. We saw how it is sometimes necessary to create
additional classes of entities, either because of additional specific attributes or
because of specific relationship types. We discussed two main processes for defining
superclass/subclass hierarchies and lattices: specialization and generalization.
Next, we showed how to display these new constructs in an EER diagram. We also
discussed the various types of constraints that may apply to specialization or gener-
alization. The two main constraints are total/partial and disjoint/overlapping. In
addition, a defining predicate for a subclass or a defining attribute for a specializa-
tion may be specified. We discussed the differences between user-defined and
predicate-defined subclasses and between user-defined and attribute-defined spe-
cializations. Finally, we discussed the concept of a category or union type, which is a
subset of the union of two or more classes, and we gave formal definitions of all the
concepts presented.
We introduced some of the notation and terminology of UML for representing spe-
cialization and generalization. In Section 8.7 we briefly discussed the discipline of
knowledge representation and how it is related to semantic data modeling. We also
gave an overview and summary of the types of abstract data representation con-
cepts: classification and instantiation, identification, specialization and generaliza-
tion, and aggregation and association. We saw how EER and UML concepts are
related to each of these.
Review Questions
8.1. What is a subclass? When is a subclass needed in data modeling?
8.2. Define the following terms: superclass of a subclass, superclass/subclass rela-
tionship, IS-A relationship, specialization, generalization, category, specific
(local) attributes, and specific relationships.
8.3. Discuss the mechanism of attribute/relationship inheritance. Why is it use-
ful?
8.4. Discuss user-defined and predicate-defined subclasses, and identify the dif-
ferences between the two.
8.5. Discuss user-defined and attribute-defined specializations, and identify the
differences between the two.

274
Chapter 8 The Enhanced Entity-Relationship (EER) Model
8.6.
Discuss the two main types of constraints on specializations and generaliza-
tions.
8.7.
What is the difference between a specialization hierarchy and a specialization
lattice?
8.8.
What is the difference between specialization and generalization? Why do we
not display this difference in schema diagrams?
8.9.
How does a category differ from a regular shared subclass? What is a cate-
gory used for? Illustrate your answer with examples.
8.10.
For each of the following UML terms (see Sections 7.8 and 8.6), discuss the
corresponding term in the EER model, if any: object, class, association,
aggregation, generalization, multiplicity, attributes, discriminator, link, link
attribute, reflexive association, and qualified association.
8.11.
Discuss the main differences between the notation for EER schema diagrams
and UML class diagrams by comparing how common concepts are repre-
sented in each.
8.12.
List the various data abstraction concepts and the corresponding modeling
concepts in the EER model.
8.13.
What aggregation feature is missing from the EER model? How can the EER
model be further enhanced to support it?
8.14.
What are the main similarities and differences between conceptual database
modeling techniques and knowledge representation techniques?
8.15.
Discuss the similarities and differences between an ontology and a database
schema.
Exercises
8.16. Design an EER schema for a database application that you are interested in.
Specify all constraints that should hold on the database. Make sure that the
schema has at least five entity types, four relationship types, a weak entity
type, a superclass/subclass relationship, a category, and an n-ary (n > 2) rela-
tionship type.
8.17. Consider the BANK ER schema in Figure 7.21, and suppose that it is neces-
sary to keep track of different types of ACCOUNTS (SAVINGS_ACCTS,
CHECKING_ACCTS, …) and LOANS (CAR_LOANS, HOME_LOANS, …).
Suppose that it is also desirable to keep track of each ACCOUNT’s
TRANSACTIONS (deposits, withdrawals, checks, …) and each LOAN’s
PAYMENTS; both of these include the amount, date, and time. Modify the
BANK schema, using ER and EER concepts of specialization and generaliza-
tion. State any assumptions you make about the additional requirements.

Exercises
275
8.18.
The following narrative describes a simplified version of the organization of
Olympic facilities planned for the summer Olympics. Draw an EER diagram
that shows the entity types, attributes, relationships, and specializations for
this application. State any assumptions you make. The Olympic facilities are
divided into sports complexes. Sports complexes are divided into one-sport
and multisport types. Multisport complexes have areas of the complex desig-
nated for each sport with a location indicator (e.g., center, NE corner, and so
on). A complex has a location, chief organizing individual, total occupied
area, and so on. Each complex holds a series of events (e.g., the track stadium
may hold many different races). For each event there is a planned date, dura-
tion, number of participants, number of officials, and so on. A roster of all
officials will be maintained together with the list of events each official will
be involved in. Different equipment is needed for the events (e.g., goal posts,
poles, parallel bars) as well as for maintenance. The two types of facilities
(one-sport and multisport) will have different types of information. For
each type, the number of facilities needed is kept, together with an approxi-
mate budget.
8.19.
Identify all the important concepts represented in the library database case
study described below. In particular, identify the abstractions of classifica-
tion (entity types and relationship types), aggregation, identification, and
specialization/generalization. Specify (min, max) cardinality constraints
whenever possible. List details that will affect the eventual design but that
have no bearing on the conceptual design. List the semantic constraints sep-
arately. Draw an EER diagram of the library database.
Case Study: The Georgia Tech Library (GTL) has approximately 16,000
members, 100,000 titles, and 250,000 volumes (an average of 2.5 copies per
book). About 10 percent of the volumes are out on loan at any one time. The
librarians ensure that the books that members want to borrow are available
when the members want to borrow them. Also, the librarians must know how
many copies of each book are in the library or out on loan at any given time.
A catalog of books is available online that lists books by author, title, and sub-
ject area. For each title in the library, a book description is kept in the catalog
that ranges from one sentence to several pages. The reference librarians want
to be able to access this description when members request information
about a book. Library staff includes chief librarian, departmental associate
librarians, reference librarians, check-out staff, and library assistants.
Books can be checked out for 21 days. Members are allowed to have only five
books out at a time. Members usually return books within three to four
weeks. Most members know that they have one week of grace before a notice
is sent to them, so they try to return books before the grace period ends.
About 5 percent of the members have to be sent reminders to return books.
Most overdue books are returned within a month of the due date.
Approximately 5 percent of the overdue books are either kept or never
returned. The most active members of the library are defined as those who

276
Chapter 8 The Enhanced Entity-Relationship (EER) Model
borrow books at least ten times during the year. The top 1 percent of mem-
bership does 15 percent of the borrowing, and the top 10 percent of the
membership does 40 percent of the borrowing. About 20 percent of the
members are totally inactive in that they are members who never borrow.
To become a member of the library, applicants fill out a form including their
SSN, campus and home mailing addresses, and phone numbers. The librari-
ans issue a numbered, machine-readable card with the member’s photo on
it. This card is good for four years. A month before a card expires, a notice is
sent to a member for renewal. Professors at the institute are considered auto-
matic members. When a new faculty member joins the institute, his or her
information is pulled from the employee records and a library card is mailed
to his or her campus address. Professors are allowed to check out books for
three-month intervals and have a two-week grace period. Renewal notices to
professors are sent to their campus address.
The library does not lend some books, such as reference books, rare books,
and maps. The librarians must differentiate between books that can be lent
and those that cannot be lent. In addition, the librarians have a list of some
books they are interested in acquiring but cannot obtain, such as rare or out-
of-print books and books that were lost or destroyed but have not been
replaced. The librarians must have a system that keeps track of books that
cannot be lent as well as books that they are interested in acquiring. Some
books may have the same title; therefore, the title cannot be used as a means
of identification. Every book is identified by its International Standard Book
Number (ISBN), a unique international code assigned to all books. Two
books with the same title can have different ISBNs if they are in different lan-
guages or have different bindings (hardcover or softcover). Editions of the
same book have different ISBNs.
The proposed database system must be designed to keep track of the mem-
bers, the books, the catalog, and the borrowing activity.
8.20.
Design a database to keep track of information for an art museum. Assume
that the following requirements were collected:
■ The museum has a collection of ART_OBJECTS. Each ART_OBJECT has a
unique Id_no, an Artist (if known), a Year (when it was created, if known),
a Title, and a Description. The art objects are categorized in several ways, as
discussed below.
■ ART_OBJECTS are categorized based on their type. There are three main
types: PAINTING, SCULPTURE, and STATUE, plus another type called
OTHER to accommodate objects that do not fall into one of the three
main types.
■ A PAINTING has a Paint_type (oil, watercolor, etc.), material on which it is
Drawn_on (paper, canvas, wood, etc.), and Style (modern, abstract, etc.).
■ A SCULPTURE or a statue has a Material from which it was created (wood,
stone, etc.), Height, Weight, and Style.

Exercises
277
■ An art object in the OTHER category has a Type (print, photo, etc.) and
Style.
■ ART_OBJECTs are categorized as either PERMANENT_COLLECTION
(objects that are owned by the museum) and BORROWED. Information
captured about objects in the PERMANENT_COLLECTION includes
Date_acquired, Status
(on display, on loan, or stored), and Cost.
Information captured about BORROWED objects includes the Collection
from which it was borrowed, Date_borrowed, and Date_returned.
■ Information describing the country or culture of Origin (Italian, Egyptian,
American, Indian, and so forth) and Epoch (Renaissance, Modern,
Ancient, and so forth) is captured for each ART_OBJECT.
■ The museum keeps track of ARTIST information, if known: Name,
DateBorn (if known), Date_died (if not living), Country_of_origin, Epoch,
Main_style, and Description. The Name is assumed to be unique.
■ Different EXHIBITIONS occur, each having a Name, Start_date, and
End_date. EXHIBITIONS are related to all the art objects that were on dis-
play during the exhibition.
■ Information is kept on other COLLECTIONS with which the museum
interacts, including Name (unique), Type (museum, personal, etc.),
Description, Address, Phone, and current Contact_person.
Draw an EER schema diagram for this application. Discuss any assumptions
you make, and that justify your EER design choices.
8.21.
Figure 8.12 shows an example of an EER diagram for a small private airport
database that is used to keep track of airplanes, their owners, airport
employees, and pilots. From the requirements for this database, the follow-
ing information was collected: Each AIRPLANE has a registratioumber
[Reg#], is of a particular plane type [OF_TYPE], and is stored in a particular
hangar [STORED_IN]. Each PLANE_TYPE has a model number [Model], a
capacity [Capacity], and a weight [Weight]. Each HANGAR has a number
[Number], a capacity [Capacity], and a location [Location]. The database also
keeps track of the OWNERs of each plane [OWNS] and the EMPLOYEEs who
have maintained the plane [MAINTAIN]. Each relationship instance in OWNS
relates an AIRPLANE to an OWNER and includes the purchase date [Pdate].
Each relationship instance in MAINTAIN relates an EMPLOYEE to a service
record [SERVICE]. Each plane undergoes service many times; hence, it is
related by [PLANE_SERVICE] to a number of SERVICE records. A SERVICE
record includes as attributes the date of maintenance [Date], the number of
hours spent on the work [Hours], and the type of work done [Work_code].
We use a weak entity type [SERVICE] to represent airplane service, because
the airplane registratioumber is used to identify a service record. An
OWNER is either a person or a corporation. Hence, we use a union type (cat-
egory)
[OWNER] that is a subset of the union of corporation
[CORPORATION] and person [PERSON] entity types. Both pilots [PILOT]
and employees [EMPLOYEE] are subclasses of PERSON. Each PILOT has

278
Chapter 8 The Enhanced Entity-Relationship (EER) Model
Sala ry
Shift
Model Capacity
Weight
M
N
WORKS_ON
EMPLOYEE
N
MAINTAIN
PLANE_TYPE
Restr
Lic_num
M
1
M
N
FLIES
PILOT
OF_TYPE
Date
Workcode
N
Date/workcode
SERVICE
Hours
Reg#
N
1
AIRPLANE
PLANE_SERVICE
N
Pdate
STORED_IN
OWNS
OWNER
M
N
1
U
HANGAR
CORPORATION
Ssn
PERSON
Number
Location
Name
Phone
Name
Phone
Capacity
Add ress
Add ress
Figure 8.12
EER schema for a SMALL_AIRPORT database.
specific attributes license number [Lic_num] and restrictions [Restr]; each
EMPLOYEE has specific attributes salary [Salary] and shift worked [Shift]. All
PERSON entities in the database have data kept on their Social Security
number [Ssn], name [Name], address [Address], and telephone number
[Phone]. For CORPORATION entities, the data kept includes name [Name],
address [Address], and telephone number [Phone]. The database also keeps
track of the types of planes each pilot is authorized to fly [FLIES] and the
types of planes each employee can do maintenance work on [WORKS_ON].

Exercises
279
Show how the SMALL_AIRPORT EER schema in Figure 8.12 may be repre-
sented in UML notation. (Note: We have not discussed how to represent cat-
egories (union types) in UML, so you do not have to map the categories in
this and the following question.)
8.22. Show how the UNIVERSITY EER schema in Figure 8.9 may be represented in
UML notation.
8.23. Consider the entity sets and attributes shown in the table below. Place a
checkmark in one column in each row to indicate the relationship between
the far left and right columns.
a. The left side has a relationship with the right side.
b. The right side is an attribute of the left side.
c. The left side is a specialization of the right side.
d. The left side is a generalization of the right side.
(a) Has a
(b) Has an
(c) Is a
(d) Is a
Relationship
Attribute
Specialization
Generalization
Entity Set
Entity Set
with
that is
of
of
or Attribute
1.
MOTHER
PERSON
2.
DAUGHTER
MOTHER
3.
STUDENT
PERSON
4.
STUDENT
Student_id
5.
SCHOOL
STUDENT
6.
SCHOOL
CLASS_ROOM
7.
ANIMAL
HORSE
8.
HORSE
Breed
9.
HORSE
Age
10.
EMPLOYEE
SSN
11.
FURNITURE
CHAIR
12.
CHAIR
Weight
13.
HUMAN
WOMAN
14.
SOLDIER
PERSON
15.
ENEMY_COMBATANT
PERSON
8.24. Draw a UML diagram for storing a played game of chess in a database. You
may look at http://www.chessgames.com for an application similar to what
you are designing. State clearly any assumptions you make in your UML dia-
gram. A sample of assumptions you can make about the scope is as follows:
1. The game of chess is played between two players.
2. The game is played on an 8 × 8 board like the one shown below:

280
Chapter 8 The Enhanced Entity-Relationship (EER) Model
3. The players are assigned a color of black or white at the start of the game.
4. Each player starts with the following pieces (traditionally called chess-
men):
a. king
d.
2 bishops
b. queen
e.
2 knights
c.
2 rooks
f.
8 pawns
5. Every piece has its own initial position.
6. Every piece has its own set of legal moves based on the state of the game.
You do not need to worry about which moves are or are not legal except
for the following issues:
a. A piece may move to an empty square or capture an opposing piece.
b. If a piece is captured, it is removed from the board.
c. If a pawn moves to the last row, it is “promoted” by converting it to
another piece (queen, rook, bishop, or knight).
Note: Some of these functions may be spread over multiple classes.
8.25.
Draw an EER diagram for a game of chess as described in Exercise 8.24.
Focus on persistent storage aspects of the system. For example, the system
would need to retrieve all the moves of every game played in sequential
order.
8.26.
Which of the following EER diagrams is/are incorrect and why? State clearly
any assumptions you make.
a.
E1
N
E
o
R
E3
1
E2
b.
E1
1
E
d
R
1
E2

Laboratory Exercises
281
c.
o
E1
E3
M
N
R
8.27.
Consider the following EER diagram that describes the computer systems at
a company. Provide your own attributes and key for each entity type. Supply
max cardinality constraints justifying your choice. Write a complete narra-
tive description of what this EER diagram represents.
INSTALLED
SOLD_WITH
SOFTWARE
COMPUTER
INSTALLED_OS
OPERATING_
d
SYSTEM
LAPTOP
DESKTOP
OPTIONS
COMPONENT
MEM_OPTIONS
SUPPORTS
ACCESSORY
d
d
MEMORY
VIDEO_CARD
SOUND_CARD
KEYBOARD
MOUSE
MONITOR
Laboratory Exercises
8.28. Consider a GRADE_BOOK database in which instructors within an academic department record
points earned by individual students in their classes. The data requirements are summarized as
follows:
■ Each student is identified by a unique identifier, first and last name, and an e-mail address.
■ Each instructor teaches certain courses each term. Each course is identified by a course num-
ber, a sectioumber, and the term in which it is taught. For each course he or she teaches, the

282
Chapter 8 The Enhanced Entity-Relationship (EER) Model
instructor specifies the minimum number of points required in order to
earn letter grades A, B, C, D, and F. For example, 90 points for an A, 80
points for a B, 70 points for a C, and so forth.
■ Students are enrolled in each course taught by the instructor.
■ Each course has a number of grading components (such as midterm
exam, final exam, project, and so forth). Each grading component has a
maximum number of points (such as 100 or 50) and a weight (such as
20% or 10%). The weights of all the grading components of a course usu-
ally total 100.
■ Finally, the instructor records the points earned by each student in each of
the grading components in each of the courses. For example, student
1234 earns 84 points for the midterm exam grading component of the
section 2 course CSc2310 in the fall term of 2009. The midterm exam
grading component may have been defined to have a maximum of 100
points and a weight of 20% of the course grade.
Design an Enhanced Entity-Relationship diagram for the grade book data-
base and build the design using a data modeling tool such as ERwin or
Rational Rose.
8.29.
Consider an ONLINE_AUCTION database system in which members (buyers
and sellers) participate in the sale of items. The data requirements for this
system are summarized as follows:
■ The online site has members, each of whom is identified by a unique
member number and is described by an e-mail address, name, password,
home address, and phone number.
■ A member may be a buyer or a seller. A buyer has a shipping address
recorded in the database. A seller has a bank account number and routing
number recorded in the database.
■ Items are placed by a seller for sale and are identified by a unique item
number assigned by the system. Items are also described by an item title, a
description, starting bid price, bidding increment, the start date of the
auction, and the end date of the auction.
■ Items are also categorized based on a fixed classification hierarchy (for
example, a modem may be classified as COMPUTER→HARDWARE
→MODEM).
■ Buyers make bids for items they are interested in. Bid price and time of
bid is recorded. The bidder at the end of the auction with the highest bid
price is declared the winner and a transaction between buyer and seller
may then proceed.
■ The buyer and seller may record feedback regarding their completed
transactions. Feedback contains a rating of the other party participating
in the transaction (1-10) and a comment.

Laboratory Exercises
283
Design an Enhanced Entity-Relationship diagram for the ONLINE_AUCTION
database and build the design using a data modeling tool such as ERwin or
Rational Rose.
8.30.
Consider a database system for a baseball organization such as the major
leagues. The data requirements are summarized as follows:
■
The personnel involved in the league include players, coaches, managers,
and umpires. Each is identified by a unique personnel id. They are also
described by their first and last names along with the date and place of
birth.
■
Players are further described by other attributes such as their batting ori-
entation (left, right, or switch) and have a lifetime batting average (BA).
■
Within the players group is a subset of players called pitchers. Pitchers
have a lifetime ERA (earned run average) associated with them.
■
Teams are uniquely identified by their names. Teams are also described by
the city in which they are located and the division and league in which
they play (such as Central division of the American League).
■
Teams have one manager, a number of coaches, and a number of players.
■
Games are played between two teams with one designated as the home
team and the other the visiting team on a particular date. The score (runs,
hits, and errors) are recorded for each team. The team with the most runs
is declared the winner of the game.
■
With each finished game, a winning pitcher and a losing pitcher are
recorded. In case there is a save awarded, the save pitcher is also recorded.
■
With each finished game, the number of hits (singles, doubles, triples, and
home runs) obtained by each player is also recorded.
Design an Enhanced Entity-Relationship diagram for the BASEBALL data-
base and enter the design using a data modeling tool such as ERwin or
Rational Rose.
8.31.
Consider the EER diagram for the UNIVERSITY database shown in Figure
8.9. Enter this design using a data modeling tool such as ERwin or Rational
Rose. Make a list of the differences iotation between the diagram in the
text and the corresponding equivalent diagrammatic notation you end up
using with the tool.
8.32.
Consider the EER diagram for the small AIRPORT database shown in Figure
8.12. Build this design using a data modeling tool such as ERwin or Rational
Rose. Be careful as to how you model the category OWNER in this diagram.
(Hint: Consider using CORPORATION_IS_OWNER and PERSON_IS_
OWNER as two distinct relationship types.)
8.33.
Consider the UNIVERSITY database described in Exercise 7.16. You already
developed an ER schema for this database using a data modeling tool such as

284
Chapter 8 The Enhanced Entity-Relationship (EER) Model
ERwin or Rational Rose in Lab Exercise 7.31. Modify this diagram by classi-
fying COURSES as either UNDERGRAD_COURSES or GRAD_COURSES
and INSTRUCTORS as either JUNIOR_PROFESSORS or
SENIOR_PROFESSORS. Include appropriate attributes for these new entity
types. Then establish relationships indicating that junior instructors teach
undergraduate courses while senior instructors teach graduate courses.
Selected Bibliography
Many papers have proposed conceptual or semantic data models. We give a repre-
sentative list here. One group of papers, including Abrial (1974), Senko’s DIAM
model (1975), the NIAM method (Verheijen and VanBekkum 1982), and Bracchi et
al. (1976), presents semantic models that are based on the concept of binary rela-
tionships. Another group of early papers discusses methods for extending the rela-
tional model to enhance its modeling capabilities. This includes the papers by
Schmid and Swenson (1975), Navathe and Schkolnick (1978), Codd’s RM/T model
(1979), Furtado (1978), and the structural model of Wiederhold and Elmasri
(1979).
The ER model was proposed originally by Chen (1976) and is formalized in Ng
(1981). Since then, numerous extensions of its modeling capabilities have been pro-
posed, as in Scheuermann et al. (1979), Dos Santos et al. (1979), Teorey et al. (1986),
Gogolla and Hohenstein (1991), and the Entity-Category-Relationship (ECR)
model of Elmasri et al. (1985). Smith and Smith (1977) present the concepts of gen-
eralization and aggregation. The semantic data model of Hammer and McLeod
(1981) introduced the concepts of class/subclass lattices, as well as other advanced
modeling concepts.
A survey of semantic data modeling appears in Hull and King (1987). Eick (1991)
discusses design and transformations of conceptual schemas. Analysis of con-
straints for n-ary relationships is given in Soutou (1998). UML is described in detail
in Booch, Rumbaugh, and Jacobson (1999). Fowler and Scott (2000) and Stevens
and Pooley (2000) give concise introductions to UML concepts.
Fensel (2000, 2003) discuss the Semantic Web and application of ontologies.
Uschold and Gruninger (1996) and Gruber (1995) discuss ontologies. The June
2002 issue of Communications of the ACM is devoted to ontology concepts and
applications. Fensel (2003) is a book that discusses ontologies and e-commerce.

chapter
9
Relational Database
Design by ER- and
EER-to-Relational Mapping
his chapter discusses how to design a relational
T
database schema based on a conceptual schema
design. Figure 7.1 presented a high-level view of the database design process, and in
this chapter we focus on the logical database design or data model mapping step of
database design. We present the procedures to create a relational schema from an
Entity-Relationship (ER) or an Enhanced ER (EER) schema. Our discussion relates
the constructs of the ER and EER models, presented in Chapters 7 and 8, to the con-
structs of the relational model, presented in Chapters 3 through 6. Many computer-
aided software engineering (CASE) tools are based on the ER or EER models, or
other similar models, as we have discussed in Chapters 7 and 8. Many tools use ER
or EER diagrams or variations to develop the schema graphically, and then convert
it automatically into a relational database schema in the DDL of a specific relational
DBMS by employing algorithms similar to the ones presented in this chapter.
We outline a seven-step algorithm in Section 9.1 to convert the basic ER model con-
structs—entity types (strong and weak), binary relationships (with various struc-
tural constraints), n-ary relationships, and attributes (simple, composite, and
multivalued)—into relations. Then, in Section 9.2, we continue the mapping algo-
rithm by describing how to map EER model constructs—specialization/generaliza-
tion and union types (categories)—into relations. Section 9.3 summarizes the
chapter.
285