Lecture 4
RDB Design
Functional Dependencies and Normalization for Relational Databases


Theoretical Knowledge
    Informal Design Guidelines for Relation Schema
    Functional Dependencies
    First Normal Form
    Full Functional Dependency
    Second Normal Form
    Boyce-Codd Normal Form
    Third Normal Form
    The Normalization Process
    Databases - Physical Design
    Properties of a Transaction
    Problems with DBMS Operations
Practical Knowledge
    RDB Design
    Exercises


Informal Design Guidelines for Relation Schema


Functional Dependencies

Definition : If a and b are attributes or sets of attributes of relation R, we say that b is Functionally Dependent on a if each value of a in R has associated with it exactly one value of b in R( a ® b)

When a functional dependency exists, the attribute or set of attributes on the left side of the arrow is called a determinant.

Student Table (Each student has at most one major)

stuid stuname major credits status socsecno
s1001 Smith, Tom history 90 senion 100429500
s1003 Jones, Mary math 95 senior 010124567
s1006 Lee, Pamela csc 15 freshman 088520876
s1010 Burns, Edward art 63 junior 099320985
s1060 Jones, Mary csc 25 freshman 064624738

student( stuid, stuname, major, credits, status, socsecno )

stuid ® stuname

stuid ® stuname, major, credits, status, socsecno

sosecno ® stuid, stuname, major, credits, status

credits ® statu

Superkeys, Candidate Keys, and Primary Keys

Superkey is an attribute or a set of attributes that identifies an entity uniquely.

{stuid}, { stuid, any other attribute} {socsecno, and other attribute} 

Candidate key is a superkey such that no proper subset of its attributes is itself a superkey

{stuid}, {socsecno}, {stuname, major} 

Primary key is a candidate key that is used to identify tuples in a relation. None of its attributes may have null values.

{stuid}, {socsecno} 


First Normal Form (1NF)

Definition : A relation is in First Normal Form (1NF) if and only if every attribute is single-valued for each tuple.

 

 stuid stuname major credits status socsecno
s1001 Smith, Tom history 90 sen 100429500
s1003 Jones, Mary math 95 sen 010124567
s1006 Lee, Pamela csc

math

15 fresh 088520876
s1010 Burns, Edward art

english

63 jun 099320985
s1060 Jones, Mary csc 25 fresh 064624738

Student table (students may have more than one major)


 stuid stuname major credits status socsecno
s1001 Smith, Tom history 90 sen 100429500
s1003 Jones, Mary math 95 sen 010124567
s1006 Lee, Pamela csc 15 fresh 088520876
s1006 Lee, Pamela math 15 fresh 088520876
s1010 Burns, Edward art 63 jun 099320985
s1010 Burns, Edward english 63 jun 099320985
s1060 Jones, Mary csc 25 fresh 064624738

Student table rewritten in 1NF

{stuid, major}


Full Functional Dependency

Definition: In a relation R, attribute b of R is Fully Functionally Dependent on an attribute or set of attributes a of R if b is functionally dependent on a but not functionally dependent on any proper subset of a.

class( course#, stuid, stuname, facid, sched, room, grade )

 course# stuid stuname facid sched room grade
art103a s1001 Smith, Tom f101 mwf9 h221 a
art103a s1010 Burns, Edward f101 mwf9 h221  
art103a s1006 Lee, Pamela f101 mwf9 h221 b
csc201a s1003 Jones, Marry f105 tuthf10 m110 a
csc201a s1006 Lee, Pamela f105 tuthf10 m110 c
hst205a s1001 Smith, Tom f202 mwf11 h221  

Class Table

fd:

course#, stuid ® stuname, facid, sched, room, grade

course#, stuid - composite key

fd:

course# ® facid, sched, room

stuid ® stuname

Stuname is fd on course#, stuid, it is also fd of stuid


Second Normal Form (2NF)

Definition : A relation is in Second Normal Form (2NF) if and only if it is in first normal form and all the nonkey attributes are fully functionally dependent on the key.

class (course, stdid, stuname, facid, sched, room, grade )

Class Table

 course# stuid stuname facid sched room grade
art103a s1001 Smith, Tom f101 mwf9 h221 a
art103a s1010 Burns, Edward f101 mwf9 h221  
art103a s1006 Lee, Pamela f101 mwf9 h221 b
csc201a s1003 Jones, Marry f105 tuthf10 m110 a
csc201a s1006 Lee, Pamela f105 tuthf10 m110 c
hst205a s1001 Smith, Tom f202 mwf11 h221  

course # ® facid, sched, room

stuid ® stuname

course#, stuid ® grade ( and, of course, facid, sced, room, stuname ); 

Using projection we break up the class into:

class2(course#, stuid, grade)

couse(course#, facid, shed, room)

stu(stuid, stuname)


 course# stuid grade
art103a s1001 a
art103a s1010  
art103a s1006 b
csc201a s1003 a
csc201a s1006 c
hst205a s1001  

Class2 Table


stuid stuname
s1001 Smith, Tom
s1010 Burns, Edward
s1006 Lee, Pamela
s1003 Jones, Mary

Stu Table


 course# facid sched room
art103a f101 mwf9 h221
csc210a f105 tuthf10 m110
hst205a f202 mwf11 h221

Course Table


Transitive dependency

student (stuid, stuname, major, credits, status )

 stuid stuname major credits status
s1001 Smith, Tom history 90 senior
s1003 Jones, Mary math 95 senior
s1006 Lee, Pamela csc 15 freshman
s1010 Burns, Edward art 63 junior
s1060 Joenes, Mary csc 25 freshman

stuid ® credits

credits ® status

(stuid ® credits ) and (credits ® status) => (stuid ® status)


Third Normal Form (3NF)

Definition: A relation is in Third Normal Form (3NF) if it is in second normal form and no nonkey attribute is transitively dependent on the key.


Student Table

stuid stuname major credits status
s1001 Smith, Tom history 90 senior
s1003 Jones, Mary math 95 senior
s1006 Lee, Pamela csc 15 freshman
s1010 Burns, Edward art 63 junior
s1060 Joenes, Mary csc 25 freshman


Stu2 Table

stuid stuname major credits
s1001 Smith, Tom history 90
s1003 Jones, Mary math 95
s1006 Lee, Pamela csc 15
s1010 Burns, Edward art 63
s1060 Joenes, Mary csc 25


Stats Table

credits status
15 freshman
25 freshman
63 junior
90 senior
95 senior


Boyce-Codd Normal Form (BCNF)

Definition: A relation is in Boyce-Codd Normal Form (BCNF) in and only if every determinant is a candidate key.

student(stuid, stuname, major, credits, status)

determinents: stuid, credits

Credits is not a candidate key - the relation is not BCNF.

faculty (facname, dept, office, rank, datehired ) 

office ® dept

facname, dept ® office, rank, datehired

facname, office ® dept, rank, datehired

fac1 (dept, office)

fac2 (facname, office, rank, datehired) 

stu (stuname, stuadd, major, gpa)


The Normalization Process

Analysis

Synthesis


Databases - Physical Design

1. How are data to be actually stored on physical devices?

2. How are data to be accessed and/or modified?

Considerations:

The physical storage designer may or may not be able to choose the data item storage technique.

Specific options available will depend on dbms being used.

Data item storage techniques

Transactions

A single operation as viewed by user involves more than one task 

Eg. Update grade of student in relation

Tasks: 

...Such a program unit is called a transaction


Properties of a Transaction ('ACID' Tests)

Atomicity

Consistency

Isolation

Durability


Efficiency in transactions achieved by: 


Problems with DBMS Operations

Integrity

Security

Deadlock 

Recovery 

'Rollback'='Windback' process until its activities with database are sound

 


Practical Knowledge 

RDB Design


Example 

The Teachers Relation 

tutor course# room rsize en_limit
Smith 353 a532 45 40
 Smith 351 c320 100 60
Clark 355 h940 400 300
Turner 456 b278 50 45
Jameson 459 d110 50 45

 This is un-normalised because it allows a multiple set of values in column 'course#'

 Decompose to 1NF(ie atomic values only)

course# tutor room rsize en_limit
353 Smith a532 45 40
351 Smith c320 100 60
355 Clark h940 400 300
456 Turner b278 50 45
459 Jameson d110 50 45


However, There May be Problems: 

1. Insertion Anomaly

Room size cannot be entered in the database unless a course is scheduled in that room 

2. Deletion Anomaly 

Rsize of a room in which only one course is scheduled will be deleted if that course is deleted 

3. Update Anomaly 

There is nothing to prevent the same room appearing more than once in the database. Inconsistencies could arise between entries 'room' and 'rsize' if updating is careless 

Rsize is not fully functionally dependent on course#

(Dependent on room)


Decompose to 2NF

 course_details

course# tutor en_limit
353 Smith 40
351 Smith 60
355 Clark 300
456 Turner 45
459 Jameson 45

Room_details 

room rsize
a532 45
c320 100
h940 400
b278 50
d110 50

 There is a transitive dependence between room and en_limit>


Decompose 'course_details' to 3NF:

course# tutor
353 Smith
351 Smith
355 Clark
456 Turner
459 Jameson

course# en_limit
353 40
3517 60
355 300
456 45
459 45

Course

employee_id course_id fee
e130 c200 75
e200 c300 100
e250 c200 75
e425 c400 150
e500 c300 100
e575 c500 50

Redundancy 

Insertion anomaly 

Deletion anomaly 

Update anomaly 

Employee_course 

employee_id course_id
e130 c200
e200 c300
e250 c200
e425 c400
e500 c300
e575 c500

course_id fee
c200 75
c300 100
c400 150
c500 50

Referential integrity  So:

Exercises:

For the example from the previous time:

1. Using these assumptions and stating any others you need to make, list all the nontrivial functional dependencies for this relation.

2. What are the candidate keys for this relation? Identify the primary key.

3. Is the relation in Third Normal Form? If not, find a 3NF lossless join decomposition of college that preserves dependencies.

4. Is the relation or resulting set of relations in Boyce-Codd Normal Form? If not, find a lossless join decomposition that is in BCNF, if possible.

[Top][Learning Units]
[Previous][Next]