Lecture 7
Query Optimization
Interpretation and Optimization of Queries. Algebraic Techniques for Query Transformation The Initial Relational Algebra Expression and Query Tree


Theoretical Knowledge
    Evaluating Conjunctive Conditions
    Heuristics For Query Optimization
Practical Knowledge
    Problems


Problem: Find the schedules and rooms of all courses taken by any mathematics major

SQL:

select shed, room

from class, enrolment, student

where major = 'math' and

Relational Algebra:

join student, enrolment giving temp1

join temp1, class giving temp2

select temp2 where major = 'math' giving temp3

project temp3 over shed, room

p sched, room (s major = 'math' (( student * enrolment) * class)) 

Relational Algebra:

select student where major = 'math' giving temp1

join temp1, enrolment giving temp2

joint temp2, class giving temp3

project temp3 over shed, room

p sched, room (((ss major = 'math' (student)) * enrolment) * class)

Evaluating Conjunctive Conditions

Problem: Find the schedules and rooms of courses taken by those math majors who have more than 100 credits.

select student where major = 'math' and credits > 100 giving t1

join t1, enrolment giving t2

join t2, class giving t3

project t3 over sched, room

p sched, room (((s major = 'math" and credits >100 (student)) * enrolment) * class)

Problem: Find the schedules and rooms for all mathematics majors who have received a grade of f.

join student, enrolment giving t1

select t1 where major = 'math' and grade = 'f' giving t2

join t2, class giving t3

project t3 over sched, room

RA:

p sched, room ((s major = 'math" and grade = 'f' (student * enrolment)) * class)

  

join student, enrolment giving t1

select t1 where grade = 'f' giving t2

select t2 where major = 'math' giving t3

 

s major = 'math' (s grade ='f' (student * enrolment))

  

select student where major = 'math' giving t1

select enrolment where grade = 'f' giving t2

join t1, t2 giving t3

join t3, class giving t4

project t4 over sched, room

p sched, room (((s major = 'math" (student)) * (s grade = 'f' (enrolment))) * class)

Heuristics For Query Optimization

1. Do selection as early as possible. If appropriate, use cascading of selects, commutativity of selects with projects and products, and distributivity of select over set union, set intersection, and set difference to move the selection as far down the query tree as possible.

2. Do projection early. If appropriate, use cascading of projects, distributivity of projection over product and set union, intersection, and difference, and commutativity of selection and projection to move the projection as far down the query tree as possible. Examine all projections to see if some are unnecessary.

3. If a product appears as an argument for a selection, where the selection involves attributes of the tables in the product, transform the product into a join. If the selection involves attributes of only one of the tables in the product, apply the selection to that table first.

4. If there is a sequence of selections and/or projections, use commutativity or cascading to combine them into one selection, one projection, or a selection followed by a projection and apply them simultaneously to each tuple as it is retrieved. If a join or product is to be followed by a sequence of selections or projections, apply them to each tuple of the join or product as soon as it is constructed.

5. Use associativity of join, product, union, and intersection to rearrange relations in the query tree so that the selection operation that will produce the smallest table will be executed first.

6. If a subexpression appears more than once in the query tree, and the result it produces is not too large, compute it once and save it. This is especially useful when querying on views, since the same subexpression must be used to construct the view each time.

 


Practical Knowledge

Problem 1. Consider a car service database scheme

customer (name, address, city, customer_number)

vehicle (model, make, car_plate)

service (customer_number, car_plate, date)

And a relational algebra expression of the form

p model (s date = 1/11/92 (p model, make, car_plate, name, address, city, customer_number, date (pcustomer.customer_number =service.customer_number and vehicle.car_plate = service.car_plate (customer x vehicle x service))))

 

Draw the query tree. Then show the optimization process for this tree by drawing intermediate query trees. (You may combine steps into pairs to save on drawing; however with each optimization step you must explain exactly what you did and why.)

[Top][Learning Units]
[Previous][Next]