+1(978)310-4246 credencewriters@gmail.com
Select Page

COP 4710
Database Management
Instructor: Lenis Hernandez
Relational Algebra
Topics covered:
â€¢ Relational Algebra
â€¢ Unary Relational Operations
â€¢ Relational Algebra Operations From Set Theory
â€¢ Binary Relational Operations
â€¢ Examples of Queries in Relational Algebra
Relational Algebra Overview
â€¢ Relational algebra is the basic set of operations for the relational
model
â€¢ These operations enable a user to specify basic retrieval requests (or
queries)
â€¢ The result of an operation is a new relation, which may have been
formed from one or more input relations
â€¢ This property makes the algebra â€œclosedâ€ (all objects in relational algebra are
relations)
Relational Algebra Overview (continued)
â€¢ The algebra operations thus produce new relations
â€¢ These can be further manipulated using operations of the same algebra
â€¢ A sequence of relational algebra operations forms a relational
algebra expression
â€¢ The result of a relational algebra expression is also a relation that
represents the result of a database query (or retrieval request)
Brief History of Origins of Algebra
â€¢ Muhammad ibn Musa al-Khwarizmi (800-847 CE) â€“ from Morocco wrote a book
titled al-jabr about arithmetic of variables
â€¢ Book was translated into Latin.
â€¢ Its title (al-jabr) gave Algebra its name.
â€¢ Al-Khwarizmi called variables â€œshayâ€
â€¢ â€œShayâ€ is Arabic for â€œthingâ€.
â€¢ Spanish transliterated â€œshayâ€ as â€œxayâ€ (â€œxâ€ was â€œshâ€ in Spain).
â€¢ In time this word was abbreviated as x.
â€¢ Where does the word Algorithm come from?
â€¢ Algorithm originates from â€œal-Khwarizmi”
â€¢ Reference: PBS (http://www.pbs.org/empires/islam/innoalgebra.html)
Relational Algebra Overview
â€¢ Relational Algebra consists of several groups of operations
â€¢ Unary Relational Operations
â€¢ SELECT (symbol: ï³ (sigma))
â€¢ PROJECT (symbol: ï° (pi))
â€¢ RENAME (symbol: ï² (rho))
â€¢ Relational Algebra Operations From Set Theory
â€¢ UNION ( ïƒˆ ), INTERSECTION ( ïƒ‡ ), DIFFERENCE (or MINUS, â€“ )
â€¢ CARTESIAN PRODUCT ( x )
â€¢ Binary Relational Operations
â€¢ JOIN (several variations of JOIN exist)
â€¢ DIVISION
â€¢ OUTER JOINS, OUTER UNION
â€¢ AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT,
AVG, MIN, MAX)
Database State for COMPANY
â€¢ All examples discussed below refer to the COMPANY database shown here.
Unary Relational Operations: SELECT
â€¢ The SELECT operation (denoted by ï³ (sigma)) is used to select a subset of the tuples from a
relation based on a selection condition.
â€¢ The selection condition acts as a filter
â€¢ Keeps only those tuples that satisfy the qualifying condition
â€¢ Tuples satisfying the condition are selected whereas the other tuples are discarded
(filtered out)
â€¢ Examples:
â€¢ Select the EMPLOYEE tuples whose department number is 4:
ï³ DNO = 4 (EMPLOYEE)
â€¢ Select the employee tuples whose salary is greater than \$30,000:
ï³ SALARY > 30,000 (EMPLOYEE)
Unary Relational Operations: SELECT
â€¢ In general, the select operation is denoted by ï³ (R)
where
â€¢ the symbol ï³ (sigma) is used to denote the select operator
â€¢ the selection condition is a Boolean (conditional) expression specified on the attributes
of relation R
â€¢ tuples that make the condition true are selected
â€¢ appear in the result of the operation
â€¢ tuples that make the condition false are filtered out
â€¢ discarded from the result of the operation
Unary Relational Operations: SELECT (continued)
â€¢ SELECT Operation Properties
â€¢ The SELECT operation ï³ (R) produces a relation S that has the same
schema (same attributes) as R
â€¢ SELECT ï³ is commutative:
â€¢ ï³ (ï³ < condition2> (R)) = ï³ (ï³ < condition1> (R))
â€¢ Because of commutativity property, a cascade (sequence) of SELECT operations may be
applied in any order:
â€¢ ï³(ï³ (ï³ (R)) = ï³ (ï³ (ï³ ( R)))
â€¢ A cascade of SELECT operations may be replaced by a single selection with a conjunction
of all the conditions:
â€¢ ï³(ï³< cond2> (ï³(R)) = ï³ AND < cond2> AND < cond3>(R)))
â€¢ The number of tuples in the result of a SELECT is less than (or equal to) the number
of tuples in the input relation R
The following query results refer to this database state
Unary Relational Operations: PROJECT
â€¢ PROJECT Operation is denoted by ï° (pi)
â€¢ This operation keeps certain columns (attributes) from a relation and
â€¢ PROJECT creates a vertical partitioning
â€¢ The list of specified columns (attributes) is kept in each tuple
â€¢ The other attributes in each tuple are discarded
â€¢ Example: To list each employeeâ€™s first and last name and salary, the
following is used:
ï°LNAME, FNAME,SALARY(EMPLOYEE)
Unary Relational Operations: PROJECT (cont.)
â€¢ The general form of the project operation is:
ï°(R)
â€¢ ï° (pi) is the symbol used to represent the project operation
â€¢ is the desired list of attributes from relation R.
â€¢ The project operation removes any duplicate tuples
â€¢ This is because the result of the project operation must be a set of tuples
â€¢ Mathematical sets do not allow duplicate elements.
Unary Relational Operations: PROJECT (contd.)
â€¢ PROJECT Operation Properties
â€¢ The number of tuples in the result of projection ï°(R) is always less or
equal to the number of tuples in R
â€¢ If the list of attributes includes a key of R, then the number of tuples in the result of
PROJECT is equal to the number of tuples in R
â€¢ PROJECT is not commutative
â€¢ ï° (ï° (R) ) = ï° (R) as long as contains the attributes in
Examples of applying SELECT and PROJECT operations
Relational Algebra Expressions
â€¢ We may want to apply several relational algebra operations one after
the other
â€¢ Either we can write the operations as a single relational algebra expression
by nesting the operations, or
â€¢ We can apply one operation at a time and create intermediate result
relations.
â€¢ In the latter case, we must give names to the relations that hold
the intermediate results.
Single expression versus sequence of relational operations
(Example)
â€¢ To retrieve the first name, last name, and salary of all employees who work in
department number 5, we must apply a select and a project operation
â€¢ We can write a single relational algebra expression as follows:
â€¢ ï°FNAME, LNAME, SALARY(ï³ DNO=5(EMPLOYEE))
â€¢ OR We can explicitly show the sequence of operations, giving a name to each
intermediate relation:
â€¢ DEP5_EMPS ï‚¬ ï³ DNO=5(EMPLOYEE)
â€¢ RESULT ï‚¬ ï° FNAME, LNAME, SALARY (DEP5_EMPS)
Unary Relational Operations: RENAME
â€¢ The RENAME operator is denoted by ï² (rho)
â€¢ In some cases, we may want to rename the attributes of a relation or
the relation name or both
â€¢ Useful when a query requires multiple operations
â€¢ Necessary in some cases (see JOIN operation later)
Unary Relational Operations: RENAME (continued)
â€¢ The general RENAME operation ï² can be expressed by any of the
following forms:
â€¢ ï²S (B1, B2, â€¦, Bn )(R) changes both:
â€¢ the relation name to S, and
â€¢ the column (attribute) names to B1, B1, â€¦..Bn
â€¢ ï²S(R) changes:
â€¢ the relation name only to S
â€¢ ï²(B1, B2, â€¦, Bn )(R) changes:
â€¢ the column (attribute) names only to B1, B1, â€¦..Bn
Unary Relational Operations: RENAME (continued)
â€¢ For convenience, we also use a shorthand for renaming attributes in
an intermediate relation:
â€¢ If we write:
â€¢ RESULT ï‚¬ ï° FNAME, LNAME, SALARY (DEP5_EMPS)
â€¢ RESULT will have the same attribute names as DEP5_EMPS (same
attributes as EMPLOYEE)
â€¢ If we write:
â€¢ RESULT (F, M, L, S, B, A, SX, SAL, SU, DNO)ï‚¬
ï² RESULT (F.M.L.S.B,A,SX,SAL,SU,
DNO)(DEP5_EMPS)
â€¢ The 10 attributes of DEP5_EMPS are renamed to F, M, L, S, B, A, SX,
SAL, SU, DNO, respectively
Note: the ï‚¬ symbol is an assignment operator
Example of applying multiple operations and RENAME
Relational Algebra Operations from Set Theory: UNION
â€¢ UNION Operation
â€¢ Binary operation, denoted by ïƒˆ
â€¢ The result of R ïƒˆ S, is a relation that includes all tuples that are either in R or
in S or in both R and S
â€¢ Duplicate tuples are eliminated
â€¢ The two operand relations R and S must be â€œtype compatibleâ€ (or UNION
compatible)
â€¢ R and S must have same number of attributes
â€¢ Each pair of corresponding attributes must be type compatible (have same or
compatible domains)
Relational Algebra Operations from Set Theory: UNION
â€¢ Example:
â€¢ To retrieve the social security numbers of all employees who either work in department
5 (RESULT1 below) or directly supervise an employee who works in department 5
(RESULT2 below)
â€¢ We can use the UNION operation as follows:
DEP5_EMPS ï‚¬ ï³DNO=5 (EMPLOYEE)
RESULT1 ï‚¬ ï° SSN(DEP5_EMPS)
RESULT ï‚¬ RESULT1 ïƒˆ RESULT2
â€¢ The union operation produces the tuples that are in either RESULT1 or RESULT2 or both
Figure 8.3 Result of the UNION
operation RESULT â† RESULT1 âˆª RESULT2.
Relational Algebra Operations from Set Theory
â€¢ Type Compatibility of operands is required for the binary set operation UNION ïƒˆ,
(also for INTERSECTION ïƒ‡, and SET DIFFERENCE â€“, see next slides)
â€¢ R1(A1, A2, …, An) and R2(B1, B2, …, Bn) are type compatible if:
â€¢ they have the same number of attributes, and
â€¢ the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi)
for i=1, 2, …, n).
â€¢ The resulting relation for R1ïƒˆR2 (also for R1ïƒ‡R2, or R1â€“R2, see next slides) has
the same attribute names as the first operand relation R1 (by convention)
Relational Algebra Operations from Set Theory: INTERSECTION
â€¢ INTERSECTION is denoted by ïƒ‡
â€¢ The result of the operation R ïƒ‡ S, is a relation
that includes all tuples that are in both R and S
â€¢ The attribute names in the result will be the
same as the attribute names in R
â€¢ The two operand relations R and S must be
â€œtype compatibleâ€
Relational Algebra Operations from Set Theory: SET DIFFERENCE
â€¢ SET DIFFERENCE (also called MINUS or EXCEPT) is
denoted by â€“
â€¢ The result of R â€“ S, is a relation that includes all tuples
that are in R but not in S
â€¢ The attribute names in the result will be the
same as the attribute names in R
â€¢ The two operand relations R and S must be
â€œtype compatibleâ€
Example to illustrate the result of UNION, INTERSECT, and
DIFFERENCE
Some properties of UNION, INTERSECT, and DIFFERENCE
â€¢ Notice that both union and intersection are commutative operations; that is
â€¢ R ïƒˆ S = S ïƒˆ R, and R ïƒ‡ S = S ïƒ‡ R
â€¢ Both union and intersection can be treated as n-ary operations applicable to any
number of relations as both are associative operations; that is
â€¢ R ïƒˆ (S ïƒˆ T) = (R ïƒˆ S) ïƒˆ T
â€¢ (R ïƒ‡ S) ïƒ‡ T = R ïƒ‡ (S ïƒ‡ T)
â€¢ The minus operation is not commutative; that is, in general
â€¢ Râ€“Sâ‰ Sâ€“R
Relational Algebra Operations from Set Theory: CARTESIAN
PRODUCT
â€¢ CARTESIAN (or CROSS) PRODUCT Operation
â€¢ This operation is used to combine tuples from two relations in a combinatorial
fashion.
â€¢ Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)
â€¢ Result is a relation Q with degree n + m attributes:
â€¢ Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
â€¢ The resulting relation state has one tuple for each combination of tuplesâ€”one from
R and one from S.
â€¢ Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS tuples, then R x S will
have nR * nS tuples.
â€¢ The two operands do NOT have to be “type compatibleâ€
Relational Algebra Operations from Set Theory: CARTESIAN
PRODUCT (cont.)
â€¢ Generally, CROSS PRODUCT is not a meaningful operation
â€¢ Can become meaningful when followed by other operations
â€¢ Example (not meaningful):
â€¢ FEMALE_EMPS ï‚¬ ï³ SEX=â€™Fâ€™(EMPLOYEE)
â€¢ EMPNAMES ï‚¬ ï° FNAME, LNAME, SSN (FEMALE_EMPS)
â€¢ EMP_DEPENDENTS ï‚¬ EMPNAMES x DEPENDENT
â€¢ EMP_DEPENDENTS will contain every combination of EMPNAMES and
DEPENDENT
â€¢ whether or not they are actually related
Relational Algebra Operations from Set Theory: CARTESIAN
PRODUCT (cont.)
â€¢ To keep only combinations where the DEPENDENT is related to the
EMPLOYEE, we add a SELECT operation as follows
â€¢ Example (meaningful):
â€¢ FEMALE_EMPS ï‚¬ ï³ SEX=â€™Fâ€™(EMPLOYEE)
â€¢ EMPNAMES ï‚¬ ï° FNAME, LNAME, SSN (FEMALE_EMPS)
â€¢ EMP_DEPENDENTS ï‚¬ EMPNAMES x DEPENDENT
â€¢ ACTUAL_DEPS ï‚¬ ï³ SSN=ESSN(EMP_DEPENDENTS)
â€¢ RESULT ï‚¬ ï° FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPS)
â€¢ RESULT will now contain the name of female employees and their dependents
Figure 8.5 The CARTESIAN PRODUCT (CROSS
PRODUCT) operation.
continued on next slide
Figure 8.5 (continued)
.
The CARTESIAN PRODUCT (CROSS
PRODUCT) operation
continued on next slide
Figure 8.5 (continued)
PRODUCT) operation.
The CARTESIAN PRODUCT (CROSS
Binary Relational Operations: JOIN
â€¢ JOIN Operation (denoted by
)
â€¢ The sequence of CARTESIAN PRODECT followed by SELECT is used quite commonly to
identify and select related tuples from two relations
â€¢ A special operation, called JOIN combines this sequence into a single operation
â€¢ This operation is very important for any relational database with more than a single
relation, because it allows us combine related tuples from various relations
â€¢ The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2,
. . ., Bm) is:
R S
â€¢ where R and S can be any relations that result from general relational algebra
expressions.
Binary Relational Operations: JOIN (cont.)
â€¢ Example: Suppose that we want to retrieve the name of the
manager of each department.
â€¢ To get the managerâ€™s name, we need to combine each
DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value
matches the MGRSSN value in the department tuple.
â€¢ We do this by using the join
operation.
â€¢ DEPT_MGR ï‚¬ DEPARTMENT MGRSSN=SSN EMPLOYEE
â€¢ MGRSSN=SSN is the join condition
â€¢ Combines each department record with the employee who
manages the department
â€¢ The join condition can also be specified as DEPARTMENT.MGRSSN=
EMPLOYEE.SSN
Figure 8.6 Result of the JOIN operation
DEPT_MGR â† DEPARTMENT|X| Mgr_ssn=SsnEMPLOYEE.
Some properties of JOIN
â€¢ Consider the following JOIN operation:
â€¢ R(A1, A2, . . ., An)
S(B1, B2, . . ., Bm)
R.Ai=S.Bj
â€¢ Result is a relation Q with degree n + m attributes:
â€¢ Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
â€¢ The resulting relation state has one tuple for each combination of tuplesâ€”r from R
and s from S, but only if they satisfy the join condition r[Ai]=s[Bj]
â€¢ Hence, if R has nR tuples, and S has nS tuples, then the join result will generally have
less than nR * nS tuples.
â€¢ Only related tuples (based on the join condition) will appear in the result
Some properties of JOIN
â€¢ The general case of JOIN operation is called a Theta-join: R
S
theta
â€¢ The join condition is called theta
â€¢ Theta can be any general boolean expression on the attributes of R
and S; for example:
â€¢ R.Ai