Cumulative, weighted heavier towards content after midterm. Basic ideas and SQL queries from before midterm. Relational algebra will not be asked, but relational algebra is required for query optimization.
- Chapter 7: lecture notes on logical design of relational DB
- Chapter 8: (same) - functional dependencies
- Chapters 11, 12, 13: query processing
- Chapters 14, 15: query optimization
- Database Design (hw 4,5) - Chapter 7, lecture notes - Chapter 8
- Storage and Query Processing
- Transaction Management (hw 6)
- XML (Chapter 23)
(parts continue from midterm?)
Chapter 7
The design process
- requirement specification
- conceptual design => E-R model => conceptual schema
- conceptual requirements
- final stage: logical design, logical schema, physical design
- Redundancy
- incompleteness
Main constructs in E-R drawing:
Entity an individual object - Entity set a set of objects sharing common attributes - Strong entity - Weak entity
Relationship - instance an association among two or more entity instances - set a set of relationship instances => math relation over a collection
of entities
Attributes description or characterization of an entity
Domain a set of data values of th esame type from which attribues draw their values
Mapping cardinality constraints for binary relationships
Drawing
- Entity sets : rectangles
- Attributes : ellipses, or in rectangle of entity
- Relationships : diamonds
Functional dependencies
- reduce redundancy
- Achieve representation power
- Avoid loss of information
- functional dependency: x->y
- trivial functional dependency
-
id, name -> id
is redundant -id, name -> id, name, dept
is not redundant - 1NF, 2NF, 3NF, BCNF
- candidate key
- non-key attribute
- Armstrong's Axioms
- logical implication of functional dependencies
- calculating attribute closure
- minimal cover
- lossless join decomposition
- if
R1 x R2 x ... x Rn = R
- Computing attribute closure
- from F, compute minimal cover of F
- One of - from R and F, compute 3NF decomposition (LLJ and FD-preserving) - from R and F, compute BCNF decompositoin (LLJ, maybe FD-preserving)
For any set F of functional dependencies, Fmin exists
Every relation R has an LLJ and fd-preserving 3NF decomposition
Every relation R has an LLJ BCNF decomposition
Chapter 14 (skip sections 8-10), 15 (mostly 15.1, 15.2), hw6
- Transaction
- ACID properties
- Scheduling - serial schedule - concurrent schedule
- Serializable schedule
- Conflict equivalency / conflict serializability
- Precedence graph
- Locking (exclusive and shared)
- lock compatibility matrix
- 2 stage locking protocol
- wait-for graph
- Add lock/unlock statements according to two-phase locking protocol
- use wait-for graphs for deadlock detection
Two phase locking protocol assures conflict serializability, but not freedom from deadlock
Chapter 11, mostly sections 1-4, 6
- Indexing (introducing index file to speed up datat retrieval)
- Contents of index file (primary index, secondary index, search key, pointer)
- Types of index (ordered, hash)
- B+ tree (balanced)
- Advantages: speeds up retrieval
- Disadvantages: - Time overhead for additional updates to data file - Space overhead for index file(s) storage
- Hashing - hash function (h) - search key field (k) - h(k) = id of the bucket - h is uniform random, easy to compute - provides fast direct access to data - bucket overflow (handled by pointer chains)
Chapter 12 (sections 1-3). Not many questions
- overall process of query processing
- measures of query plan costs
- alternative ways of evaluating one relational algebra operation
Chapter 13 (mostly section 1, 2)
- efficient query plan
- Relational Algebra expression transformation
Chapter 23
Given table data (table rows), produce corresponding XML file
Given XML, draw table data