Skip to content

Instantly share code, notes, and snippets.

@matthewjwolff
Created December 1, 2016 19:21
Show Gist options
  • Save matthewjwolff/c910e8f14c05c12d41df937136f739c3 to your computer and use it in GitHub Desktop.
Save matthewjwolff/c910e8f14c05c12d41df937136f739c3 to your computer and use it in GitHub Desktop.
CSC 4402 Final Review

Final Exam

Cumulative, weighted heavier towards content after midterm. Basic ideas and SQL queries from before midterm. Relational algebra will not be asked, but relational algebra is required for query optimization.

Summary of contents covered after midterm

  • Chapter 7: lecture notes on logical design of relational DB
  • Chapter 8: (same) - functional dependencies
  • Chapters 11, 12, 13: query processing
  • Chapters 14, 15: query optimization

Sections

  • Database Design (hw 4,5) - Chapter 7, lecture notes - Chapter 8
  • Storage and Query Processing
  • Transaction Management (hw 6)
  • XML (Chapter 23)

(parts continue from midterm?)

Part 4: E-R Modeling

Chapter 7

The design process

  • requirement specification
  • conceptual design => E-R model => conceptual schema
  • conceptual requirements
  • final stage: logical design, logical schema, physical design

Major pitfalls in design

  • Redundancy
  • incompleteness

Drawing an E-R Diagram

Main constructs in E-R drawing:

  • Entity an individual object - Entity set a set of objects sharing common attributes - Strong entity - Weak entity

  • Relationship - instance an association among two or more entity instances - set a set of relationship instances => math relation over a collection

    of entities

  • Attributes description or characterization of an entity

  • Domain a set of data values of th esame type from which attribues draw their values

  • Mapping cardinality constraints for binary relationships

Drawing

  • Entity sets : rectangles
  • Attributes : ellipses, or in rectangle of entity
  • Relationships : diamonds

Logical Design of Relational databases

Functional dependencies

Desirable properties of a good design

  • reduce redundancy
  • Achieve representation power
  • Avoid loss of information

Normalization using the theory of functional dependencies

  • functional dependency: x->y
  • trivial functional dependency - id, name -> id is redundant - id, name -> id, name, dept is not redundant
  • 1NF, 2NF, 3NF, BCNF
  • candidate key
  • non-key attribute
  • Armstrong's Axioms
  • logical implication of functional dependencies
  • calculating attribute closure
  • minimal cover
  • lossless join decomposition - if R1 x R2 x ... x Rn = R

Algorithms

  • Computing attribute closure
  • from F, compute minimal cover of F
  • One of - from R and F, compute 3NF decomposition (LLJ and FD-preserving) - from R and F, compute BCNF decompositoin (LLJ, maybe FD-preserving)

Theorems

For any set F of functional dependencies, Fmin exists

Every relation R has an LLJ and fd-preserving 3NF decomposition

Every relation R has an LLJ BCNF decomposition

Part 6: Transaction processing

Chapter 14 (skip sections 8-10), 15 (mostly 15.1, 15.2), hw6

Concepts

  • Transaction
  • ACID properties
  • Scheduling - serial schedule - concurrent schedule
  • Serializable schedule
  • Conflict equivalency / conflict serializability
  • Precedence graph

Chapter 15 concepts

  • Locking (exclusive and shared)
  • lock compatibility matrix
  • 2 stage locking protocol
  • wait-for graph

Methods

  • Add lock/unlock statements according to two-phase locking protocol
  • use wait-for graphs for deadlock detection

Theorems

Two phase locking protocol assures conflict serializability, but not freedom from deadlock

Part 5: Storage and Query Processing

Chapter 11, mostly sections 1-4, 6

Concepts

  • Indexing (introducing index file to speed up datat retrieval)
  • Contents of index file (primary index, secondary index, search key, pointer)
  • Types of index (ordered, hash)
  • B+ tree (balanced)
  • Advantages: speeds up retrieval
  • Disadvantages: - Time overhead for additional updates to data file - Space overhead for index file(s) storage
  • Hashing - hash function (h) - search key field (k) - h(k) = id of the bucket - h is uniform random, easy to compute - provides fast direct access to data - bucket overflow (handled by pointer chains)

Query Processing

Chapter 12 (sections 1-3). Not many questions

  • overall process of query processing
  • measures of query plan costs
  • alternative ways of evaluating one relational algebra operation

Query optimization

Chapter 13 (mostly section 1, 2)

  • efficient query plan
  • Relational Algebra expression transformation

XML

Chapter 23

Given table data (table rows), produce corresponding XML file

Given XML, draw table data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment