Skip to content

Instantly share code, notes, and snippets.

@zdxerr
Last active August 29, 2015 13:59
Show Gist options
  • Select an option

  • Save zdxerr/10717726 to your computer and use it in GitHub Desktop.

Select an option

Save zdxerr/10717726 to your computer and use it in GitHub Desktop.
Databases and Information Systems Lecture Notes

Databases and Information Systems

Query Optimization

Logical

  • Selectivity

      selectivity(Sc(R)) = |Sc(R)| / |R|
      selectivity(R1 Xc R2) = |R1 Xc R2| / (|R1| * |R2|)
    
  • Goal: Reduce intermediate query results

  • Query Tree

  • Query Rewirte

    1. cut of and push down selections
    2. combine selections and cartesian products to joins
    3. determine join-sequnce with smallest intermediate result
    4. push down and insert projections where possible
  • Reuse Common Sub-Expressions

Physical

  • Goal: Minimize # of disk accesses
  • Iterators
    • 1-Pass
      • Only works for complex operations if R fits into main memory
    • Multi-Pass
  • Nesting
    • read R1 once, R2 multiple times (R2 < R1)
    • Zig-Zag
    • Sort-Based
    • Index-Based
    • Hash-Based
      • Hash-Join
  • Execution Plan Optimization
    • Goal: Select best execution plan
    • Consider different logical and physical plans
    • Plan selection based on heuristic
    • Plan modification

Transaction Synchronization

  • Transaction: start - commit - abort

  • Goal: High degree of parallelism

  • Problems

    • Dirty Read (uncommitted dependency) A transaction reads data written by a concurrent uncommitted transaction.
    • Non-Repeatable Read A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read).
    • Lost Update Occurs when two or more transactions select the same row and then update the row based on the value originally selected.
    • Phantom Read A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
  • Scheduler has to prevent cyclic transaction dependencies (e.g. T1->T2->..->T1)

  • Serializability

    • Any concurent execution of a set of serializable transactions produces the same effect as running them one at a time in some order.
    • Only if dependency graph is acyclic.
  • 2-Phase Locking (2PL)

    • Phases
      1. Expanding phase: locks are acquired and no locks are released.
      2. Shrinking phase: locks are released and no locks are acquired.
    • Susceptible to deadlocks
      • Detection:
        • Cyclic wait graphs Find cycles in a graph constituting waiting transactions (Wait-for-graph).
      • Prevention:
        • Preclaiming Declare all needed resources at the start.
        • Global resource ordering Request resources in a predefined global order.
  • Parallel Validation

    • Maintain set of active transactions (in read but not yet completed write phase); do validation against all transactions in active set.
    • Phases
      1. Read and write to local copy
      2. Validate: Check whether other transactions have modified data that this transaction has used (read or written). This includes transactions that completed after this transaction's start time, and optionally, transactions that are still active at validation time.
      3. Write (optional)
  • Predicative Locking

    • Phantom Row did not exist at the beginning of the transaction.
    • Lock all rows affected by a certain query
  • Recovery Theory

    • A history is recoverable if each transactions is commited after all other transactions it has read from, have been commited.
    • Cascading abort is required if a transaction, from which other transactions have read, is aborted.
    • A schedule is strict if for any two transactions T1 and T2, if a write operation of T1 precedes a conflicting operation of T2 (either read or write), then the commit event of T1 also precedes that conflicting operation of T2.
    • Any strict schedule is cascadeless, but not the converse. Strictness allows efficient recovery of databases from failure.

Recovery

  • Error Classification
    • R1: Error in transaction (not yet commited)
    • R2: Preserve commited transaction on memory loss
    • R3: Withdraw uncommited transaction on memory loss
    • R4: Loss of storage (e.g. harddisk failure)
  • Write modifications from buffer to storage
    • Replace Buffer Pages: steal/nosteal
    • Write modifications of commited transactions: force/noforce
    • Strategies
      • Update in place
      • Twin-Block-Approach
      • Shadow storage
  • Algorithms for Recovery and Isolation Exploiting Semantics
  • Logging
    • Entry: [LogSequenceNumber, TransactionID, PageID, Redo, Undo, PreviousLogSequenceNumer]
    • Types
      • Physical Logging: Save before and after image
      • Logical Logging: Log only operations and state changes (undo, redo)
    • Store LogSequenceNumber with each page, the operation has been performed on, to distingush if storage and buffer are in sync.
    • Write-Ahead-Log Prinzipel
      1. All log records have to be written (to disk) before the commit.
      2. Before a modified page can be swapped all log record have to be written (to disk or temporary).
  • Phases
    1. Analysis
      • Calculation of winners (transactions that have to be restored)
      • And loosers (transaction that have not been commited yet)
    2. Redo of winners
    3. Undo of loosers
  • Recovery during recovery
    • Repeated system failure is not unlikely
    • Extend log during recovery
  • Checkpoints
    • Recovery from state of checkpoint

Atomic Commits in Mobile Networks

  • Atomicity
    • Atomic Commit Protocol leads to a unique decision, where all processes have to come to the same decission (commit or abort).
  • 2-Phase Commit
    1. Voting phase: Request and collect votes from all participant.
    2. Commit phase: Decide based on all votes (and notify participants about the decission). Timeout leads to abort.
  • 3-Phase Commit (non-blocking)
    1. Request and collect votes.
    2. Decide and notify participants.
    3. Wait for acknoledgement of all participants to commit.
  • Multiple-Coordinators (Paxos Consensus)
    • Additional consensus protocol between multiple coordinators.
    • Avoid blocking if the network is partitioned.
  • Cross Layer Commit Protocol (CLCP)
    • Commit descission based on knowledge of majorities
    • Matrix of knowledge (votes known/votes made)
    • everyone broadcast complete matrix
    • timeouts are propagated and acknoledged

Text Compression

  • Properties of compression approaches

    • Streamability
    • Partial decompression
    • Substring search
    • Update
  • Huffman

    • Frequency based encoding
    • Compute Huffman table
      1. Compute frequency table
      2. Compute Huffman Tree
    • Adaptive
      • Tree is computed on the fly
    • Alphabetic
      • Sortable
  • Prediction by Partial Matching (PPM)

    • consider context-sensitive frequencies
  • Fibonacci Code

    • Each character is represented by a fibbonacci number.

    • Each code word ends with "11" and contains no other instances of "11" before the end.

    • Example:

        12 = F(2) + F(4) + F(6) ^= 101011
        11 = F(4) + F(6)        ^= 100011
      
  • Arithmetic Encoding

    • Represent string by a unique interval [0, 1).
    • Adaptive (on-the-fly)
  • Wavelet

    • Tree
      • Organize encoded string as binary tree
    • Trie
  • LZ77

    Example for ANANAS:

       01234567
      |________|____|
      |        |ANAN| → (0,0,A)
      |       A|NANA| → (0,0,N)
      |      AN|ANAS| → (6,2,A)
      |   ANANA|S   | → (0,0,S)
      |  ANANAS|    | → (-,-,-) done
    
    • LZW

    Example for ABCABCDBC:

      A ->   001
      B ->   010
      C ->   011
      D ->   100
      AB ->  101
      BC ->  110
      ABC -> 111
      ABC -> 111
      CD -> 1000
    
      001|010|011|101|011|100|0110
    
  • Gammar Based

    • Sequitur
      • Combine symbols from left to right to new rules, while ensuring digram uniqueness.
      • Encode triple (, , ) on second ocurance of a rule and as referenece on next occurences.
    • String Repair
      • Build an index of digrams and replace most frequent digram with a new grammar rule until no more rules can be combined.
      • Replace non-terminals that are used just once by their right hand side.
  • csvzip

Keyword Search

  • XML
  • Lowest Common Ancestor (LCA)
    • Shortest (SLCA): Node n is SLCA if n is LCA and no descendant of n is LCA
    • Exclusive (ELCA): Node n is ELCA if n is still LCA after all subtrees corresponding to an LCA are removed.
  • Dewey Schema
  • XRank ELCA
  • Multiway SLCA
  • Intersection Based SLCA/ELCA
    • Index list contains node ID iff or any node descendant corresponds to a keyword.
    • Intersection of index lists contains all CAs.
    • All CAs that have no child CA are SLCAs.
    • All CAs that have a number of descendant greáter than the sum of child descendant.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Der Lattenzaun
Es war einmal ein Lattenzaun,
mit Zwischenraum, hindurchzuschaun.
Ein Architekt, der dieses sah,
stand eines Abends plötzlich da -
und nahm den Zwischenraum heraus
und baute draus ein großes Haus.
Der Zaun indessen stand ganz dumm,
mit Latten ohne was herum,
Ein Anblick gräßlich und gemein.
Drum zog ihn der Senat auch ein.
Der Architekt jedoch entfloh
nach Afri- od- Ameriko.
(Christian Morgenstern)
View raw

(Sorry about that, but we can’t show files that are this big right now.)

View raw

(Sorry about that, but we can’t show files that are this big right now.)

View raw

(Sorry about that, but we can’t show files that are this big right now.)

Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
View raw

(Sorry about that, but we can’t show files that are this big right now.)

Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
View raw

(Sorry about that, but we can’t show files that are this big right now.)

View raw

(Sorry about that, but we can’t show files that are this big right now.)

Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment