-
-
Save zdxerr/10717726 to your computer and use it in GitHub Desktop.
-
Selectivity
selectivity(Sc(R)) = |Sc(R)| / |R| selectivity(R1 Xc R2) = |R1 Xc R2| / (|R1| * |R2|) -
Goal: Reduce intermediate query results
-
Query Tree
-
Query Rewirte
- cut of and push down selections
- combine selections and cartesian products to joins
- determine join-sequnce with smallest intermediate result
- push down and insert projections where possible
-
Reuse Common Sub-Expressions
- Goal: Minimize # of disk accesses
- Iterators
- 1-Pass
- Only works for complex operations if R fits into main memory
- Multi-Pass
- 1-Pass
- Nesting
- read R1 once, R2 multiple times (R2 < R1)
- Zig-Zag
- Sort-Based
- Index-Based
- Hash-Based
- Hash-Join
- Execution Plan Optimization
- Goal: Select best execution plan
- Consider different logical and physical plans
- Plan selection based on heuristic
- Plan modification
-
Transaction:
start-commit-abort -
Goal: High degree of parallelism
-
Problems
- Dirty Read (uncommitted dependency) A transaction reads data written by a concurrent uncommitted transaction.
- Non-Repeatable Read A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read).
- Lost Update Occurs when two or more transactions select the same row and then update the row based on the value originally selected.
- Phantom Read A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
-
Scheduler has to prevent cyclic transaction dependencies (e.g. T1->T2->..->T1)
-
Serializability
- Any concurent execution of a set of serializable transactions produces the same effect as running them one at a time in some order.
- Only if dependency graph is acyclic.
-
2-Phase Locking (2PL)
- Phases
- Expanding phase: locks are acquired and no locks are released.
- Shrinking phase: locks are released and no locks are acquired.
- Susceptible to deadlocks
- Detection:
- Cyclic wait graphs Find cycles in a graph constituting waiting transactions (Wait-for-graph).
- Prevention:
- Preclaiming Declare all needed resources at the start.
- Global resource ordering Request resources in a predefined global order.
- Detection:
- Phases
-
Parallel Validation
- Maintain set of active transactions (in read but not yet completed write phase); do validation against all transactions in active set.
- Phases
- Read and write to local copy
- Validate: Check whether other transactions have modified data that this transaction has used (read or written). This includes transactions that completed after this transaction's start time, and optionally, transactions that are still active at validation time.
- Write (optional)
-
Predicative Locking
- Phantom Row did not exist at the beginning of the transaction.
- Lock all rows affected by a certain query
-
Recovery Theory
- A history is recoverable if each transactions is commited after all other transactions it has read from, have been commited.
- Cascading abort is required if a transaction, from which other transactions have read, is aborted.
- A schedule is strict if for any two transactions
T1andT2, if a write operation of T1 precedes a conflicting operation ofT2(either read or write), then the commit event ofT1also precedes that conflicting operation ofT2. - Any strict schedule is cascadeless, but not the converse. Strictness allows efficient recovery of databases from failure.
- Error Classification
- R1: Error in transaction (not yet commited)
- R2: Preserve commited transaction on memory loss
- R3: Withdraw uncommited transaction on memory loss
- R4: Loss of storage (e.g. harddisk failure)
- Write modifications from buffer to storage
- Replace Buffer Pages: steal/nosteal
- Write modifications of commited transactions: force/noforce
- Strategies
- Update in place
- Twin-Block-Approach
- Shadow storage
- Algorithms for Recovery and Isolation Exploiting Semantics
- Logging
- Entry:
[LogSequenceNumber, TransactionID, PageID, Redo, Undo, PreviousLogSequenceNumer] - Types
- Physical Logging: Save before and after image
- Logical Logging: Log only operations and state changes (undo, redo)
- Store
LogSequenceNumberwith each page, the operation has been performed on, to distingush if storage and buffer are in sync. - Write-Ahead-Log Prinzipel
- All log records have to be written (to disk) before the commit.
- Before a modified page can be swapped all log record have to be written (to disk or temporary).
- Entry:
- Phases
- Analysis
- Calculation of winners (transactions that have to be restored)
- And loosers (transaction that have not been commited yet)
- Redo of winners
- Undo of loosers
- Analysis
- Recovery during recovery
- Repeated system failure is not unlikely
- Extend log during recovery
- Checkpoints
- Recovery from state of checkpoint
- Atomicity
- Atomic Commit Protocol leads to a unique decision, where all processes have to come to the same decission (commit or abort).
- 2-Phase Commit
- Voting phase: Request and collect votes from all participant.
- Commit phase: Decide based on all votes (and notify participants about the decission). Timeout leads to abort.
- 3-Phase Commit (non-blocking)
- Request and collect votes.
- Decide and notify participants.
- Wait for acknoledgement of all participants to commit.
- Multiple-Coordinators (Paxos Consensus)
- Additional consensus protocol between multiple coordinators.
- Avoid blocking if the network is partitioned.
- Cross Layer Commit Protocol (CLCP)
- Commit descission based on knowledge of majorities
- Matrix of knowledge (votes known/votes made)
- everyone broadcast complete matrix
- timeouts are propagated and acknoledged
-
Properties of compression approaches
- Streamability
- Partial decompression
- Substring search
- Update
-
Huffman
- Frequency based encoding
- Compute Huffman table
- Compute frequency table
- Compute Huffman Tree
- Adaptive
- Tree is computed on the fly
- Alphabetic
- Sortable
-
Prediction by Partial Matching (PPM)
- consider context-sensitive frequencies
-
Fibonacci Code
-
Each character is represented by a fibbonacci number.
-
Each code word ends with "11" and contains no other instances of "11" before the end.
-
Example:
12 = F(2) + F(4) + F(6) ^= 101011 11 = F(4) + F(6) ^= 100011
-
-
Arithmetic Encoding
- Represent string by a unique interval
[0, 1). - Adaptive (on-the-fly)
- Represent string by a unique interval
-
Wavelet
- Tree
- Organize encoded string as binary tree
- Trie
- Tree
-
LZ77
Example for
ANANAS:01234567 |________|____| | |ANAN| → (0,0,A) | A|NANA| → (0,0,N) | AN|ANAS| → (6,2,A) | ANANA|S | → (0,0,S) | ANANAS| | → (-,-,-) done- LZW
Example for
ABCABCDBC:A -> 001 B -> 010 C -> 011 D -> 100 AB -> 101 BC -> 110 ABC -> 111 ABC -> 111 CD -> 1000 001|010|011|101|011|100|0110 -
Gammar Based
- Sequitur
- Combine symbols from left to right to new rules, while ensuring digram uniqueness.
- Encode triple (, , ) on second ocurance of a rule and as referenece on next occurences.
- String Repair
- Build an index of digrams and replace most frequent digram with a new grammar rule until no more rules can be combined.
- Replace non-terminals that are used just once by their right hand side.
- Sequitur
-
csvzip
- XML
- Lowest Common Ancestor (LCA)
- Shortest (SLCA): Node
nis SLCA ifnis LCA and no descendant ofnis LCA - Exclusive (ELCA): Node
nis ELCA ifnis still LCA after all subtrees corresponding to an LCA are removed.
- Shortest (SLCA): Node
- Dewey Schema
- XRank ELCA
- Multiway SLCA
- Intersection Based SLCA/ELCA
- Index list contains node ID iff or any node descendant corresponds to a keyword.
- Intersection of index lists contains all CAs.
- All CAs that have no child CA are SLCAs.
- All CAs that have a number of descendant greáter than the sum of child descendant.
| Der Lattenzaun | |
| Es war einmal ein Lattenzaun, | |
| mit Zwischenraum, hindurchzuschaun. | |
| Ein Architekt, der dieses sah, | |
| stand eines Abends plötzlich da - | |
| und nahm den Zwischenraum heraus | |
| und baute draus ein großes Haus. | |
| Der Zaun indessen stand ganz dumm, | |
| mit Latten ohne was herum, | |
| Ein Anblick gräßlich und gemein. | |
| Drum zog ihn der Senat auch ein. | |
| Der Architekt jedoch entfloh | |
| nach Afri- od- Ameriko. | |
| (Christian Morgenstern) |
(Sorry about that, but we can’t show files that are this big right now.)
(Sorry about that, but we can’t show files that are this big right now.)
(Sorry about that, but we can’t show files that are this big right now.)
(Sorry about that, but we can’t show files that are this big right now.)
(Sorry about that, but we can’t show files that are this big right now.)
(Sorry about that, but we can’t show files that are this big right now.)