Created
November 13, 2017 22:31
-
-
Save DagnyTagg2013/55dad1ae966180a0118d9f971cd06950 to your computer and use it in GitHub Desktop.
MRU Cache
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
PROBLEM: Implement MRU Cache | |
- i.e. cache the MOST-MOST-RECENTLY-USED-OR-ACCESSED, and drop LEAST-RECENTLY-USED | |
- FIXED-size cache | |
initialized with a size, never to exceed that size | |
- put(key, val), O(1) | |
- get(key) -> val (or None), O(1) | |
TEST CASES: | |
testCache = Cache(3) | |
testCache.put("A", 1) | |
testCache.put("B", 2) | |
testCache.put("C", 3) | |
print testCache.get("B") # 2 | |
testCache.put("D", 4) | |
print testCache.get("A") # One of these should be None | |
print testCache.get("B") | |
print testCache.get("C") | |
print testCache.get("D") | |
""" | |
""" | |
SUMMARY: | |
- Lookup with DICT using String:Node reference | |
- Nodes referenced contained in in PriorityQ implemented as | |
DEQUE | |
- NODE key=String, value=Integer | |
- need random access lookup for Get/Put via DICT | |
- need to determine which Node to evict in PriorityQ; | |
then lookup corresponding DICT entry to remove | |
So KEY has to be part of Node. | |
CACHE EVICTION POLICY VARIATIONs from SIMPLEST to REAL-WORLD: | |
1) RECENTLY WRITTEN-INSERTED | |
- if you just want to evict ANY node; then simple List array implementation with eviction of last appended is OK; but then we have cache policy of retaining OLDEST | |
WRITTEN-INSERTED and not USED-ACCESSED which is bad | |
- LIST implemented as Array in Python makes removal from FRONT | |
expensive due to copy; whereas add to END is less expensive due to amortized | |
block allocation | |
2) ACCESS RECENCY: Most recently accessed CACHED; stale accesses evicted | |
- if we want to evict node by ACCESS recency -- eg retain most recently accessed | |
- so FIFO according to ACCESS, then use Doubly-Linked-List or DEQUE | |
since remove from HEAD and add to TAIL are constant operations | |
3) ATTRIBUTE VALUE: Use HEAP, with value according to ATTRIBUTE | |
- Price (cache High-Price, so evict low-price with MIN-Heap) | |
- Time Written (cache recently written with smallest time difference from NOW, so evict largest time difference with MAX-Heap) | |
TRICK: | |
- MOST RECENTLY ACCESSED is REMOVED from TAIL of Priority Q, | |
then MOVED to HEAD of Q | |
- DICT is UNCHANGED, as holds reference to Node which didn't itself change EXCEPT when Q is FULL: | |
- Node at END of Q is EVICTED on a full Q, | |
then LOOKUP on Node KEY needed to REMOVE it from DICT! | |
ATTENTION TO: | |
- track data IN-SYNC on each PUT modification: Q, DICT, COUNT; | |
which is unaffected with GET: DICT, Q, COUNT | |
- need to EVICT TAIL on FULL Q case; | |
then LOOKUP corresponding KEY in DICT to RELEASE reference! | |
WHAT MATTERS: | |
- DICT always points to Node directly for random-access for GET | |
- PUT needs to check for FULL and use | |
PriorityQ with EVICTION POLICY: | |
- On Eviction from PriorityQ, need Node with both | |
with KEY-VALUE Node for | |
lookup and eviction of corresponding Entry from DICT! | |
- SIMPLEST to have VALUE on DICT reference SAME Node structure as Q! | |
- DEQUE since insert-delete at HEAD and TAIL are needed; | |
AND want efficient DELETE so need BACK ptr! | |
- update COUNT on PUT only | |
DETAILS: | |
get => lookup Dict is NOT modified, instead PriorityQ is | |
- retrieves Node value if it exists from DICT | |
- REMOVE accessed item from TAIL of PriorityQ | |
- ADD it to HEAD to TRACK MR | |
- otherwise returns None if it does not exist | |
set => PriorityQ modified, THEN lookup Dict is synced | |
- if EXISTS, UPDATE its value | |
- if NOT exists and | |
- if NOT FULL at capacity | |
- lookup DICT, and UPDATE value | |
- if FULL at capacity | |
- create new node | |
- REMOVE stale item from HEAD of Priority Q, | |
then goto lookup KEY in DICT to REMOVE corresponding entry | |
- INSERT New Node to TAIL of PriorityQ | |
SIMPLIFYING ASSUMPTIONS: | |
- ignore concurrent access; otherwise have to copy and return MODIFIED | |
collection each time, leaving original dictionary intact | |
- ignore WEAK references on DICT key which allow garbage collection for | |
stale keys | |
- NO DEEP copies required; just REFERENCES to VALUE objects saved; | |
shallow copy OK. eg not copy.deepcopy(), but dict.copy() | |
LANGUAGE-SPECIFIC SUPPORT: | |
- JAVA | |
- LinkedHashMap with ctor parameter for AccessOrder (True for Access, False for Insertion) | |
- PriorityQ with ctor parameter for Comparator where DEFAULT -1 for | |
L < R MIN HEAP; and you multiply by -1 to get REVERSE order to conver to MAX HEAP | |
- java Generics PECS rule: Producer Extend, Consumer Super | |
Comparator<? super E> | |
- https://stackoverflow.com/questions/2723397/what-is-pecs-producer-extends-consumer-super | |
- PYTHON | |
- heapq supports min-heap by default; | |
but needs customization for MAX-Heap | |
- i.e needs ctor-parameterized comparator function for specific contained element type | |
*** BUT I CUSTOMIZED THIS MYSELF -- WHOOHOO! | |
https://github.com/DagnyTagg2013/CTCI-Python/blob/master/HEAP/binheap.py | |
- OR from Queue import PriorityQueue (synchronized and slower) | |
which adds elements as TUPLEs for automatic field-sequence ordering; | |
but maybe also need to do * -1 to REVERSE ordering for a MAX heap | |
- https://stackoverflow.com/questions/407734/a-generic-priority-queue-fo-python | |
- https://docs.python.org/3/library/heapq.html | |
""" | |
from collections import deque | |
class Node: | |
def __init__(self, key=None, value=None): | |
self.key = key | |
self.value = value | |
def __repr__(self): | |
status = "({},{})".format(self.key, self.value) | |
return status | |
class MRUCache: | |
def __init__(self, maxSize=3): | |
self.capacity = maxSize | |
self.currSize = 0 | |
self.lookup = {} | |
self.priorityQ = deque() | |
def __repr__(self): | |
# ATTN print syntax! | |
status = '\nMRUCache\n {}, {}\n{}\n{}'.format(self.capacity, self.currSize, self.lookup, self.priorityQ) | |
return status | |
def get(self, key): | |
# RETURN NONE if not in lookup cache | |
foundNode = self.lookup.get(key, None) | |
# if node found, reorder ACCESS-ORDER | |
if foundNode: | |
# FIND/REMOVE corresponding node reference from priority Q, | |
# (should be SAME reference as above!) | |
# then put it into the FRONT of Q | |
self.priorityQ.remove(foundNode) | |
self.priorityQ.appendleft(foundNode) | |
# RETURNs object reference if in lookup cache or None if not found | |
return foundNode | |
def put(self, key, value): | |
currNode = None | |
# UPDATE value if ALREADY in lookup cache | |
if key in self.lookup: | |
currNode = self.lookup[key] | |
# ATTN: Dict only saves REFERENCE to pairNode, so modify | |
# value on that Node! | |
currNode.value = value | |
# FIND corresponding node from Q, | |
# then put it into the FRONT of Q, | |
# - NOTE this is the SAME Node reference as ABOVE! | |
# ATTENTION: TIME TO SEARCH is O(n) on DEQUE | |
existNode = self.priorityQ.remove(currNode) | |
# otherwise, CREATE new Node, and ADD it to lookup cache | |
else: | |
# create new node | |
currNode = Node(key, value) | |
self.currSize = len(self.priorityQ) | |
# ATTN : check FWD bound before EVICTION | |
if ((self.currSize + 1) > self.capacity): | |
# ATTN: update Q WITH its corresponding DICT Entry! | |
# IFF we're at CAPACITY, first remove STALE node at | |
# END of priorityQ, then remove its corresponding entry from the lookup | |
# cache | |
staleNode = self.priorityQ.pop() | |
# ATTN: this is the way to REMOVE from a DICT in Python! | |
staleNode = self.lookup.pop(staleNode.key, None) | |
# COMMON-CODE, | |
# accessed updated/added node ALWAYs gets added to HEAD of Q, | |
# AND exists in the lookup cache based on Node key | |
self.priorityQ.appendleft(currNode) | |
self.lookup[currNode.key] = currNode | |
# UPDATE size! | |
self.currSize += 1 | |
# ************ TEST DRIVER ****************** | |
print("\n*** INIT CACHE") | |
testCache = MRUCache(3) | |
print(testCache) | |
print("\n*** PUT A") | |
testCache.put("A", 1) | |
print(testCache) | |
print("\n*** PUT B") | |
testCache.put("B", 2) | |
print(testCache) | |
print("\n*** PUT C") | |
testCache.put("C", 3) | |
print(testCache) | |
print("\n*** GET B") | |
print(testCache.get("B")) # 2 | |
print(testCache) | |
print("\n*** PUT D") | |
testCache.put("D", 4) | |
print(testCache) | |
print("\n*** GET A") | |
print(testCache.get("A")) # THIS should be None | |
print(testCache) | |
print("\n*** GET B") | |
print(testCache.get("B")) | |
print(testCache) | |
print("\n*** GET C") | |
print(testCache.get("C")) | |
print(testCache) | |
print("\n*** GET D") | |
print(testCache.get("D")) | |
print(testCache) | |
print("\n*** UPDATE B") | |
testCache.put("B", 100) | |
print(testCache) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment