Analyzing Performance of Arrays and Objects
Algorithm and Problem Solving Approaches
Algorithm and Problem Solving Patterns
Dijkstra's Shortest Path Algorithm
We use Big O notation to measure how well optimized an algorithm is. It is easy to create algorithms that solve problems; it is difficult to create algorithms that solve them with effective time and space complexity. We want to use the efficient ones whenever possible.
It's kind of like the Richter scale with earthquakes; we can use it to approximate how performant an algorithm is.
Time Complexity from best to worst:
- O(1) - Constant time.
- O(log n) - Logarithmic time. Very close to O(1).
- O(n) - Linear time. Each value is calculated.
- O(n log n) - Log-linear time. Typical of efficient sorting algorithms.
- O(n^2) - Quadratic time. For each value, access all values.
- O(n!) - Factorial time. e.g. a brute-force Traveling Salesman solution.
In Chrome, we can use performance.now() to measure time. Note: Execution time alone does not tell us whether or not an algorithm is appropriate, it is merely a tool we can use to compare our own implementations.
let t1 = performance.now()
// Do some calculation
let t2 = performance.now()
// Calculate the delta time.
let delta = t2 - t1
// That's how long it took!
How can we analyze the runtime of an algorithm as the size of the inputs increases?
Consider the 'Reverse a String' challenge. Immediately, there are tons of different ways we can solve this problem. A few are:
- For each character, pre-append it to a string.
- For each character, slice and re-append until we have acknowledged all of the initial characters.
- Split the string into an array, reverse it, and join it back using built-in string and array methods.
And there are many more. What's important is that, so long as you can solve the problem, talk through your particular solution, and explain why it is well optimized, it is a good solution. We can use Big O to classify this algorithm and compare it to others. The above solutions are all O(n) because we acknowledge each character in the string, or 'n' characters.
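As a quick sketch of the third approach (the function name is our own):

```javascript
// Reverse a string with built-in methods: split into an array of
// characters, reverse the array, and join it back together.
// Each step touches all n characters, so this is O(n).
var reverseString = (str) => {
  return str.split('').reverse().join('')
}

console.log(reverseString('hello')) // 'olleh'
```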
A solution that is easy to understand, is well optimized, and most importantly, solves the problem, is the best solution. It is fair to brute force a solution before optimizing it. That is, try to solve the problem first before looking for opportunities to make it more efficient. Remember, if your solution doesn't solve the problem at all, it doesn't matter how well optimized it is.
In interview technical challenges, optimization and understanding are paramount to success.
var addUpToSimple = (n) => {
  if(n === 0 || n === 1) return n // Base case. If there's only one value, return it.
  let total = 0
  for(let i = 0; i <= n; i++) {
    total += i
  }
  return total
}
var addUpToOptimized = (n) => {
  return n * (n + 1) / 2
}
To get to the latter, we look for shortcuts.
If n = 3
n + n-1 + n-2 = result
3 + 3-1 + 3-2 = 6
But is there another way to get to 6? How about double?
n * (n + 1) = result
3 * (3 + 1) = 12
So that's interesting. But do we have a pattern? What if n = 4?
4 * (4 + 1) = 20
and
4 + 4-3 + 4-2 + 4-1 = 10
So what if we just do that initial calculation, and then divide it?
n = 3
(n * (n + 1)) / 2 = result
(3 * (3 + 1)) / 2 = 6
We've successfully optimized this solution. How do we know this?
Let's say we use that performance.now() method. There are still some concerns:
- Different machines record different times
- The same machine can record different times for the same code; fast code may run too quickly to measure reliably
- Speed measurements may not be precise enough
We can use Big O notation to approximate best, average, and worst case algorithm timing using the number of simple operations performed.
Let's count the number of operations for each of the Add Up functions.
var addUpToSimple = (n) => {
  let total = 0
  for(let i = 0; i <= n; i++) {
    total += i
  }
  return total
}
In the for loop, we have n add operations, because we add for each value leading up to n. Also in the for loop, we increment i, so that's an additional n add operations.
Assignment (=) is technically an operation too, but it is cheap compared to the mathematical operations.
We also have a comparison in the for loop: We continue until i <= n.
But we don't have to worry about all those specifics because, on average, the number of operations grows proportionally with n. Therefore, on average, our Big O for this is O(n).
var addUpToOptimized = (n) => {
  return n * (n + 1) / 2
}
One Multiplication, one addition, and one division. Regardless of n, we only perform one of each operation.
Again, we are interested in the average. Our Big O is O(1); we only perform one operation, one time, on average.
At this point, imagine a function that counts to n, then counts down from n back to 1. For each value, we echo out that value. We can imagine this as two separate for loops.
Note: Not to be confused with a for loop nested in a for loop. For n, we count to n. Then, we count from n back down to 1. What is the time complexity of this?
We have another function that prints all pairs. For each value in n, we print it alongside every other value of n. This is a for loop nested in another for loop. What is the time complexity of this?
Consider the two functions and think about what their time complexity Big O notation would be:
for(let i = 0; i <= n; i++) { console.log(i) }
for(let i = n; i >= 1; i--) { console.log(i) }

for(let i = 0; i < n; i++) {
  for(let k = 0; k < n; k++) {
    console.log(i, k)
  }
}
The first is O(2n), as we have to iterate through n twice. However, we may simplify it down to O(n). This is explained in the next section.
The second is O(n^2), as, for each value in n, we iterate through all other n values.
Counting all the different operations can be tricky, so we can simplify this by looking for the most essential operations, or the operations that are the most limiting.
There are some rules we can follow.
O(2n) can be simplified to O(n).
O(500) can be simplified to O(1).
O(13n^2) can be simplified to O(n^2)
Why? Because these are the most defining features of our algorithms; we do not want the specifics, we want broad generalizations.
O(n + 10) can be simplified to O(n)
O(1000n + 50) can be simplified to O(n)
O(n^2 + 5n + 8) can be simplified to O(n^2), because that is our most limiting point.
- Arithmetic operations are constant
- Variable assignment is constant
- Accessing element in an array by index or object by key is constant
- In a loop, the complexity is the length of the loop
Imagine a function whose loop runs at least some fixed number of times, N, and possibly up to n times; that is, max(N, n) iterations. Its growth is ultimately driven by n, so we would still call this O(n).
Now imagine the loop runs at most N times: min(N, n) iterations. Once n passes N, the work stops growing, so this simplifies to O(1). The smaller N is, the sooner we hit that constant ceiling. But if N were unbounded, we would be back to O(n), because most of the time we would be dealing with n values (and operations).
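A sketch of both cases, assuming a fixed cap or floor of 5 (the function names and the constant are our own illustration):

```javascript
// Runs at most 5 times: min(5, n) iterations. Once n passes 5 the
// work stops growing, so this simplifies to O(1).
var runAtMost5 = (n) => {
  let count = 0
  for (let i = 1; i <= Math.min(5, n); i++) count++
  return count
}

// Runs at least 5 times: max(5, n) iterations. For large n the loop
// grows with n, so this is O(n).
var runAtLeast5 = (n) => {
  let count = 0
  for (let i = 1; i <= Math.max(5, n); i++) count++
  return count
}

console.log(runAtMost5(1000))  // 5
console.log(runAtLeast5(1000)) // 1000
```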
How much additional memory do we need to allocate in order to run the code in our algorithm?
As n grows, the size of the input itself is going to grow. Auxiliary space complexity refers exclusively to the space required by the algorithm rather than the algorithm and the input. This is what is covered here.
- Most primitives are constant space
- Strings require O(n) space. They can vary in length.
- Reference types, arrays, and objects are O(n) for similar reasons; they store variable amounts of data.
An example:
var sum = (arr) => {
let total = 0
for(let i = 0; i < arr.length; i++) { total += arr[i] }
return total
}
total and i take up constant space; they each store one piece of information. Counting the input array would make this O(n) space complexity, but with regard to auxiliary space complexity, this is merely an O(1) algorithm.
var doubleArray = (arr) => {
let doubledArray = []
for(let i = 0; i < arr.length; i++) { doubledArray.push(arr[i] * 2) }
return doubledArray
}
We are pushing values on to an array that will be n size, so the algorithm is O(n) in auxiliary space complexity.
Some algorithms involve more complex, mathematical expressions than O(1), O(n), or O(n^2)+. Logarithms are the inverse of exponentiation.
"The logarithm of a number roughly measures the number of times you can divide that number by 2 before you get a value that is less than or equal to 1."
Examples:
log2(value) = exponent
2^exponent = value
------------
If we know all values:
log2(8) = 3
2^a = 8
2^3 = 8
2 x 2 x 2 = 8
------------
If we need to solve a logarithm:
log2(8)
Iteration 1: 8 / 2
Iteration 2: 4 / 2
Iteration 3: 2 / 2
<= 1. Stop! 1
It took 3 divisions by 2 to reach <= 1.
Therefore, log2(8) = 3
25 does not divide as cleanly.
log2(25)
Iteration 1: 25 / 2
Iteration 2: 12.5 / 2
Iteration 3: 6.25 / 2
Iteration 4: 3.125 / 2
Iteration 5: 1.5625 / 2
<= 1. Stop! 0.78125
It took 5 divisions by 2 to reach <= 1.
But, if we are calculating this precisely:
log2(25) = 4.64
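The repeated-division process above can be sketched in code (the function name is ours; Math.log2 gives the precise value):

```javascript
// Count how many times we can divide a value by 2 before it is <= 1.
// This approximates log2 for our Big O purposes.
var approxLog2 = (value) => {
  let divisions = 0
  while (value > 1) {
    value = value / 2
    divisions++
  }
  return divisions
}

console.log(approxLog2(8))            // 3
console.log(approxLog2(25))           // 5
console.log(Math.log2(25).toFixed(2)) // '4.64'
```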
For our purposes, log means log2. We won't really see other bases here, but remember that a logarithm always has a base.
So why study logarithmic time complexity? Because it is used in search algorithms, efficient sorting algorithms, and, sometimes, recursive space complexity.
Objects are unordered data structures.
- Use them when you don't need order
- Use them when you need fast access, insertion, and removal
- Insertion O(1)
- Removal O(1)
- Searching O(n)
- Access O(1)
- Object.keys O(n)
- Object.values O(n)
- Object.entries O(n)
- Object.hasOwnProperty - O(1)
We access keys, values, and entries by looking for them. Objects can have n keys, values, and entries. Therefore, O(n).
hasOwnProperty just tells us if an Object has some property. It either does or it doesn't. We don't look for values within that property or anything. Therefore, O(1).
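A quick illustration of these costs (the sample object is our own):

```javascript
const user = { name: 'Ada', role: 'admin', active: true }

// Access by key is O(1); no search is performed.
console.log(user.role) // 'admin'

// Object.keys, Object.values, and Object.entries each visit all
// n entries, so they are O(n).
console.log(Object.keys(user))           // ['name', 'role', 'active']
console.log(Object.entries(user).length) // 3

// hasOwnProperty is a single existence check: O(1).
console.log(user.hasOwnProperty('role')) // true
```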
Arrays are the only ordered, list-like data structure we get for free in JavaScript. Lists and Vectors in other languages offer similar functionality on top of an even more basic underlying array data structure. Linked Lists have to be created by hand.
- Use when you need order
- When you need fast access, insertion, and removal (depending on n)
- Insertion O(it depends...)
- Removal O(it depends...)
- Searching O(n)
- Access O(1)
If we are inserting at the end of an array, it is O(1). If we are inserting in the middle, or at the beginning, we have to mutate the array by shifting its contents. This complicates things because we have to re-index all of the elements in the array.
Removal has the same Big O constraints. It depends on how complex the operation is.
In both cases, treating an array like a stack (working at the end) is the most efficient in Big O terms. Treating it like a queue (working at the front) is not, and neither is interacting with elements in between.
- push O(1)
- pop O(1)
- shift O(n)
- unshift O(n)
- concat O(n)
- slice O(n) (Cut elements out using start, end indices)
- splice O(n) (Changes the array by removing and/or adding new elements)
- sort O(n * log n) (Quick Sort)
- forEach/map/filter/reduce/etc - O(n)
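A short sketch of the stack-versus-queue point above (the sample array is ours):

```javascript
const list = [1, 2, 3]

// Stack-style: push/pop work at the end of the array. O(1),
// because no other element needs re-indexing.
list.push(4) // [1, 2, 3, 4]
list.pop()   // back to [1, 2, 3]

// Queue-style: shift/unshift work at the beginning. O(n),
// because every remaining element must be re-indexed.
list.unshift(0) // [0, 1, 2, 3]
list.shift()    // back to [1, 2, 3]

console.log(list) // [1, 2, 3]
```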
A process or set of steps to accomplish a certain task. It is the foundation of programming and comes up in interviews.
- Devise a plan for solving problems
- Master common problem solving patterns; break problems down into different categories
- Understand the problem
- Explore concrete examples
- Solve/simplify the problem
- Review and refactor
Before you start coding, understand what is going on. You can ask some questions:
- Can you restate the problem in your own words?
- What are the inputs that go into the problem?
- What are the expected outputs?
- Can the outputs be determined from the inputs?
- How should I label the important pieces of information in the data?
Write a function that adds two numbers.
Two values. Integers? Floating points? How large will they be? Upper and lower bounds?
One value.
Yes, we have all the information we need and just need to calculate.
We have two values and a result/sum value.
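Putting those answers together, a minimal sketch might look like this (the validation behavior is our own choice, since the problem statement leaves it open):

```javascript
// Adds two numbers. We assume numeric inputs and reject anything
// else rather than silently coercing it.
var addTwoNumbers = (a, b) => {
  if (typeof a !== 'number' || typeof b !== 'number') {
    throw new TypeError('Both inputs must be numbers')
  }
  return a + b
}

console.log(addTwoNumbers(3, 2)) // 5
```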
Come up with examples that can help you understand the problem better. We can use these cases to determine expected output and, as we begin coding, check whether the actual output matches it. e.g. in the previous example, we expect 3 + 2 to equal 5. If our function outputs 6, then we will immediately know something needs to change.
- User Stories
- User Tests
- Start with simple examples
- Progress to more complex examples
- Explore simple examples with empty inputs
- Explore examples with invalid inputs
'cat' should return {c: 1, a: 1, t: 1} 'hello' should return {h: 1, e: 1, l: 2, o: 1}
Do we only want to return the letters that are there? What about the letters that aren't there?
If the letters are all already in our result output, then we can add them up front, instead of checking to see if it exists in our map as we go.
'hello friend' 'HeLlo fRiEND 123$!'
What do we do with the white space? Numbers? Symbols? Should we lowercase our input so that both uppercase and lowercase L's get tallied together?
Null, Undefined, and empty string should return 0.
What happens when we pass in a boolean value? An integer value?
We could try parsing it to a String. We could throw it out/return an invalid input message.
Write comments or pseudo code that outline the steps you need to take to accomplish your goal. In an interview setting, it's okay to ask for feedback. "Does this seem right to you?" "Does this make sense?"
var charCounter = (str) => {
// If string is invalid, throw it out or return 0
// Create empty dictionary. No pre-populating keys (letters a-z, numbers 0-9)
// For each character in string, examine.
// Is this a letter or a number? If not, move on.
// Does our dictionary contain this item?
// Increment this item.
// No? Add this letter and set it to 1.
// Return or print dictionary
}
If you cannot solve the problem, solve a simpler problem.
- Find the core difficulty in what you are trying to do
- Temporarily ignore that difficulty
- Write a simplified solution
- Incorporate that difficulty back in
Let's assume that we struggle with looping. Let's tackle the remainder of the problem first. There's even an alphanumeric regex below and, admittedly, I did not have that memorized when I wrote this, so that may be something else you skip and return to later. Or you can tell the interviewer you do not recall how to check for alphanumeric characters and they may help you out because you have already expressed that it is a step that needs to happen.
var charCounter = (str) => {
  str = str.toLowerCase()
  let result = {} // No prepopulation.
  let letterNumberPattern = /^[0-9a-zÀ-ÿ]+$/
  let c = str[0]
  if(c.match(letterNumberPattern)) {
    if(result[c]) {
      result[c]++
    } else {
      result[c] = 1
    }
  }
  return result
}
Now that we know we are accurately counting some element, we can work on the core difficulty of looping.
var charCounter = (str) => {
str = str.toLowerCase()
let result = {}
let letterNumberPattern = /^[0-9a-zÀ-ÿ]+$/
for(let i = 0; i < str.length; i++) {
let c = str[i]
if(c.match(letterNumberPattern)) {
if(result[c]) { // We have seen this before
result[c]++
} else { // First time seeing this
result[c] = 1
}
} // Else skip, because it is not alphanumeric
}
return result
}
Just because it works doesn't mean you are finished. This is where you consider Big O and determine if you have the most optimized solution for your problem.
- Can you check the result?
- Can you derive the result differently?
- Can you understand it at a glance?
- Can you use the result or method for some other problem?
- Can you improve the performance of your solution?
- Can you think of other ways to refactor?
- How have others solved this problem?
var charCounter = (str) => {
  const result = {}
  for(let c of str) {
    c = c.toLowerCase()
    if(/^[0-9a-zÀ-ÿ]+$/.test(c)) { // We could also test character codes instead, which is more performant
      result[c] = ++result[c] || 1 // If result[c] exists, increment. Else, set to 1
    }
  }
  return result
}
Recognizing common problem solving patterns can help break complex problems down into simpler ones.
- Frequency Counter
- Multiple Pointers
- Sliding Window
- Divide and Conquer
- Greedy Algorithms
- Dynamic Programming
- Backtracking
Here, we can iterate through our data and note where data occurs, or how many occurrences of it exist. We can store indices to refer to later, or keep a pre-processed total count of character occurrences.
var isAnagram1 = (arr1, arr2) => {
/// This requires two dictionaries. In our final loop, we fact check each dictionary against the other.
/// We do not need to update either dictionary after it has been populated.
if(arr1.length != arr2.length) return false
let freqCounter1 = {}
let freqCounter2 = {}
// We know they are the same length, so we can use the same for loop.
for(let i in arr1) {
freqCounter1[arr1[i]] = (freqCounter1[arr1[i]] || 0) + 1
freqCounter2[arr2[i]] = (freqCounter2[arr2[i]] || 0) + 1
}
for(let key in freqCounter1) {
if(!(freqCounter2[key])) return false // Does not contain this letter
if(freqCounter1[key] != freqCounter2[key]) return false // Differing counts, so there's a mismatch
}
return true
}
var isAnagram2 = (arr1, arr2) => {
/// This uses a single counter that we update when we iterate through the second array.
/// If, after updates, we encounter a value in the first but has an allowance of <= 0 in the second, we know we've failed.
/// Otherwise, both arrays have all characters accounted for, and we succeed.
if(arr1.length != arr2.length) return false
let counter = {}
// We know they are the same length, so we can use the same for loop.
for(let i in arr1) {
counter[arr1[i]] = (counter[arr1[i]] || 0) + 1
}
for(let i in arr2) {
if(!counter[arr2[i]]) return false // Doesn't exist
counter[arr2[i]]--
}
return true
}
let arr1 = ['a', 'b', 'c', 'c']
let arr2 = ['c', 'b', 'c', 'a']
let result1 = isAnagram1(arr1, arr2)
let result2 = isAnagram2(arr1, arr2)
console.log(result1)
console.log(result2)
Pointers can be used to keep track of where we are in our data without having to use additional memory or create separate work to iterate over the same data. Previous examples used loops with a single pointer (i) to track our index position. We can use pointers to iterate forwards, backwards, both simultaneously, or store information on where certain data is located.
The following example works on a sorted array. However, one could also solve a Zero Sum problem with a frequency counter approach by creating a dictionary of complementing values and their indices. This is our first example of being able to apply two different problem solving patterns to the same problem.
The following examples use 'v' as a pointer or cursor.
When we loop, we usually have an index:
 v
[1, 2, 3, 4, 5]
    v
[1, 2, 3, 4, 5]
       v
[1, 2, 3, 4, 5]
          v
[1, 2, 3, 4, 5]
             v
[1, 2, 3, 4, 5] Stop! All done.
However, we can have multiple pointers. Here, we loop until our pointers cross or intersect.
 v           v
[1, 2, 3, 4, 5]
    v     v
[1, 2, 3, 4, 5]
       v
[1, 2, 3, 4, 5] Stop! The pointers have met. All done.
var findZeroSum = (sortedArray) => {
/// Find the values that sum 0
let result = []
let left = 0
let right = sortedArray.length - 1
while (left < right) {
// Walk towards one another and find a zero sum.
let sum = sortedArray[left] + sortedArray[right]
if (sum === 0) {
return [sortedArray[left], sortedArray[right]]
}
if (sum < 0) {
left++
}
else {
right--
}
}
return []
}
let arr1 = [-2, 0, 1, 3]
let arr2 = [-3, -2, -1, 0, 1, 2, 3]
console.log(findZeroSum(arr1)) // [] (no zero-sum pair exists)
console.log(findZeroSum(arr2)) // [-3, 3]
For this problem, we use two pointers and a sorted array. We can use the sorted array to store information about unique values because, once we have iterated over older elements, we no longer need the data.
i and k start at 0 and 1 respectively. k scouts ahead to determine when we have encountered a new value. When a new value is encountered, we increment i and store that value at i's location.
The algorithm ceases once k runs out of data. At that point, i marks the index of the last unique value, so the number of unique values is i + 1.
var countUniqueValues = (sortedArray) => {
  if(sortedArray.length === 0) return 0
  let i = 0
  let k = 1
  while(k < sortedArray.length) {
    if(sortedArray[i] === sortedArray[k]) {
      k++ // Duplicate; keep scouting ahead
    } else {
      i++ // New value; store it just after the last unique value
      sortedArray[i] = sortedArray[k]
      k++
    }
  }
  return i + 1
}
let sortedArray1 = [1, 1, 2, 2, 2, 3, 3, 4, 5]
let sortedArray2 = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5, 6, 6, 7]
console.log(countUniqueValues(sortedArray1)) // 5
console.log(countUniqueValues(sortedArray2)) // 7
This pattern involves creating a window, which can be either an array or a number range from one position to another. Depending on a certain condition, the window either grows or closes, and a new window is created.
Imagine you have an array of numbers on a piece of paper. You take another piece of paper and cut a hole in it that only reveals two numbers of your array when placed on top. As we iterate, we can shift this paper (window) to reveal the next pair of numbers.
We can use this to track a subset of data in an array or string.
// Here, we will use the '#' to represent the beginning and end of our window.
// Imagine a function where we need to find a max sub array sum. We are given an array and a window size.
// For each value, we need to increment and consider all values in the window.
// This array is passed with a window size of 2. We iterate through using this window size.
// On the left, a full representation. On the right, the data we are working with in the current iteration.
// [#1, 3#, 5, 2, 4, 7] : 1 + 3
// [1, #3, 5#, 2, 4, 7] : 3 + 5
// [1, 3, #5, 2#, 4, 7] : 5 + 2
// [1, 3, 5, #2, 4#, 7] : 2 + 4
// [1, 3, 5, 2, #4, 7#] : 4 + 7
// We 'slide' our window when we increment, then consider values in that window.
We initially capture the sum of our first window. Then, for each subsequent window, we calculate the window's sum by using the previous sum, subtracting the previous values we no longer need, and adding the new values. If this new sum is greater than our current max sum, we replace it. The result is an O(n) solution.
var maxSubarraySum = (arr, windowSize) => {
let maxSum = 0
let tempSum = 0
for(let i = 0; i < windowSize; i++) {
maxSum += arr[i]
}
tempSum = maxSum
for(let i = windowSize; i < arr.length; i++) {
// Our temp sum currently holds the previous window's total.
// We remove the oldest value because it is no longer part of this window.
// We then add in the next value because it is in this window.
tempSum = tempSum - arr[i - windowSize] + arr[i]
maxSum = Math.max(maxSum, tempSum) // Replace the max if our new sum is greater.
}
return maxSum
}
var arr1 = [2, 6, 9, 2, 1, 8, 5, 6, 3]
console.log(`Window Size 2: ${maxSubarraySum(arr1, 2)}`) // 15
console.log(`Window Size 4: ${maxSubarraySum(arr1, 4)}`) // 22
console.log(`Window Size 1: ${maxSubarraySum(arr1, 1)}`) // 9
This pattern involves dividing a data set into smaller chunks and then repeating a process with a subset of data. This pattern can tremendously decrease time complexity. This is critical for sorting algorithms such as quick sort and merge sort. We use these for binary search trees as well. If you are not familiar with recursion, there is an upcoming section that covers it. If you are familiar with recursion, here is a quick example of how we can recursively search an array for some value.
const nums = [1, 2, 3, 4, 5, 6]
const recursiveSearch = (arr, target) => {
  if(arr.length === 0) return false
  if(arr.length === 1) return arr[0] === target
  let mid = Math.floor(arr.length / 2)
  let left = arr.slice(0, mid)
  let right = arr.slice(mid)
  // '||' combines the halves: true if the target is found in either.
  return recursiveSearch(left, target) || recursiveSearch(right, target)
}
console.log("Find 6: " + recursiveSearch(nums, 6)) // True
console.log("Find 3: " + recursiveSearch(nums, 3)) // True
console.log("Find 7: " + recursiveSearch(nums, 7)) // False
As opposed to iterative problem solving (looping), recursion relies on calling the same function multiple times, from within itself, until we reach an end point.
The town sent the boy Martin to ask the Dragon if any of the numbers in a list are odd.
He presents the entire list to the Dragon:
"Are any of these numbers odd?"
"I will only tell you if the first number in the list is odd."
So he decides to present the entire list. "Is the first number odd?" "No."
He removes the first item from his list and presents the remaining list. "How about this list?" "No."
He removes that item, and presents the list one more time, with only one number remaining. "And now?" "No."
He shows the dragon the empty list. "And now?" "There's nothing left!"
So Martin infers that none of the numbers in the original list are odd, despite having presented them in fewer and fewer portions.
This is recursion. We keep asking questions until we have touched every data point we need and hit our base case: the end point.
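Martin's conversation maps directly onto a recursive function (a sketch; the name is ours):

```javascript
// "Are any of these numbers odd?", asked one element at a time.
var anyOdd = (list) => {
  if (list.length === 0) return false // Empty list: nothing left, so no
  if (list[0] % 2 !== 0) return true  // The first number is odd
  return anyOdd(list.slice(1))        // Ask again about the rest
}

console.log(anyOdd([2, 4, 6])) // false
console.log(anyOdd([2, 4, 7])) // true
```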
A process that calls itself. We use functions. It is everywhere.
- JSON.parse / JSON.stringify
- document.getElementByID and DOM traversal algorithms
- Object traversal
- Interacting with more complex data structures
It is sometimes a cleaner alternative to iteration.
In almost all programming languages there is a built-in data structure that manages what happens when functions are invoked: the call stack. When a function is invoked, it is placed on top of the call stack. When the language sees the return keyword, or when the function ends, the call is removed from the stack.
In iterative programming, the call stack may push a function on, and that function may call another function, resolve, be removed, and then the first function is removed. Things are constantly being added and removed.
In recursive programming, we keep piling function calls on to the call stack until we hit our base case. Then we remove them, in order, all at once.
The first thing we write in a recursive function is the base condition / end point because the first thing we want to do is determine if we are finished working. Without a base case, we have an infinite loop.
We write a function that counts down from some number. Consider a real use case. If I ask you to count down from 5, when will you stop? 4? 3? 2? 1? 0? -1? -2? I would wager you would stop at 1 or 0. That is our base case. Here, I have chosen to stop counting down when we reach 0.
Otherwise, we print the current number and pass that decremented number to another instance of countdown (value - 1).
var countdown = (count) => {
if(!count) return
console.log(count)
return countdown(count - 1)
}
countdown(5) // 5, 4, 3, 2, 1
If this is difficult to conceptualize consider keeping track of count's value as it moves through the call stack:
- We start with 5. Is it 0? No? Print and call countdown, but pass 5 - 1 to it.
- We now have 4. Is it 0? No. Print and pass 4 - 1.
- Now 3. It is not 0. Print. Pass 2.
- Now 2. Pass 1.
- Now 1. Print, pass 0.
- Now 0. We confirm that it's 0 and immediately return, which triggers the previous return, and the previous, etc.
We are not doing anything with this return value, and so we simply exit. In subsequent recursive functions, we may return values. For example, in the recursiveSearch() function from before, we return a boolean value telling us whether or not a value has been found in a list.
From our starting value, all the way down until we reach 1, we take each value and add it to our sum. Once we hit 1, our base case, we simply return 1. An easier way to think about this may be to work through it backwards. Imagine our final call to sumRange is actually our first:
We start with sum += 1 Then sum += 2 ... sum += 5
This almost looks like a forward iterative loop, doesn't it? We could write a for loop that adds all values between 1 and n. Recursion is doing the same thing, just backwards, and by calling itself.
var sumRange = (num) => {
  if(num === 1) return 1
  return num + sumRange(num - 1)
}
Below, we have two implementations of a factorial function; one is iterative, the other is recursive. They both accomplish the same goals.
var factorialIterative = (num) => {
let total = 1
for(let i = num; i > 1; i--) {
total *= i
}
return total
}
var factorialRecursive = (num) => {
if(num <= 1) return 1 // '<=' also handles num = 0; 0! is 1
return num * factorialRecursive(num - 1)
}
- Having No or an Incorrect Base Case
Pretty self-explanatory. If you don't have an exit condition, the function will call itself forever. This results in a stack overflow: the stack holds too many calls and can no longer handle them.
- Returning the Wrong Data Or Not At All
If we are going through the trouble of doing all the work, we should have a means of bringing that work with us. Our recursion may be iterating through the appropriate data, but if we do not capture it, it is worthless.
The previous recursive functions were called directly, and then they called themselves. But we can write recursive functionality into other functions. We can write an inner, recursive function that does work related to the outer function. We could use this in conjunction with a concept like closures wherein our helper recursive function does work using some hidden outer property. See my guide on advanced ES6+ JavaScript for more on Closures.
Here, we have an inner recursive helper function that collects odd values from an array. However, in practice, we would call this outer function and pass an array to it. The inner function does all the work, but the outer serves as an interface to it and declares the result array we use to collect these odd values.
var collectOddValues = (arr) => {
  let result = []
  var helper = (input) => {
    if(input.length === 0) return
    if(input[0] % 2 !== 0) result.push(input[0])
    helper(input.slice(1))
  }
  helper(arr)
  return result
}
Instead of a recursive helper function, we can write a pure recursive function that does not rely on an outer function to manage its data.
Here, for each recursive call, we are updating the result by concatenating all of the other results on to it. Instead of just one result that we modify, we instead declare a new result for each recursive call, do work on it, and return it to be concatenated on to the previous result. This results in one master result and disposes of all other recursively created results.
For arrays, use methods like slice and concat to make copies of arrays so they do not get mutated. The modern ES6+ spread operator is also used to interact with arrays without mutating them.
var pureCollectOddValues = (arr) => {
let result = []
if(!arr.length) return result
if(arr[0] % 2 !== 0) result.push(arr[0])
result = result.concat(pureCollectOddValues(arr.slice(1)))
return result
}
Calculate the nth Fibonacci number.
var fib = (n) => {
if(n <= 2) return 1
return fib(n - 1) + fib(n - 2)
}
In order, we examine each item, and determine if this item is the one we want. We can do this with a for loop in O(n) time and space complexity. This does not merit an example as it is very similar to the iterative looping earlier in the guide.
This works when we have an array of sorted data. We can use a divide and conquer approach to break a single array down into sub arrays and look for a specific piece of information by asking questions about the current subset. Recursively, we can keep dividing our arrays until we are only working with one piece of information. Then we ask "Is this what we want?" If it is, we return true or that piece of data. If it isn't, we return false or some indicator that it wasn't found.
See the recursiveSearch() implementation from the Divide and Conquer section for a concrete example.
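For reference, here is an iterative sketch of binary search that returns the target's index, or -1 when it is not found (our own implementation):

```javascript
// Binary search on a sorted array: O(log n), because each
// comparison discards half of the remaining search space.
var binarySearch = (sortedArr, target) => {
  let left = 0
  let right = sortedArr.length - 1
  while (left <= right) {
    let mid = Math.floor((left + right) / 2)
    if (sortedArr[mid] === target) return mid
    if (sortedArr[mid] < target) left = mid + 1
    else right = mid - 1
  }
  return -1
}

console.log(binarySearch([1, 2, 3, 4, 5, 6], 6)) // 5
console.log(binarySearch([1, 2, 3, 4, 5, 6], 7)) // -1
```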
Here, for a larger string, we look for a specific substring. On each iteration of the source string, as soon as we see a starting character that matches our target starting character, we begin looking for the next substring match until we either have a match or we have determined it is not a match and move on. If we have a match, we increment a counter. This is O(n^2) because of the nested for loop.
var basicStringSearch = (str, target) => {
let count = 0
for(let i = 0; i < str.length; i++) }
for(let k = 0; k < target.length; k++) {
if(target[k] !== str[i + k] {
break; // We are no longer matching our target
}
if(k === target.length - 1) count++ // We completed a match!
}
}
return count
}
All of these sorting algorithms are iterative and fairly easy to follow. However, they all run in O(n^2) time. The final one, insertion sort, is the only one of the three with common practical applications. Still, it is important to understand the basics of sorting before moving on to more efficient algorithms. These sorting algorithms are not useless; they perform well on small data sets. In the real world, though, we are not typically working with small data sets.
Bubble sort is the most basic sorting algorithm. It performs in O(n^2) time: on each pass we compare neighboring values and swap them so that the largest remaining value bubbles to the end. We mutate the array in place, but we are limited by that nested for loop. Bubble Sort performs about the same in all scenarios, but there are more complex sorting algorithms that exceed it.
var bubbleSort = (arr) => {
for(let i = 0; i < arr.length; i++) {
// Compare neighbors; the largest remaining value bubbles to the end each pass
for(let k = 0; k < arr.length - i - 1; k++) {
if(arr[k] > arr[k + 1]) {
let temp = arr[k]
arr[k] = arr[k + 1]
arr[k + 1] = temp
}
}
}
return arr
}
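A common refinement, not part of the listing above, tracks whether a pass made any swaps and stops early; this helps a lot on nearly sorted data:

```javascript
var bubbleSortOptimized = (arr) => {
  for (let i = arr.length - 1; i > 0; i--) {
    let noSwaps = true
    // After each pass the largest remaining value has bubbled past index i
    for (let k = 0; k < i; k++) {
      if (arr[k] > arr[k + 1]) {
        let temp = arr[k]
        arr[k] = arr[k + 1]
        arr[k + 1] = temp
        noSwaps = false
      }
    }
    if (noSwaps) break // A clean pass means the array is already sorted
  }
  return arr
}
```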
This is very similar to Bubble Sort except that, on each pass, we track the lowest remaining value and swap it into place just after any already sorted values. The value it displaces moves to the lowest value's old location. O(n^2) as well.
Selection Sort falls short on almost sorted arrays because we still iterate through every remaining value on each pass, even if the lowest value was found early.
var selectionSort = (arr) => {
const swapHelper = (arr, swapId1, swapId2) => {
([arr[swapId1], arr[swapId2]] = [arr[swapId2], arr[swapId1]])
}
for(let i = 0; i < arr.length; i++) {
let lowest = i
for(let k = i + 1; k < arr.length; k++) {
// Is this still the lowest index?
if(arr[k] < arr[lowest]) {
lowest = k
}
}
// If the lowest index changed, we swap it into position.
if(lowest !== i) swapHelper(arr, i, lowest)
}
return arr
}
Insertion Sort is a blend of Bubble and Selection Sort. We iterate forward through our array, but we reserve a previous portion of the array to sort into.
- We assume position 0 is sorted and begin at position 1.
- For each subsequent value, we take that value, and walk back through our sorted portion to determine when this value is less than the current sorted index value.
- As we walk, we take the current sorted index value and copy it ahead because we know we will need to make room for this value's placement.
- When we find a place, we stop and overwrite the current value. This is safe, because we already copied the previous value to the next index.
This is a little confusing to follow via text and I would recommend writing it out by hand.
Insertion Sort, despite having the same Big O as the others, performs better on randomized lists than the previous two sorting algorithms.
var insertionSort = (arr) => {
// For each element in our array,
// find out where it belongs in the sorted portion of our array
// 0 is already sorted, so we skip to 1.
for(let i = 1; i < arr.length; i++) {
let thisValue = arr[i]
let endOfSortedIndex = i - 1
// We iterate backwards through the sorted part of our array
// Until we finish or we encounter a value that's greater than this one
for(var k = endOfSortedIndex; k >= 0 && arr[k] > thisValue; k--) {
arr[k + 1] = arr[k] // Shift up/make room for our inevitable placement.
// It's okay because we stored thisValue in our outer loop.
}
arr[k + 1] = thisValue // We found out where this belongs, so place it.
}
return arr
}
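Since the walk-through is easier to see on paper, here is a self-contained copy of the sort with the intermediate state after each outer pass noted for a small input:

```javascript
var insertionSort = (arr) => {
  for (let i = 1; i < arr.length; i++) {
    let thisValue = arr[i]
    let k = i - 1
    // Walk backwards, shifting larger sorted values up to make room
    for (; k >= 0 && arr[k] > thisValue; k--) {
      arr[k + 1] = arr[k]
    }
    arr[k + 1] = thisValue // Place the value in its slot
  }
  return arr
}

// Tracing [5, 2, 4, 1] pass by pass:
// i = 1, value 2 -> [2, 5, 4, 1]
// i = 2, value 4 -> [2, 4, 5, 1]
// i = 3, value 1 -> [1, 2, 4, 5]
```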
These algorithms are much faster than the basic sorting algorithms but are more complex. The previous sorting algorithms do not scale well. These sorting algorithms are O(n log n), which is significantly faster than O(n^2).
Merge Sort is a combination of two algorithms: Merging and Sorting. It exploits the fact that arrays sized 0 or 1 element are always sorted. It works by decomposing an array into smaller arrays of 0 or 1 elements, then building up a newly sorted array. This is the Divide and Conquer pattern.
Before we do a full Merge Sort algorithm we can break it down into an even simpler problem; merging two sorted lists. Merge sort will require us, in addition to ordering values, to also merge the contents of both lists.
var mergeSortedArrays = (arr1, arr2) => {
let result = []
let i = 0
let k = 0
while(i < arr1.length && k < arr2.length) {
// Determine who has the lower value.
if(arr2[k] > arr1[i]) {
result.push(arr1[i])
i++ // We move the pointer because this number has been accounted for.
} else {
result.push(arr2[k])
k++ // Same. Move the pointer.
}
// If one list runs out of values before the other, we just concatenate the other.
if (i >= arr1.length) {
let remaining = arr2.slice(k, arr2.length)
result = result.concat(remaining)
}
if (k >= arr2.length) {
let remaining = arr1.slice(i, arr1.length)
result = result.concat(remaining)
}
}
}
return result
}
There was a reason we coded a merge sorted arrays function first; it's actually the bulk of merge sort itself. All merge sort does is divide an array into smaller subarrays until each holds at most one value. Then we merge them using our merge sorted arrays function; at first it's just one single value against another. We take each merged array and pass it up the call stack, merging it with the results from our parallel merge calls.
var mergeSort = (arr) => {
if(arr.length <= 1) return arr // Base Case
let mid = Math.floor(arr.length / 2)
let left = mergeSort(arr.slice(0, mid))
let right = mergeSort(arr.slice(mid, arr.length))
// Now that we have two sorted arrays, we can merge them.
return mergeSortedArrays(left, right)
}
var mergeSortedArrays = (arr1, arr2) => {
let result = []
let i = 0
let k = 0
while(i < arr1.length && k < arr2.length) {
// Determine who has the lower value.
if(arr2[k] > arr1[i]) {
result.push(arr1[i])
i++ // We move the pointer because this number has been accounted for.
} else {
result.push(arr2[k])
k++ // Same. Move the pointer.
}
// If one list runs out of values before the other, we just concatenate the other.
if (i >= arr1.length) {
let remaining = arr2.slice(k, arr2.length)
result = result.concat(remaining)
}
if (k >= arr2.length) {
let remaining = arr1.slice(i, arr1.length)
result = result.concat(remaining)
}
}
}
return result
}
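For experimentation, the full algorithm restated as one compact runnable snippet, with the leftover-concatenation moved after the loop (a common simplification of the same logic):

```javascript
var mergeSortedArrays = (arr1, arr2) => {
  let result = []
  let i = 0
  let k = 0
  while (i < arr1.length && k < arr2.length) {
    if (arr2[k] > arr1[i]) {
      result.push(arr1[i])
      i++
    } else {
      result.push(arr2[k])
      k++
    }
  }
  // Whichever list ran out first contributes nothing; concat the rest of the other
  return result.concat(arr1.slice(i)).concat(arr2.slice(k))
}

var mergeSort = (arr) => {
  if (arr.length <= 1) return arr // Base case: 0 or 1 elements are sorted
  let mid = Math.floor(arr.length / 2)
  let left = mergeSort(arr.slice(0, mid))
  let right = mergeSort(arr.slice(mid))
  return mergeSortedArrays(left, right)
}
```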
Quick sort is one of the most efficient sorting algorithms, but it is also one of the more complex to understand. It is O(n log n) on average, though consistently poor pivot choices can degrade it to O(n^2).
- It assumes that arrays of length 0 or 1 are sorted.
- It works by selecting one element, called the pivot, and finding the index where the pivot should end up in the sorted array.
- Once the pivot is positioned appropriately, quick sort can be applied on either side of the pivot.
Pivot selection does affect run time. Picking the median value is the most efficient, but we do not always know where this value is. Picking the middle, first, or last element serves as a simpler alternative.
For simplicity, the following examples will use the first element, though there are run time consequences.
In order to implement quick sort, it's useful to first implement a function responsible for arranging elements in an array on either side of a pivot. This is similar to merge sort's merge sorted arrays helper.
- Given an array, this helper should designate an element as the pivot.
- It should then rearrange elements in the array so that all values less than the pivot are moved to the left of the pivot.
- All values greater than (and equal to) the pivot are moved to the right of the pivot.
- The order on either side does not matter!
The helper function should do this in place; it should not create a new array. This means we need to track indices and swap as needed.
When complete, we return the index of the pivot.
let arr = [5, 2, 1, 8, 4, 7, 6, 3]
pivotHelper(arr) // Returns pivot index 4
We may get any of these array mutations back where p is our pivot in the 4th position:
p
[2, 1, 4, 3, 5, 8, 7, 6]
[1, 2, 3, 4, 5, 8, 6, 7]
[4, 1, 3, 2, 5, 7, 8, 6]
But it doesn't matter, because the pivot was respected; all elements less than the 4th position are less than the 4th value of 5. All elements greater are greater than (or equal to) the 4th value of 5.
As we use the pivot helper, the pivot index will change, and eventually all values will be in their respective positions as every value gets its chance to be the pivot.
The function will accept three arguments:
- An array
- A start index (default = 0)
- An end index (default = array length - 1)
We treat the first element as our pivot. We also store this pivot index in a variable and we use this to determine where the pivot (and value) should end up.
Then, we loop through the array from beginning to end:
If the pivot is greater than our current element, increment the pivot index variable and swap the current element with the element at the pivot index.
Once finished, we swap the starting element, the original pivot, with the new pivot index, where it was meant to end up, because all values before and after are in their respective lesser than and greater than positions.
var pivotHelper = (arr, start = 0, end = arr.length - 1) => {
let pivot = arr[start] // The value we compare all values to
let swapIndex = start // Where we swap our pivot at the end
// Here, we skip the pivot and into the unknown values.
for(let i = start + 1; i <= end; i++) {
if(pivot > arr[i]) {
// We have found something that is less than our pivot
// We need to change our final pivot swap location to make room for this data
swapIndex++
// The current value gets moved far left
// We increment our swap index so we know where the partition between
// greater than and lesser than is.
arr = swap(arr, i, swapIndex)
}
}
arr = swap(arr, start, swapIndex) // Place our pivot
// console.log(arr) // DEBUG
return swapIndex
// Returns the pivot index
}
var swap = (arr, swapId1, swapId2) => {
([arr[swapId1], arr[swapId2]] = [arr[swapId2], arr[swapId1]])
return arr
}
Now that we have a pivot helper, we can implement the remainder of quick sort.
Our primary function has a few steps:
- We call the pivot helper on the array
- When the helper returns the updated pivot index, we recursively quick sort the subarrays left and right of that index.
var quickSort = (arr, left = 0, right = arr.length - 1) => {
if(left < right) {
let pivotIndex = pivotHelper(arr, left, right) // Define a partition index
quickSort(arr, left, pivotIndex - 1)
quickSort(arr, pivotIndex + 1, right)
}
return arr
}
var pivotHelper = (arr, start = 0, end = arr.length - 1) => {
let pivot = arr[start] // The value we compare all values to
let swapIndex = start // Where we swap our pivot at the end
// Here, we skip the pivot and into the unknown values.
for(let i = start + 1; i <= end; i++) {
if(pivot > arr[i]) {
// We have found something that is less than our pivot
// We need to change our final pivot swap location to make room for this data
swapIndex++
// The current value gets moved far left
// We increment our swap index so we know where the partition between
// greater than and lesser than is.
arr = swap(arr, i, swapIndex)
}
}
arr = swap(arr, start, swapIndex) // Place our pivot
// console.log(arr) // DEBUG
return swapIndex
// Returns the pivot index
}
var swap = (arr, swapId1, swapId2) => {
([arr[swapId1], arr[swapId2]] = [arr[swapId2], arr[swapId1]])
return arr
}
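Putting the pieces together as one compact, runnable snippet for experimentation (note the partition loop is bounded by end, which the recursive calls rely on):

```javascript
var swap = (arr, a, b) => {
  ([arr[a], arr[b]] = [arr[b], arr[a]])
}

var pivotHelper = (arr, start = 0, end = arr.length - 1) => {
  let pivot = arr[start] // The value we compare all values to
  let swapIndex = start
  for (let i = start + 1; i <= end; i++) {
    if (pivot > arr[i]) {
      swapIndex++ // Grow the "less than pivot" region
      swap(arr, i, swapIndex)
    }
  }
  swap(arr, start, swapIndex) // Place the pivot at the partition boundary
  return swapIndex
}

var quickSort = (arr, left = 0, right = arr.length - 1) => {
  if (left < right) {
    let pivotIndex = pivotHelper(arr, left, right)
    quickSort(arr, left, pivotIndex - 1)
    quickSort(arr, pivotIndex + 1, right)
  }
  return arr
}
```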
Radix sort is engineered to work on lists of integer values and does not make comparisons between elements. Instead, it exploits the fact that information about the size of the number is encoded in the number of digits. More digits mean a bigger number.
- We take a list of numbers and create different buckets for these numbers. (0 - 9)
- Then, for each digit, starting from the right most digit, we examine it and drop it into a bucket. If a number does not have a digit, we treat it as a zero.
- For each bucket, we reform them into an array of numbers and repeat.
- As we progress, our list gets sorted.
The pseudocode is as follows:
- Define a function that accepts a list of numbers
- Figure out how many digits the largest number has
- Loop from k = 0 up to this largest number of digits
- For each iteration of the loop, create buckets for each digit (0 to 9) and place each number in the corresponding bucket based on its kth digit
- In order (0 bucket through the 9th bucket), replace our existing array with values from each bucket
- Return the sorted list
var radixSort = (numbers) => {
let maxDigitCount = mostDigits(numbers)
for(let k = 0; k < maxDigitCount; k++) {
let digitBuckets = Array.from({length: 10}, () => []) // Creates an array of buckets 0-9
for(let i = 0; i < numbers.length; i++) {
let digit = getDigit(numbers[i], k)
digitBuckets[digit].push(numbers[i])
}
// Rebuild the list from the buckets, in order, before the next pass
numbers = [].concat(...digitBuckets)
}
return numbers
}
var digitCount = (num) => {
if(num === 0) return 1
return Math.floor(Math.log10(Math.abs(num))) + 1
}
var mostDigits = (numbers) => {
let maxDigits = 0
for(let i = 0; i < numbers.length; i++) {
maxDigits = Math.max(maxDigits, digitCount(numbers[i]))
}
return maxDigits
}
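Note that radixSort calls a getDigit helper that is not defined in this section. A minimal version that extracts the kth digit counting from the right:

```javascript
// getDigit(7323, 2) -> 3 (the hundreds place)
var getDigit = (num, place) => {
  return Math.floor(Math.abs(num) / Math.pow(10, place)) % 10
}
```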
Data structures are collections of values, the relationships among them, and the functions or operations that can be applied to the data.
Example Data Structures include:
- Arrays: Basic Arrays, Lists/Vectors
- Binary Search Trees: Where we recursively navigate by considering our options
- Binary Heaps: A tree structure where, in a max heap, each parent is greater than its children
- Queues: Like the line to get groceries during COVID-19
- Stacks: Like a deck of cards
- Linked Lists: The original Blockchain
- Graphs: Directed and Undirected
- Hash Tables: Where keys refer to values
Each data structure has a specific use. Some come with more functionality than others, but depending on the challenge, we may not need or want that functionality. Some are very specialized, others are more general purpose. Data Structures are tools in your programming toolbox and it is about knowing which is the most appropriate for the task at hand.
The 'this' keyword always refers to the object created from that class. Therefore, if I have a Class:
class Student {
constructor(firstName, lastName) {
this.firstName = firstName
this.lastName = lastName
this.classesPassed = 0
}
}
The class has a single constructor that accepts two arguments and assigns those arguments to 'this' object's properties. See my advanced JavaScript notes for more, but for now, just know that this will be a consistent pattern with all of the data structures herein.
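For instance, instantiating the class binds 'this' to the new object (the names here are illustrative):

```javascript
class Student {
  constructor(firstName, lastName) {
    this.firstName = firstName
    this.lastName = lastName
    this.classesPassed = 0
  }
}

// 'this' inside the constructor refers to the object being created
let student = new Student("Ada", "Lovelace")
// student.firstName -> "Ada", student.classesPassed -> 0
```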
Consider an undirected graph: It is a series of nodes with paths that link them together. Unlike a directed graph, you are not required to go any particular direction.
Now consider how you get to work: You start at home, get in your car, and follow a path to an intersection. From there, you make a decision to go the route that leads to your workplace. These are just nodes and paths on a graph. So, when you use a GPS, you are using graph data to solve your problem.
If you have one way streets where you live, you may live in a directed graph, as you cannot legally go back the way you came. An obstacle course with multiple paths may be a better way to describe a directed graph. The aim is to win, but you may be able to choose what obstacles you overcome to get there. Perhaps life is a directed graph.
Arrays are not made for insertions at the beginning. Linked Lists, a different kind of list, are. A head node points to the first node, which points to the second node, and so on. We reach the end when the final node does not point to anything. So long as we keep a reference to the tail node, it is easy to make insertions at the end. Making an insertion at the beginning is as easy as creating a new head node and pointing it at the old one.
In fact, linked lists are how we can construct one kind of tree, except that each node can point to multiple child nodes.
If you are web scraping a URL you may need a tree to store your data; all the various paths that the website takes you. Imagine each node, similar to the Linked List example, except it not only points to a destination, but tells you more about itself in the process.
Need to write a task scheduler with a priority queue for an app? A binary heap may be what you need. As certain tasks are created, they may need to be sorted into their most appropriate position so that they are acknowledged as quickly as possible.
Every data structure has real world applications, and the sooner you start looking at the world around you with the lens of a computer scientist, the sooner you will see these patterns everywhere.
Linked lists consist of nodes that start with a head node. Each node knows what data it contains and where the next node in the list is. Doubly linked lists also know where the previous node lives. In the real world, we can relate this to a 5 story building without an elevator. In order to get to the 5th floor, we need to walk up through floors 1-4; we cannot simply skip to floor 5.
The linked list data structure contains, at least, a head property. It may also contain a tail and a length property. The length property can be used to make unsorted insertions at a specific location without needing to do comparisons.
Each node in a linked list contains, at the very least, a pointer to the next item in the list. It likely also contains some unique data.
- Insertion: O(1)
- Removal: It depends. O(1) or O(n)
- Searching: O(n)
- Access: O(n)
Linked lists:
- Do not have indices
- Are connected via nodes with pointers; at the very least forwards, sometimes also backwards
- Random access is not allowed; we start at the beginning, or end, and move linearly
Arrays:
- Are indexed in order
- Insertion and deletion can be expensive, as we may have to shift the rest of the array
- Can quickly be accessed at a specific index
In sum, we use Linked Lists when we are doing a lot of insertions and do not require random access. We use arrays when we are not doing a lot of insertions and require specific data access.
Here, we have an ordered Linked List with n values. Again, the first node is called the head node. The final, tail node, points to nothing. This is how we know we have reached the end.
Node(23) -> Node(32) -> Node(56) -> Node(90) -> ... -> Node(n) --> NULL
If we want to insert a node with a value of 45, we would do something like this:
For each node, look ahead and see if the next node's value is greater than our value.
Here, 'v' indicates where our current pointer is. We look at this node's next node and read the value.
If the next node is greater than the node we want to insert, we tell the node we are pointing to to point at the node we are inserting. We tell the node we are inserting to point at the old next node.
Here is our new node:
Node(45) -> NULL
Next, we begin searching for the appropriate place in our linked list for this node.
v No!
Node(23) -> Node(32) -> Node(56) -> Node(90) -> ... -> Node(n) --> NULL
v No!
Node(23) -> Node(32) -> Node(56) -> Node(90) -> ... -> Node(n) --> NULL
v Yes!
Node(23) -> Node(32) -> Node(56) -> Node(90) -> ... -> Node(n) --> NULL
We have found a next node that is greater than the node we want to insert, so we point the current node to the new, inserted node, and point the inserted node at the old next node.
_ -> Node(45) -> _
Node(23) -> Node(32) Node(56) -> Node(90) -> ... -> Node(n) --> NULL
The list has been updated.
Node(23) -> Node(32) -> Node(45) -> Node(56) -> Node(90) -> ... -> Node(n) --> NULL
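The walkthrough above can be sketched as a standalone function; this Node is a minimal stand-in for the node class defined later:

```javascript
class Node {
  constructor(data) {
    this.data = data
    this.next = null
  }
}

// Insert a value into an ascending sorted list; returns the (possibly new) head
var sortedInsert = (head, value) => {
  let node = new Node(value)
  if (!head || head.data >= value) {
    node.next = head // The new node becomes the head
    return node
  }
  let current = head
  // Look ahead: stop when the next node's value is greater than ours
  while (current.next && current.next.data < value) {
    current = current.next
  }
  node.next = current.next
  current.next = node
  return head
}

let head = new Node(23)
head.next = new Node(32)
head.next.next = new Node(56)
head = sortedInsert(head, 45)
// The list is now 23 -> 32 -> 45 -> 56
```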
class LinkedList {
constructor() {
this.head = null
this.tail = null
this.length = 0
}
push(value) {
/// Pushes a node on to the end of the linked list
let node = new Node(value)
if(this.head == null && this.tail == null) {
this.head = node
this.tail = node
} else {
this.tail.next = node
this.tail = node
}
this.length++
return this
}
pop() {
/// Remove item from the end of the list
if(!this.head) return undefined
let current = this.head
let newTail = current
// Make every new node the new tail until we hit the end
while(current.next) { // !== null
newTail = current
current = current.next
}
this.tail = newTail
this.tail.next = null
this.length--
if(this.length == 0) { // Empty list! Reset everything
this.head = null
this.tail = null
}
return this
}
shift() {
/// Remove an item from the beginning of the list
if(!this.head) return undefined
this.head = this.head.next
this.length--
if(this.length === 0) this.tail = null // Empty list! Reset the tail too
return this
}
unshift(value) {
/// Make new node the new head
let node = new Node(value)
if(!this.head) {
this.head = node
this.tail = this.head
} else {
node.next = this.head
this.head = node
}
this.length++
return this
}
traverse(node = this.head) {
/// Iterates through the array and prints details
if(node == null) return
node.getDetails()
this.traverse(node.next)
}
get(targetIndex) {
/// Gets a node from a specific index
if(!this.head || targetIndex < 0 || targetIndex >= this.length) {
return undefined
}
let i = 0
let currentNode = this.head
while(i < targetIndex) {
currentNode = currentNode.next
i++
}
return currentNode
}
set(targetIndex, value) {
/// Changes the value of a specific index
let targetNode = this.get(targetIndex)
if(targetNode) {
targetNode.data = value
return true
} else {
return false
}
}
insert(targetIndex, value) {
/// Inserts a new node at a specific position
if(targetIndex < 0 || targetIndex > this.length) return undefined // Out of bounds
if(targetIndex === this.length) return this.push(value) // Add to end
if(targetIndex === 0) return this.unshift(value) // Add to beginning
let node = new Node(value)
let previousNode = this.get(targetIndex - 1)
node.next = previousNode.next
previousNode.next = node
this.length++
return this
}
remove(targetIndex) {
/// Removes the node at a specific position
if(targetIndex < 0 || targetIndex >= this.length) return undefined // Out of bounds
if(targetIndex === this.length - 1) return this.pop() // Remove from end
if(targetIndex === 0) return this.shift() // Remove from beginning
let toBeRemovedNode = this.get(targetIndex)
let previousNode = this.get(targetIndex - 1)
previousNode.next = toBeRemovedNode.next
this.length--
return this
}
reverse() {
/// Reverse our linked list
let node = this.head
this.head = this.tail
this.tail = node
let prev = null
let next = null
for(let i = 0; i < this.length; i++) {
next = node.next // To the right of this node
node.next = prev // To the left of this node
// Shift indices for next iteration
prev = node
node = next
}
return this
}
}
class Node {
constructor(data) {
this.data = data
this.next = null
}
getDetails() {
console.log(`Node: ${this.data}`)
}
}
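A trimmed, runnable sketch of the class above (push, get, and reverse only), useful for experimenting with the traversal flow:

```javascript
class Node {
  constructor(data) {
    this.data = data
    this.next = null
  }
}

class LinkedList {
  constructor() {
    this.head = null
    this.tail = null
    this.length = 0
  }
  push(value) {
    let node = new Node(value)
    if (!this.head) {
      this.head = node
      this.tail = node
    } else {
      this.tail.next = node
      this.tail = node
    }
    this.length++
    return this
  }
  get(targetIndex) {
    if (targetIndex < 0 || targetIndex >= this.length) return undefined
    let currentNode = this.head
    for (let i = 0; i < targetIndex; i++) currentNode = currentNode.next
    return currentNode
  }
  reverse() {
    let node = this.head
    this.head = this.tail
    this.tail = node
    let prev = null
    while (node) {
      let next = node.next
      node.next = prev // Point this node backwards
      prev = node
      node = next
    }
    return this
  }
}

let list = new LinkedList().push(1).push(2).push(3).reverse()
// The list is now 3 -> 2 -> 1
```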
A doubly linked list is a linked list whose nodes point both forwards and backwards. This way, we can start traversing at either end of the list. The main difference is that, in addition to a next pointer, we now manage a previous pointer on every node.
Also, the get node method has some additional logic that determines whether it is better to start at the beginning or the end of the list to reach the target node. This roughly halves the traversal, though getting a node is still O(n).
class LinkedListD {
constructor() {
this.head = null
this.tail = null
this.length = 0
}
push(value) {
let node = new Node(value)
if(!(this.head)) {
this.head = node
this.tail = node
} else {
this.tail.next = node
node.prev = this.tail
this.tail = node
}
this.length++
}
pop() {
if(this.length <= 0) return false
let newTail = this.tail.prev
if(newTail) {
newTail.next = null
} else {
this.head = null // We removed the only node
}
this.tail = newTail
this.length--
return true
}
shift() {
/// Remove first item
if(this.length <= 0) return false
let newHead = this.head.next
if(newHead) {
newHead.prev = null
} else {
this.tail = null // We removed the only node
}
this.head = newHead
this.length--
return true
}
unshift(value) {
/// Add item to beginning
let node = new Node(value)
if(!(this.head)) {
this.tail = node // Empty list; this node is also the tail
} else {
this.head.prev = node
node.next = this.head
}
this.head = node
this.length++
}
traverse() {
/// Prints the entire linked list
if(!(this.head)) return null
let printedList = "o"
let thisNode = this.head
if(this.head.next == null) {
printedList += this.head.print()
} else {
while(thisNode) {
printedList += thisNode.print()
thisNode = thisNode.next
}
}
printedList += "o"
return printedList
}
get(targetIndex) {
if(!(this.head) || !(this.tail)) return null
if(targetIndex < 0 || targetIndex > this.length - 1) return null
if(targetIndex === 0) return this.head
if(targetIndex === this.length - 1) return this.tail
let currentNode
let i
if(targetIndex < Math.floor(this.length / 2)) {
currentNode = this.head
i = 0
while(i < targetIndex) {
currentNode = currentNode.next
i++
}
} else {
currentNode = this.tail
i = this.length - 1
while(i > targetIndex) {
currentNode = currentNode.prev
i--
}
}
return currentNode
}
set(index, value) {
let thisNode = this.get(index)
if(!thisNode) return null
thisNode.value = value
return thisNode
}
insert(index, value) {
if(index < 0 || index > this.length) return null
if(index === 0) {
this.unshift(value)
return this.head
}
if(index === this.length) {
this.push(value)
return this.tail
}
let thisNode = new Node(value)
let nextNode = this.get(index)
let prevNode = nextNode.prev
thisNode.prev = prevNode
thisNode.next = nextNode
prevNode.next = thisNode
nextNode.prev = thisNode
this.length++
return thisNode
}
remove(index) {
if(!(this.head) || !(this.tail)) return null
if(index < 0 || index > this.length - 1) return null
if(index === 0) return this.shift()
if(index === this.length - 1) return this.pop()
let toBeDeletedNode = this.get(index)
let nextNode = toBeDeletedNode.next
let prevNode = toBeDeletedNode.prev
prevNode.next = nextNode
nextNode.prev = prevNode
this.length--
return true
}
reverse() {
/// Reverse the order of the linked list.
// Swap the head and the tail
let oldHead = this.head
this.head = this.tail
this.tail = oldHead
let currentNode = this.head
for(let i = 0; i < this.length; i++) {
// Swap this node's next and previous pointers
let oldNext = currentNode.next
currentNode.next = currentNode.prev
currentNode.prev = oldNext
currentNode = currentNode.next // Follow the old prev pointer, now stored in next
}
return this
}
}
class Node {
constructor(value) {
this.value = value
this.prev = null
this.next = null
}
print() {
return `<[${this.value}]>`
}
}
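A trimmed, runnable sketch showing the two-ended get() logic in isolation:

```javascript
class Node {
  constructor(value) {
    this.value = value
    this.prev = null
    this.next = null
  }
}

class LinkedListD {
  constructor() {
    this.head = null
    this.tail = null
    this.length = 0
  }
  push(value) {
    let node = new Node(value)
    if (!this.head) {
      this.head = node
      this.tail = node
    } else {
      this.tail.next = node
      node.prev = this.tail
      this.tail = node
    }
    this.length++
    return this
  }
  get(targetIndex) {
    if (targetIndex < 0 || targetIndex >= this.length) return null
    let currentNode, i
    if (targetIndex < Math.floor(this.length / 2)) {
      // Closer to the front: walk forward from the head
      currentNode = this.head
      for (i = 0; i < targetIndex; i++) currentNode = currentNode.next
    } else {
      // Closer to the back: walk backward from the tail
      currentNode = this.tail
      for (i = this.length - 1; i > targetIndex; i--) currentNode = currentNode.prev
    }
    return currentNode
  }
}

let dlist = new LinkedListD().push(10).push(20).push(30).push(40)
```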
Stacks are data structures that abide by a last in, first out paradigm (LIFO). They only have functions that allow for this limited interaction. We can implement a stack using a linked list, a basic array, or a language library List structure (called a 'vector' in C++).
- Insertion: O(1)
- Removal: O(1)
- Searching: O(n), but not really the point
- Access: O(n), but not really the point
Imagine you are scooping ice cream on to a cone. We place the scoops one by one, where the first scoop is at the bottom and the final one is at the top.
_____
\===/
\-/
___
(-1-)
\===/
\-/
___
(-2-)
(-1-)
\===/
\-/
___
(-3-)
(-2-)
(-1-)
\===/
\-/
Now, how would you eat this ice cream? Chances are, you would not eat the first one, because it would cause the other scoops to fall off. Likewise, eating the second scoop would cause the third to fall off. Instead, we eat them in the reverse order they were placed.
___
(-3-)
(-2-)
(-1-)
\===/
\-/
___
(-2-)
(-1-)
\===/
\-/
___
(-1-)
\===/
\-/
_____
\===/
\-/
This is how we interact with stacks. We can also imagine a deck of cards or a stack of heavy books. As we pile it up, the most recent addition gets removed first.
You may recall the word stack from the recursion section. The run time call stack is not named coincidentally; the run time environment executes the most recently requested command first before acknowledging previous commands.
As changes are made, we push those changes on to a stack of States. If we want to undo a change, we pop off the most recent State and revert to the previous one. That popped recent State may be added to a Redo Stack that records, in order, the most recently undone actions. This is an example of two stacks working in tandem.
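A minimal sketch of that tandem-stack pattern, using plain arrays as stacks (names are illustrative):

```javascript
let undoStack = []
let redoStack = []

// Record a new state; a fresh change invalidates the redo history
const applyChange = (state) => {
  undoStack.push(state)
  redoStack = []
}

// Pop the most recent state and remember it for redo
const undo = () => {
  const state = undoStack.pop()
  if (state !== undefined) redoStack.push(state)
  return state
}

// Move the most recently undone state back on to the undo stack
const redo = () => {
  const state = redoStack.pop()
  if (state !== undefined) undoStack.push(state)
  return state
}

applyChange("a")
applyChange("b")
```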
As we visit different pages, we push the most recent page on to a stack. If we want to go back to a previous page, similar to the undo/redo functionality, we press back, this page is popped off the stack, and we load the previous page. You may infer what 'Forward' does in this case, too!
Similar to Linked Lists, we interact with a stack through a limited interface; here, we are only interested in the top of the stack. If we review our linked list implementation, we can reuse what we have, just a little differently.
Consider push, pop, shift, and unshift. Both push and pop require us to work with the tail end of our linked list. If it is not a doubly linked list, we have to iterate all the way through to the end. But what about shift and unshift? We always start at the head, and we are either adding a node or removing one. We know where the head leads, and we do not need a previous pointer. So, we can use a singly linked list implementation and re-imagine what shift and unshift mean.
- shift() reversed = pop()
- unshift() reversed = push()
The beginning is actually our end, so we just think of the end of the list as the head and the beginning of the list as our tail. If this is challenging to imagine, revisit the single Linked List implementation and remove the push and pop functionality, as this may make more sense.
Below, I have implemented a Linked List version of a Stack.
JavaScript arrays have built in stack functionality. We do not need to create our own. However, we must adhere to the LIFO Stack property in order for it to be a legitimate Stack. Therefore, we must choose to exclusively use push(), pop() or instead use unshift(), shift(). We cannot use both.
// JavaScript Basic Array Stack Functionality
let array1 = []
array1.push("Oldest") // ["Oldest"]
array1.push("Newest") // ["Oldest", "Newest"]
array1.pop() // ["Oldest"]
let array2 = []
array2.unshift("Oldest") // ["Oldest"]
array2.unshift("Newest") // ["Newest", "Oldest"]
array2.shift() // ["Oldest"]
// Custom Stack Implementation
class Stack {
constructor() {
this.head = null
this.tail = null
this.length = 0
}
pop() { // Works like Shift
if(!this.head) return false
let oldHead = this.head
this.head = oldHead.next
this.length--
if(this.length === 0) this.tail = null // Emptied the stack; reset the tail
return oldHead
}
push(value) { // Works like Unshift
let newHead = new Node(value)
if(!this.head || !this.tail) {
this.head = newHead
this.tail = newHead
} else {
newHead.next = this.head
this.head = newHead
}
this.length++
return true
}
}
class Node {
constructor(value) {
this.value = value
this.prev = null
this.next = null
}
print() {
return `<[${this.value}]>`
}
}
Queues, opposite Stacks, prioritize a first in, first out (FIFO) pattern.
- Insertion: O(1)
- Removal: O(1)
- Searching: O(n), but not really the point
- Access: O(n), but not really the point
When you are in line at the grocery store waiting for your turn to check out, the person at the front of the line gets checked out first, then the person after, and so on, until it is finally your turn. A queue in code works just like a queue, or line, in real life.
<- [1] <- [2] <- [3]
v
<- [1] <- [2] <- [3]
v
<- [2] <- [3]
v
<- [3] <- [4] <- [5]
Just as the queue seems to be finishing, two more items are added.
Queues are used to prioritize tasks, download or upload resources, print pages in order, and more.
JavaScript arrays have built in queue functionality. We do not need to create our own. However, we must adhere to the FIFO Queue property in order for it to be a legitimate Queue. Therefore, we must choose to exclusively use shift(), push() or instead use pop(), unshift(). We cannot use both.
// JavaScript Basic Array Queue Functionality
let array1 = []
array1.push("Oldest") // ["Oldest"]
array1.push("Newest") // ["Oldest", "Newest"]
array1.shift() // ["Newest"]
let array2 = []
array2.unshift("Oldest") // ["Oldest"]
array2.unshift("Newest") // ["Newest", "Oldest"]
array2.pop() // ["Newest"]
// Custom Queue Implementation
class Queue {
constructor() {
this.head = null
this.tail = null
this.length = 0
}
push(value) {
let node = new Node(value)
if(!(this.head)) {
this.head = node
this.tail = node
} else {
this.tail.next = node
node.prev = this.tail
this.tail = node
}
this.length++
}
shift() {
/// Remove first item
if(!this.head) return undefined
let node = this.head
let newHead = this.head.next
if(newHead) {
newHead.prev = null
} else {
this.tail = null // Emptied the queue; reset the tail
}
this.head = newHead
this.length--
return node
}
}
class Node {
constructor(value) {
this.value = value
this.next = null
this.prev = null
}
}
Trees feature a root node from which zero or more child nodes descend. In a binary tree, each node points to up to two children, even if one or both of those children are null.
We use trees to navigate file systems, the Document Object Model (DOM), text auto complete, network routing, artificial intelligence, and more.
Parents can have multiple children. Binary trees, with two children each, are the most common.
These trees are ordered so that you can make decisions about where to search for specific data based on each node's data. The left child is usually lesser while the right is greater. If we are looking for some data, we can determine what direction to proceed in order to find that data.
Balanced trees have a roughly even height across all branches. Unbalanced trees may be left or right heavy.
AVL trees are self balancing. The heights of any node's branches never differ by more than 1 because, on insert, the tree rebalances itself by rotating nodes.
Inserting and searching trees usually occurs in O(log n) time.
However, there are many different kinds of trees, and each has its own shape properties. If the tree does not balance itself, we may end up with an unbalanced tree where the left children are minimal and the right are expansive. In that case, searching degrades to O(n) time because we are not making full use of the tree's organizational properties. One may think they have built a tree, but if it does not meet those organizational expectations, it is easy to create what amounts to just another linked list, e.g. if every node only has one child despite being able to have two.
This is a recursive insertion function. We start by passing in the value and making the default node value the root.
- If there is no root, the root becomes this new node.
- If this value is greater than the parent's value, we recursively call insert on the right child. If less, we use the left child. However, if the desired child does not exist, we insert this node as that child.
This is a recursive search function. We start by passing in the value and making the default node value the root. It is very similar to insert() but we keep searching until the current node matches our value or we have run out of nodes. We report the result of that match.
class Node {
constructor(value) {
this.value = value
this.left = null
this.right = null
}
print() {
return `${this.left} <- [${this.value}] -> ${this.right}`
}
}
class BinarySearchTree {
constructor() {
this.root = null
this.height = null
}
insert(value, node = this.root) {
if(this.root === null) {
this.root = new Node(value)
return this
}
if(value >= node.value) {
if(node.right) { return this.insert(value, node.right) }
else {
node.right = new Node(value)
return this
}
} else {
if(node.left) { return this.insert(value, node.left) }
else {
node.left = new Node(value)
return this
}
}
}
find(value, node = this.root) {
if(!node) return false
if(node.value === value) {
console.dir(node)
return true
}
if(value >= node.value) return this.find(value, node.right)
else return this.find(value, node.left)
}
}
We can either traverse a tree using breadth first search or depth first search.
Note: These examples do not stop at a specific value. Rather, we iterate through the entire tree in order to demonstrate how these different searches traverse trees differently. To modify these for true search functionality, we would pass a target value, and return upon finding that value.
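As the note suggests, converting a traversal into a true search only requires a target check. Here is a minimal sketch of a BFS that stops early at a target; the `node` helper, the tree shape, and `bfsFind` are all hypothetical names used purely for illustration:

```javascript
// Hypothetical helper: build a tree node from plain object literals.
const node = (value, left = null, right = null) => ({ value, left, right })

//        4
//      /   \
//     2     6
//    / \   / \
//   1   3 5   7
const root = node(4, node(2, node(1), node(3)), node(6, node(5), node(7)))

const bfsFind = (start, target) => {
  const queue = [start]
  while (queue.length) {
    const current = queue.shift()                // Dequeue
    if (current.value === target) return current // Found it: stop early
    if (current.left) queue.push(current.left)
    if (current.right) queue.push(current.right)
  }
  return null // Exhausted the tree without a match
}

console.log(bfsFind(root, 5).value) // 5
console.log(bfsFind(root, 9))       // null
```

The same early-return idea applies to the recursive DFS variants below: propagate the match back up the call stack instead of pushing every node to an output.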
It depends on the scenario.
BFS can have a lot more space complexity because of the queue. On large trees, the queue may grow to hold an entire level, which near the bottom of a full binary tree is on the order of n nodes, so our space complexity is O(n). Time complexity is O(n) either way, since we touch each node once.
On depth first, we are not storing nodes across the tree; we only consider one branch at a time, so space is proportional to the tree's height. However, on a very tall, narrow tree, the DFS call stack could take up more space than a BFS queue would.
In-Order is very common for BSTs because it visits the nodes in ascending, sorted order.
Pre-Order is useful to export a tree structure so that it can be reconstructed or copied.
When we BFS traverse, we touch each node in order, per level, before going to the next level. Imagine a dungeon crawling video game where you delve deeper into the dungeon as you progress. An efficient hero would not go deeper into the dungeon without exploring all of the resources on the current level. This is how BFS works.
- Create a queue to store nodes. Create an output variable; an array of nodes, a string, etc.
- Enqueue root on to the queue.
- For each node in the queue:
- Dequeue this node
- Add this node to our output
- If this node has a left child, queue it. If this node has a right child, queue it.
As you can see, the loop will continue until we stop adding nodes to the queue. This also assures that we access each node on a level before accessing the next level. Imagine this tree:
1
2 3
4 5 6 7
- We acknowledge 1, the root, first
- 1 has two children: 2 and 3
- We enqueue 2 and 3
- We remove 1 from the queue and add it to the output
- Acknowledge 2: Add 4 and 5 to the queue
- Acknowledge 3, add 6 and 7 to the queue.
- 4, 5, 6, 7 have no children, so we add them to the output, and we finish
breadthFirstSearch() {
let queue = []
let output = []
let currentNode = this.root // Always start with the root
queue.push(currentNode)
while(queue.length) {
currentNode = queue.shift() // Dequeue
if(currentNode.left) queue.push(currentNode.left)
if(currentNode.right) queue.push(currentNode.right)
output.push(currentNode) // Add to BFS ordered output
}
return output
}
In DFS, we descend to the deepest, left most part of the tree before coming back up and accessing the next, deepest most level.
Imagine Snow White, of Snow White and the Seven Dwarfs, having to navigate the dwarf caves without a torch. Snow White places her right hand on the wall and follows it all the way until she gets back to the entrance. This is an example of DFS.
The only difference between all three strategies is the point at which we push the current/parent node onto our output:
- Pre-Order: First
- Post-Order: Last
- In Order: In between
In the function examples, this line has been aggressively commented so that you can appreciate the difference in position. This one rearrangement determines our output.
Unlike BFS, these are easiest to implement recursively.
- Create an output variable and make the starting node the root
- Create a helper function that accepts a node
Helper:
- Push the value of this node to the output
- If the node has a left child, call the helper on that node
- If the node has a right child, call the helper on that node
1
2 3
4 5 6 7
Example Iteration:
- Start with 1. Add to output. 2 left, 3 right. Pursue left first
- Move to 2
- Add to output. We see 4 and 5. Go with the left first
- Move to 4
- Add to output. No children, so return
- We're back to 2, so now we proceed with that next call, which was to the right child/5
- No children. Add to output. Return
- Back to 1, so now we proceed with 1's right child of 3
- Add 3, then go left to 6
- Add 6, no children
- Back up to 3, then right to 7
- Add 7 and return out. All done!
dfsPreOrder() {
let output = []
const traverse = (currentNode) => {
output.push(currentNode) // PIVOTAL ORDER CALL
if(currentNode.left) traverse(currentNode.left)
if(currentNode.right) traverse(currentNode.right)
}
traverse(this.root)
return output
}
This is the opposite of pre-order. We just add the current node after visiting all children.
dfsPostOrder() {
let output = []
const traverse = (currentNode) => {
if(currentNode.left) traverse(currentNode.left)
if(currentNode.right) traverse(currentNode.right)
output.push(currentNode) // PIVOTAL ORDER CALL
}
traverse(this.root)
return output
}
Again, In Order is just a slightly modified version of our previous depth first searches. We visit the left children first, then push the parent, then visit the right children.
dfsInOrder() {
let output = []
const traverse = (currentNode) => {
if(currentNode.left) traverse(currentNode.left)
output.push(currentNode) // PIVOTAL ORDER CALL
if(currentNode.right) traverse(currentNode.right)
}
traverse(this.root)
return output
}
The above functions have a lot in common except for where we push the current node on to the output. Below is an example of how we can make one DFS function a little more DRY (Don't Repeat Yourself). By using a parameter, we can ask the function to perform a specific DFS strategy:
dfsAdaptable(order = "pre") {
let output = []
const traverse = (currentNode) => {
if(order == "pre") output.push(currentNode)
if(currentNode.left) traverse(currentNode.left)
if(order == "in") output.push(currentNode)
if(currentNode.right) traverse(currentNode.right)
if(order == "post") output.push(currentNode)
}
traverse(this.root)
return output
}
Heaps are another kind of binary tree. However, they do not follow binary search tree ordering rules; instead, they have their own properties and rules that make them unique.
Binary heaps are used to implement Priority Queues which are very commonly used data structures. They are also used with graph traversal algorithms.
- Insertion: O(log n)
- Removal: O(log n)
- Search: O(n)
We can represent a heap using an array. We can calculate the index of any given child, left or right, using the index of the parent.
0
1 2
3 4 5 6
heap = [0, 1, 2, 3, 4, 5, 6]
const getChildren = (i) => { // e.g. 2
let leftIndex = (2 * i) + 1 // = 5
let rightIndex = (2 * i) + 2 // = 6
return [leftIndex, rightIndex]
}
const getParent = (i) => { // e.g. 5
return Math.floor((i - 1) / 2) // = 2
}
In a Max Binary Heap parent nodes are always larger than their child nodes.
- Each parent has at most two child nodes
- The value of each parent node is always greater than its child nodes
- In a max binary heap the parent is greater than the children but there are no guarantees between sibling nodes
- A binary heap is as compact as possible. All the children of each node are as full as they can be and left children are filled out first.
That is, a left child will not start using its children until its sibling is also occupied with data.
In a Min Binary Heap parent nodes are always smaller than their child nodes. All of the other rules are similar to the Max Binary Heap rules.
class MaxBinaryHeap {
constructor() {
this.values = []
}
getChildren(i) {
let leftIndex = Math.floor((2 * i) + 1)
let rightIndex = Math.floor((2 * i) + 2)
return [leftIndex, rightIndex]
}
getParent(i) {
if(i === 1) return 0
return Math.floor((i - 1) / 2)
}
swap(id1, id2) {
([this.values[id1], this.values[id2]] = [this.values[id2], this.values[id1]])
}
insert(value) {
this.values.push(value)
const bubbleUp = (i = this.values.length - 1) => {
let parentIndex = this.getParent(i)
let parentValue = this.values[parentIndex]
if(parentValue >= value) return // Parent already outranks this value
else {
this.swap(i, parentIndex)
if(parentIndex <= 0) return
bubbleUp(parentIndex)
}
}
if(this.values.length > 1) {
bubbleUp()
}
return this.values
}
remove() {
if(!this.values.length) return this.values
this.swap(0, this.values.length - 1) // Swap root and last node
this.values.pop() // Remove old root
// Re-arrange new root because it is not the max
const sinkDown = (i = 0) => {
let thisValue = this.values[i],
childrenIndices = this.getChildren(i),
leftIndex = childrenIndices[0],
rightIndex = childrenIndices[1],
leftValue = this.values[leftIndex],
rightValue = this.values[rightIndex]
if(leftValue === undefined && rightValue === undefined) return // Bottom of tree
// Only a left child exists (heaps fill left first)
if(rightValue === undefined) {
if(thisValue < leftValue) this.swap(i, leftIndex)
return
}
// This value exceeds both children. Stop!
if(thisValue >= leftValue && thisValue >= rightValue) return
// Pursue the larger of the two children
if(leftValue >= rightValue) {
this.swap(i, leftIndex)
sinkDown(leftIndex)
} else {
this.swap(i, rightIndex)
sinkDown(rightIndex)
}
}
if(this.values.length > 1) sinkDown()
return this.values
}
}
Note: This is just the inverse of the Max Binary Heap. Where that heap promotes larger numbers toward the root, we instead promote smaller numbers.
class MinBinaryHeap {
constructor() {
this.values = []
}
getChildren(i) {
let leftIndex = Math.floor((2 * i) + 1)
let rightIndex = Math.floor((2 * i) + 2)
return [leftIndex, rightIndex]
}
getParent(i) {
if(i === 1) return 0
return Math.floor((i - 1) / 2)
}
swap(id1, id2) {
([this.values[id1], this.values[id2]] = [this.values[id2], this.values[id1]])
}
insert(value) {
this.values.push(value)
if(this.values.length <= 1) return this.values
const bubbleUp = (i = this.values.length - 1) => {
let thisValue = this.values[i]
let parent = this.getParent(i)
if(this.values[parent] === undefined || this.values[parent] <= thisValue) return
else {
this.swap(i, parent)
bubbleUp(parent)
}
}
if(this.values.length > 1) {
bubbleUp()
}
return this.values
}
remove() {
if(!this.values.length) return this.values
this.swap(0, this.values.length - 1) // Swap root with last
this.values.pop() // Remove old root
const sinkDown = (i = 0) => {
let childrenIndices = this.getChildren(i),
left = childrenIndices[0],
right = childrenIndices[1]
if(this.values[left] === undefined && this.values[right] === undefined) return // Bottom
// Only a left child exists (heaps fill left first)
if(this.values[right] === undefined) {
if(this.values[i] > this.values[left]) this.swap(i, left)
return
}
// This value is beneath both children. Stop!
if(this.values[i] <= this.values[left] && this.values[i] <= this.values[right]) return
// Pursue the smaller of the two children
if(this.values[left] <= this.values[right]) {
this.swap(i, left)
sinkDown(left)
} else {
this.swap(i, right)
sinkDown(right)
}
}
if(this.values.length > 1) {
sinkDown()
}
return this.values
}
}
A data structure where each element has a priority. Elements with higher priorities are dequeued before elements with lower priorities.
class Node {
constructor(value, priority) {
this.value = value
this.priority = priority
}
}
Either a min heap or a max heap work so long as the priority values are consistent. We could have a small priority range (1 ... 5) or a large one (1 ... n). So long as we clearly define what is higher priority and what is lower priority a heap can be a priority queue.
In a hospital emergency room, patients get prioritized based on the level of their emergency. When I am downloading or updating Steam games, I can choose to prioritize certain games over others. Both of these concepts are examples of a priority queue.
Any time we need to manage data of varying priority, we can use a priority queue. We use bubbleUp() and sinkDown() to sort new items as they get enqueued.
Note: Priority queues are an abstract concept; we do not need a heap to accomplish this. We could use an array. As items are added, we could re-arrange them or iterate over the array to find the next highest priority element to process. However, a Heap has optimized functionality that makes it the natural choice for a priority queue.
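To make the note above concrete, here is a deliberately naive array-backed sketch (the class name and API are hypothetical): enqueue is O(1), but every dequeue scans the whole array in O(n) to find the next element to serve, which is exactly the work a heap avoids.

```javascript
// Naive priority queue: an unsorted array, scanned on every dequeue.
// Here, a LOWER priority number means "serve first" (like an ER triage level).
class NaivePriorityQueue {
  constructor() {
    this.items = []
  }
  enqueue(value, priority) {
    this.items.push({ value, priority }) // O(1): just append
  }
  dequeue() {
    if (!this.items.length) return undefined
    let best = 0
    for (let i = 1; i < this.items.length; i++) { // O(n) scan for best priority
      if (this.items[i].priority < this.items[best].priority) best = i
    }
    return this.items.splice(best, 1)[0] // Remove and return it
  }
}

const er = new NaivePriorityQueue()
er.enqueue("sprained ankle", 4)
er.enqueue("chest pain", 1)
er.enqueue("fever", 3)
console.log(er.dequeue().value) // "chest pain"
console.log(er.dequeue().value) // "fever"
```

The heap-based implementation below does the same job, but both enqueue and dequeue run in O(log n).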
class PriorityQueue {
constructor() {
this.nodes = []
}
getChildren(i) {
let leftIndex = Math.floor((2 * i) + 1)
let rightIndex = Math.floor((2 * i) + 2)
return [leftIndex, rightIndex]
}
getParent(i) {
if(i === 1) return 0
return Math.floor((i - 1) / 2)
}
swap(id1, id2) {
([this.nodes[id1], this.nodes[id2]] = [this.nodes[id2], this.nodes[id1]])
}
enqueue(value, priority) {
this.nodes.push(new Node(value, priority))
if(this.nodes.length <= 1) return
const bubbleUp = (i = this.nodes.length - 1) => {
let parent = this.getParent(i)
if(!this.nodes[parent] || this.nodes[parent].priority < this.nodes[i].priority) return
else {
this.swap(i, parent)
bubbleUp(parent)
}
}
if(this.nodes.length > 1) {
bubbleUp()
}
return this.nodes
}
dequeue() {
if(!this.nodes.length) return undefined
this.swap(0, this.nodes.length - 1) // Swap root with last
let nextToProcess = this.nodes.pop() // Remove old root
const sinkDown = (i = 0) => {
let childrenIndices = this.getChildren(i),
left = childrenIndices[0],
right = childrenIndices[1]
if(!this.nodes[left] && !this.nodes[right]) return // Bottom
// Only a left child exists (heaps fill left first)
if(this.nodes[left] && !this.nodes[right]) {
if(this.nodes[i].priority > this.nodes[left].priority) this.swap(i, left)
return
}
// This node precedes both children. Stop!
if(this.nodes[i].priority <= this.nodes[left].priority && this.nodes[i].priority <= this.nodes[right].priority) {
return
}
// Pursue the child with the smaller priority
if(this.nodes[left].priority <= this.nodes[right].priority) {
this.swap(i, left)
sinkDown(left)
} else {
this.swap(i, right)
sinkDown(right)
}
}
if(this.nodes.length > 1) {
sinkDown()
}
return nextToProcess
}
}
class Node {
constructor(value, priority) {
this.value = value
this.priority = priority
}
}
Note: Also called Hash Tables.
Hash tables are used to store key-value pairs. They are like arrays, but the keys are not ordered. They are fast for finding values, adding new values, and removing values, because we merely need to know the key in order to access that key's data.
Nearly every programming language has some sort of hash table data structure. They are very fast and hash tables are used everywhere. For example, we rely on them for the counting problem solving approach.
In practice, we usually use hash maps we didn't create ourselves, which have better hashing and collision handling. That is, we rarely have to implement linear probing or separate chaining by hand. These concepts are explained below.
- Insert: O(1)
- Deletion: O(1)
- Access: O(1)
- Python has Dictionaries (dict)
- JavaScript has Objects (String Keys only) and Maps
- Java, Go, and Scala have Maps
- Ruby has Hashes
However, for the sake of discussion, we will create our own.
Imagine an array of colors:
["#FFFFFF", "#000000"]
These correspond with white and black, but it would be a lot easier if we could use their names instead of their index:
colors["black"]
is much easier to access than
colors[1]
These plain English color names would be our keys and the hex codes would be our values. How do we pair them up? We need human readability but we also need computer readability. Computers do not recognize an index "black".
In order to look up values by key we need a way to convert keys into valid array indices. A function that performs this task is called a hash function.
Imagine we have a range of potential hash values. Below, we use 0 through 5.
[___] [___] [___] [___] [___] [___]
0 1 2 3 4 5
To hash the color "black", we pass in "black" to our hash function and receive the hypothetical hash 0. Every time we pass in "black", we receive 0, because the hash of "black" should always remain the same.
[___] [___] [___] [___] [___] [___]
0 1 2 3 4 5
^
["black", "#000000"]
For this hash, we store "black" as the key and "#000000" at hash 0. If I ever need to access this information again, I can hash black, retrieve the hashed value, and access the data again. Note: This is why it is so important for hash functions to consistently return the same value.
This is a very small range of hash values, and so if I pass in white, but receive another hashed value of 0, we have a problem: Black's information is already stored at 0. What will happen to White's information? This is called a collision.
Collisions are increasingly rare as hash functions become more complex. However, collisions are still possible, and when they occur, we need to handle them. This is discussed later.
Hash functions accept a key (such as a string) and return an array index for it. There are organizations all over the world striving to create better, more comprehensive hash functions; people specialize in them.
Disclaimer: Not a cryptographically secure one
- It's fast: Constant time O(1) is the goal
- It doesn't cluster outputs at specific indices; it distributes them uniformly
- Deterministic: The same input should always yield the same output
Hashes almost always use prime numbers to strengthen the hash. We leave the reasoning to the mathematicians, but it is essential for reducing collisions. It helps spread out the keys more uniformly.
const hash = (key, arrayLength) => {
let hash = 0
const unique = 31 // A prime number helps reduce collisions
for(let i = 0; i < Math.min(key.length, 1000); i++) { // Cap work for very long keys
let char = key[i]
let value = char.charCodeAt(0) - 96 // Map lowercase "a"-"z" to 1-26
hash = (hash * unique + value) % arrayLength // Keep within array bounds
}
return hash
}
Collisions are inevitable. We can handle them in two major ways:
Separate Chaining: At each index in our array, we store values using a more sophisticated data structure such as an array or linked list. This allows us to store multiple key-value pairs at the same index.
If we have two values at the same hash location:
- We hash the value we want and navigate to our hash table
- Knowing each location could have multiple entries, we loop through and find the one we want in what should be a fairly small sub data set
Linear Probing: When we detect a collision, we search through the array to find the next empty slot and store the data there. Unlike separate chaining, this allows us to store only a single key-value pair per index, but it does require a little more work.
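Since the class below uses separate chaining, here is a minimal sketch of the linear probing alternative (the class name and internals are hypothetical). On a collision, set() walks forward, wrapping around the table, until it finds an open slot; get() retraces the same walk:

```javascript
// Linear probing: one key-value pair per slot. A real implementation would
// also resize the table before it fills up; this sketch assumes spare slots.
class ProbingHashMap {
  constructor(size = 13) {
    this.slots = new Array(size).fill(null)
  }
  _hash(key) {
    let hash = 0
    for (let i = 0; i < key.length; i++) {
      hash = (hash * 31 + key.charCodeAt(i)) % this.slots.length
    }
    return hash
  }
  set(key, value) {
    let i = this._hash(key)
    // Probe until we find an empty slot or this key (to overwrite it)
    while (this.slots[i] !== null && this.slots[i][0] !== key) {
      i = (i + 1) % this.slots.length // Step forward, wrapping around
    }
    this.slots[i] = [key, value]
  }
  get(key) {
    let i = this._hash(key)
    // Retrace the probe path; an empty slot means the key was never stored
    while (this.slots[i] !== null) {
      if (this.slots[i][0] === key) return this.slots[i][1]
      i = (i + 1) % this.slots.length
    }
    return undefined
  }
}

const colors = new ProbingHashMap()
colors.set("black", "#000000")
colors.set("white", "#FFFFFF")
console.log(colors.get("black")) // "#000000"
```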
class HashMap {
constructor(size = 53) {
this.keyMap = new Array(size)
}
_hash(key) {
let hash = 0
const unique = 31
for(let i = 0; i < Math.min(key.length, 100); i++) {
let char = key[i]
let value = char.charCodeAt(0) - 96
hash = Math.abs((hash * unique + value) % this.keyMap.length)
}
return hash
}
set(key, value) {
/// Hash the key and store the key-value pair in the hash table
let hashedKey = this._hash(key)
if(!this.keyMap[hashedKey]) this.keyMap[hashedKey] = [] // New entry
this.keyMap[hashedKey].push([key, value])
}
get(key) {
/// Accepts and hashes a key
let chain = this.keyMap[this._hash(key)]
if(chain) {
for(let i = 0; i < chain.length; i++) {
if(chain[i][0] == key) return chain[i][1]
}
}
return undefined // No match
}
getKeys() {
let keys = []
for(let i = 0; i < this.keyMap.length; i++) {
if(this.keyMap[i]) {
for(let k = 0; k < this.keyMap[i].length; k++) {
if(!keys.includes(this.keyMap[i][k][0])) {
keys.push(this.keyMap[i][k][0])
}
}
}
}
return keys
}
getValues() {
let values = []
for(let i = 0; i < this.keyMap.length; i++) {
if(this.keyMap[i]) {
for(let k = 0; k < this.keyMap[i].length; k++) {
if(!values.includes(this.keyMap[i][k][1])) {
values.push(this.keyMap[i][k][1])
}
}
}
}
return values
}
}
A graph data structure consists of a finite (and possibly mutable) set of vertices or nodes or points, together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph.
Basically, it is a set of nodes and connections. Sometimes graphs have specific orders that we can traverse, sometimes we can traverse any way we like. Sometimes they cycle, sometimes they do not. Sometimes certain connections are more costly than others. When we talk about graphs, we may or may not have to consider tons of different properties that make graphs unique.
- Vertex: A node
- Edge: A connection between nodes
Edges can be:
- Unweighted: There is no cost associated with an edge
- Weighted: Traveling along this type of edge carries a cost. Different edges have different costs
- Undirected: We can travel both directions
- Directed: We can only go a certain direction
The two standard approaches to storing a graph are an adjacency matrix or an adjacency list. We can use either to ask questions about a graph:
- How many edges does a vertex have?
- What other vertices does this vertex connect to?
- Is this graph cyclic? Acyclic?
And more, just by examining the stored values without having a tangible "image" of the graph. Think of it as the purest, most minimal representation of a cartographical map; as if you could only use words and numbers, no colors or shapes, to represent it.
Note: I will be using adjacency lists for examples in this guide. Most real world data tends to lend itself to sparser and/or larger graphs. Matrices are better if your data is full of edges.
We can use a 2D array to demonstrate edges. Consider this humble (cyclic, even!) ASCII graph:
Unweighted, cyclic graph:
_A---B
/ \
F C
\ /
E-----D
We can represent edges using binary values, where 0 means there is not a connection, and 1 means there is a connection:
- A B C D E F
A 0 1 0 0 0 1
B 1 0 1 0 0 0
C 0 1 0 1 0 0
D 0 0 1 0 1 0
E 0 0 0 1 0 1
F 1 0 0 0 1 0
If it were weighted, we would use this matrix to represent edge weights:
Weighted, cyclic graph:
1
8 _A---B
/ \ 3
F C
6 \ / 5
E-----D
2
And here, we see we are no longer using binary values, but values representing their weight:
- A B C D E F
A 0 1 0 0 0 8
B 1 0 3 0 0 0
C 0 3 0 5 0 0
D 0 0 5 0 2 0
E 0 0 0 2 0 6
F 8 0 0 0 6 0
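The weighted matrix above translates directly into a 2D array. Mapping the vertex letters to indices (A=0 ... F=5) makes any edge lookup a constant-time double index; the `index`, `matrix`, and `weight` names here are illustrative:

```javascript
// Weighted adjacency matrix for the cyclic graph above.
// 0 means "no edge"; any other number is that edge's weight.
const index = { A: 0, B: 1, C: 2, D: 3, E: 4, F: 5 }
const matrix = [
  // A  B  C  D  E  F
  [  0, 1, 0, 0, 0, 8 ], // A
  [  1, 0, 3, 0, 0, 0 ], // B
  [  0, 3, 0, 5, 0, 0 ], // C
  [  0, 0, 5, 0, 2, 0 ], // D
  [  0, 0, 0, 2, 0, 6 ], // E
  [  8, 0, 0, 0, 6, 0 ], // F
]

const weight = (v1, v2) => matrix[index[v1]][index[v2]] // O(1) edge lookup

console.log(weight("A", "F")) // 8
console.log(weight("A", "C")) // 0 (no edge)
```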
For this, we represent our graph using an array of arrays, where each index is a vertex and its stored values are connections.
Unweighted, cyclic graph:
_0---1
/ \
5 2
\ /
4-----3
Would result in a data structure:
[
[1, 5], // 0
[0, 2], // 1
[1, 3], // 2
[2, 4], // 3
[3, 5], // 4
[4, 0]  // 5
]
If our vertices were strings, like the matrix example graph, we could instead use a hash map (table).
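For string vertices, the same structure can be expressed as a plain object keyed by vertex name, here using the lettered graph from the matrix example (the helper function names are illustrative):

```javascript
// Adjacency list for the unweighted A-F cycle, keyed by vertex name.
const adjacencyList = {
  A: ["B", "F"],
  B: ["A", "C"],
  C: ["B", "D"],
  D: ["C", "E"],
  E: ["D", "F"],
  F: ["E", "A"],
}

// Answering graph questions becomes simple lookups:
const neighbors = (v) => adjacencyList[v]                     // Who connects to v?
const degree = (v) => adjacencyList[v].length                 // How many edges does v have?
const hasEdge = (v1, v2) => adjacencyList[v1].includes(v2)    // Are v1 and v2 connected?

console.log(neighbors("A"))    // ["B", "F"]
console.log(hasEdge("A", "C")) // false
```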
- |V|: The number of vertices
- |E|: The number of edges
| Operation     | Adjacency List | Adjacency Matrix |
|---------------|----------------|------------------|
| Add Vertex    | O(1)           | O(|V|^2)         |
| Add Edge      | O(1)           | O(1)             |
| Remove Vertex | O(|V| + |E|)   | O(|V|^2)         |
| Remove Edge   | O(|E|)         | O(1)             |
| Query         | O(|V| + |E|)   | O(1)             |
| Storage       | O(|V| + |E|)   | O(|V|^2)         |
Adjacency List:
- Can take up less space (in sparse graphs)
- Faster to iterate over all edges
- Can be slower to look up specific edges
Adjacency Matrix:
- Faster to look up a specific edge
- Takes up more space (in sparse graphs)
- Slower to iterate over all edges
We want to be able to visit, update, and check vertices in graphs. Trees are a kind of graph, except that there is only one path to each node starting from a root node. In graphs, we need to consider all path options and choose the most optimized one. We can start at any vertex that is not a hole.
Note: A hole (commonly called a sink), in a directed graph, is a vertex that only has paths in, but not out. Conversely, a source is a vertex that only has paths out.
Graph traversal can be used for:
- Peer to peer networking
- Web crawling
- Finding the "closest" matches or recommendations
- Shortest path problems like GPS navigation, solving mazes, and artificial intelligence
Before we can do all of this we simply need to be able to visit every single node.
We can start from any vertex in an undirected graph. We can start from any node with a direction out in a directed graph.
As we visit vertices, we store that vertex in a list of visited vertices. We can check against this list to make sure we do not visit it again.
This is an undirected, acyclic graph. Edges are not weighted and there are dead ends e.g. Sacramento
addVertex():
- If the list does not contain this vertex, we make a new key using this vertex name
- Set the value of this vertex to an empty array
The value will be used to store edges, by name, in the future.
removeVertex():
- This function accepts the name of a vertex
- If this vertex exists
- Loop through its edges and remove every edge that references this vertex. Cleaning up references is critical
- Finally, remove this vertex
removeEdge():
- Pass the names of the two vertices that share an edge
- For each vertex, assign a filtered list of edges that excludes the other vertex to that vertex's value
class Graph {
constructor() {
this.adjacencyList = {}
}
addVertex(v) {
/// Add a node with no connections.
if(!this.adjacencyList[v]) this.adjacencyList[v] = []
}
removeVertex(vertex) {
/// Remove all edges associated with this vertex before removing the vertex
if(this.adjacencyList[vertex]) {
while(this.adjacencyList[vertex].length) {
let adjacent = this.adjacencyList[vertex].pop() // Get each adjacent vertex
this.removeEdge(vertex, adjacent) // Dissolve the edge
}
// Finally, remove this vertex
delete this.adjacencyList[vertex]
}
}
addEdge(v1, v2) {
/// If these vertices exist, and edges are not already present, add edges
if(!this.adjacencyList[v1] || !this.adjacencyList[v2]) return false
if(!this.adjacencyList[v1].includes(v2)) {
this.adjacencyList[v1].push(v2)
}
if(!this.adjacencyList[v2].includes(v1)) {
this.adjacencyList[v2].push(v1)
}
return true
}
removeEdge(v1, v2) {
if(!this.adjacencyList[v1] || !this.adjacencyList[v2]) return false
if(this.adjacencyList[v1].includes(v2)) {
this.adjacencyList[v1] = this.adjacencyList[v1].filter(v => v !== v2)
}
if(this.adjacencyList[v2].includes(v1)) {
this.adjacencyList[v2] = this.adjacencyList[v2].filter(v => v !== v1)
}
return true
}
depthFirstTraversal(start) {
/// Given a vertex, recursively visit each non-visited connected vertex
const result = []
const visited = {}
const list = this.adjacencyList // Scope
const traverse = (vertex) => {
if(!vertex) return null
visited[vertex] = true
result.push(vertex)
list[vertex].forEach(neighbor => {
if(!visited[neighbor]) traverse(neighbor)
})
}
traverse(start)
return result
}
breadthFirstTraversal(start) {
/// Given a vertex, add all connected vertices to our queue
if(!start) return null
const queue = [start]
const result = []
const visited = {}
let current
while(queue.length) {
current = queue.shift()
visited[current] = true
result.push(current)
this.adjacencyList[current].forEach(connectedVertex => {
if(!visited[connectedVertex]) {
visited[connectedVertex] = true
queue.push(connectedVertex)
}
});
}
return result
}
}
Finds the shortest path between two vertices on a graph. What's the fastest, most cost effective way to get from point A to point B?
- Every time we want to visit a new node, we pick the vertex with the smallest known weight. We can use a priority queue to track this
- Once we've moved to the node we're going to visit, we look at each of the node's adjacent vertices
- For each adjacent vertex, we calculate its distance by summing the edge weights along the path that leads from the starting vertex to the vertex we're checking
- If the new total weight to a vertex is less than the previous total, we store the shorter weight for that node
The benefit is that we only need to run this algorithm once per starting vertex. Then, so long as the graph has not changed, we can reuse the results to find the shortest path from that start to any other vertex.
As we iterate over each vertex, we can store connected vertices, by priority, in a queue. As they get enqueued they will bubble up to their appropriate position. As the next connected vertex gets dequeued, we sink it down and re-arrange to find a new next vertex.
Recall that, in the initial concept, every time we need to visit a new node, we want to first visit the node with the lowest weight. As we discover new shortest paths between each node, we want to update that node's weight to reflect that new shortest path. Then, we update the priority queue so that we are always dequeueing the node with the lowest weight.
class PriorityQueue {
constructor() {
this.nodes = []
}
getChildren(i) {
let leftIndex = Math.floor((2 * i) + 1)
let rightIndex = Math.floor((2 * i) + 2)
return [leftIndex, rightIndex]
}
getParent(i) {
if(i === 1) return 0
return Math.floor((i - 1) / 2)
}
swap(id1, id2) {
([this.nodes[id1], this.nodes[id2]] = [this.nodes[id2], this.nodes[id1]])
}
enqueue(value, priority) {
this.nodes.push(new Node(value, priority))
if(this.nodes.length <= 1) return
const bubbleUp = (i = this.nodes.length - 1) => {
let parent = this.getParent(i)
if(!this.nodes[parent] || this.nodes[parent].priority < this.nodes[i].priority) return
else {
this.swap(i, parent)
bubbleUp(parent)
}
}
if(this.nodes.length > 1) {
bubbleUp()
}
return this.nodes
}
dequeue() {
if(!this.nodes.length) return undefined
this.swap(0, this.nodes.length - 1) // Swap root with last
let nextToProcess = this.nodes.pop() // Remove old root
const sinkDown = (i = 0) => {
let childrenIndices = this.getChildren(i),
left = childrenIndices[0],
right = childrenIndices[1]
if(!this.nodes[left] && !this.nodes[right]) return // Bottom
// Only a left child exists (heaps fill left first)
if(this.nodes[left] && !this.nodes[right]) {
if(this.nodes[i].priority > this.nodes[left].priority) this.swap(i, left)
return
}
// This node precedes both children. Stop!
if(this.nodes[i].priority <= this.nodes[left].priority && this.nodes[i].priority <= this.nodes[right].priority) {
return
}
// Pursue the child with the smaller priority
if(this.nodes[left].priority <= this.nodes[right].priority) {
this.swap(i, left)
sinkDown(left)
} else {
this.swap(i, right)
sinkDown(right)
}
}
if(this.nodes.length > 1) {
sinkDown()
}
return nextToProcess
}
}
class Node {
constructor(value, priority) {
this.value = value
this.priority = priority
}
}
The function accepts a starting and ending vertex. We want to find the shortest path between these two vertices. However, once we have the result, we can reuse the saved data to answer other shortest path queries from the same starting vertex.
- Create an object (distances) and set each key to be every vertex in the adjacency list with a value of Infinity. The start vertex should have a value of 0 (since it takes no effort to get there)
- Add each vertex with a priority of Infinity to the priority queue
- Create another object called previous. Each key, by vertex, should have a value of null, because we have not determined any connections yet
- While there is anything in the priority queue ...
- Dequeue a vertex from the priority queue
- If that vertex is the same as the ending vertex, we are done!
- Otherwise, loop through each edge in the adjacency list at that vertex
- Calculate the distance to that vertex from the starting vertex
- If the distance is less than what is currently stored in our distances object ...
- Update the distances object with the new, lower distance
- Update the previous object to contain that vertex
- Enqueue the vertex with the total distance from the start node
class PriorityQueue {
constructor() {
this.nodes = []
}
getChildren(i) {
let leftIndex = Math.floor((2 * i) + 1)
let rightIndex = Math.floor((2 * i) + 2)
return [leftIndex, rightIndex]
}
getParent(i) {
if(i === 1) return 0
return Math.floor((i - 1) / 2)
}
swap(id1, id2) {
([this.nodes[id1], this.nodes[id2]] = [this.nodes[id2], this.nodes[id1]])
}
enqueue(value, priority) {
this.nodes.push(new Node(value, priority))
if(this.nodes.length <= 1) return
const bubbleUp = (i = this.nodes.length - 1) => {
let parent = this.getParent(i)
if(!this.nodes[parent] || this.nodes[parent].priority <= this.nodes[i].priority) return
else {
this.swap(i, parent)
bubbleUp(parent)
}
}
if(this.nodes.length > 1) {
bubbleUp()
}
return this.nodes
}
dequeue() {
if(!this.nodes.length) return // Nothing to process
this.swap(0, this.nodes.length - 1) // Swap root with last
let nextToProcess = this.nodes.pop() // Remove old root
const sinkDown = (i = 0) => {
let childrenIndices = this.getChildren(i),
left = childrenIndices[0],
right = childrenIndices[1]
if(!this.nodes[left] && !this.nodes[right]) return // Bottom
// If this value exceeds the children, stop:
// Left exists, but not right
if(this.nodes[left] && !this.nodes[right]) {
if(this.nodes[i].priority <= this.nodes[left].priority) return
// Right exists, but not left
} else if(this.nodes[right] && !this.nodes[left]) {
if(this.nodes[i].priority <= this.nodes[right].priority) return
// Both exist
} else if(this.nodes[i].priority <= this.nodes[left].priority && this.nodes[i].priority <= this.nodes[right].priority) {
return
}
if(!this.nodes[right] || this.nodes[left].priority < this.nodes[right].priority) {
this.swap(i, left)
sinkDown(left)
} else {
this.swap(i, right)
sinkDown(right)
}
}
if(this.nodes.length > 1) {
sinkDown()
}
return nextToProcess
}
}
class Node {
constructor(value, priority) {
this.value = value
this.priority = priority
}
}
class WeightedEdge {
constructor(name, weight = 0) {
this.name = name
this.weight = weight
}
}
class WeightedGraph {
constructor() {
this.adjacencyList = {}
}
addVertex(vertex) {
if(!this.adjacencyList[vertex]) this.adjacencyList[vertex] = []
}
removeVertex(vertex) {
// Remove from all edges
if(!this.adjacencyList[vertex]) return
while(this.adjacencyList[vertex].length) {
let connectedVertex = this.adjacencyList[vertex].pop().name // Edges store WeightedEdge objects
this.removeEdge(vertex, connectedVertex)
}
delete this.adjacencyList[vertex]
}
addEdge(v1, v2, weight = 0) {
if(!this.adjacencyList[v1] || !this.adjacencyList[v2]) return false
if(!this.adjacencyList[v1].some(edge => edge.name === v2)) this.adjacencyList[v1].push(new WeightedEdge(v2, weight))
if(!this.adjacencyList[v2].some(edge => edge.name === v1)) this.adjacencyList[v2].push(new WeightedEdge(v1, weight))
return true
}
removeEdge(v1, v2) {
if(!this.adjacencyList[v1] || !this.adjacencyList[v2]) return false
this.adjacencyList[v1] = this.adjacencyList[v1].filter(v => v.name !== v2)
this.adjacencyList[v2] = this.adjacencyList[v2].filter(v => v.name !== v1)
return true
}
depthFirstTraversal(startVertex) {
const result = []
const visited = {}
const list = this.adjacencyList // Scope
const traverse = (vertex) => {
if(!vertex) return // Done
visited[vertex] = true
result.push(vertex)
// Follow this vertex's own edges, not every vertex in the graph
list[vertex].forEach(edge => {
if(!visited[edge.name]) traverse(edge.name)
})
}
traverse(startVertex)
return result
}
breadthFirstTraversal(startVertex) {
if(!startVertex) return null
const result = []
const visited = {}
const queue = [startVertex]
let current
let currentObj
while(queue.length) {
current = queue.shift()
visited[current] = true
result.push(current)
currentObj = this.adjacencyList[current] // By name
currentObj.forEach(connected => {
if(!visited[connected.name]) {
visited[connected.name] = true
queue.push(connected.name)
}
})
}
return result
}
dijkstras(start, finish) {
if(start === finish) return [start]
const distances = {},
previous = {},
queue = new PriorityQueue()
/*
* For distance, set each vertex as a key
* The start vertex has a distance of 0
* Every other vertex has a distance of Infinity
*/
for(let vertex in this.adjacencyList) {
if(vertex === start) {
distances[vertex] = 0 // We're already there
} else {
distances[vertex] = Infinity // Unknown distance
}
queue.enqueue(vertex, distances[vertex])
previous[vertex] = null
}
// For every vertex, we need to consider the edges of that vertex
let thisVertex
while(queue.nodes.length){
thisVertex = queue.dequeue().value
if(thisVertex === finish) { // Done
break
}
// Skip vertices we cannot reach yet (their distance is still Infinity)
if(thisVertex && distances[thisVertex] !== Infinity) {
for(let edgeIndex in this.adjacencyList[thisVertex]) {
// Find adjacent node using our edge list
let adjacentVertex = this.adjacencyList[thisVertex][edgeIndex]
// Calculate new distance to adjacent node
let candidate = distances[thisVertex] + adjacentVertex.weight
let adjacentName = adjacentVertex.name
if(candidate < distances[adjacentName]) {
// Update our new best distance to this adjacent vertex
distances[adjacentName] = candidate
// Update previous to show our best path to this adjacent vertex
previous[adjacentName] = thisVertex
// Add this to our priority queue since it is a potential solution
queue.enqueue(adjacentName, candidate)
}
}
}
}
const getResult = () => {
const result = []
const queue = [finish]
// Beginning at the finish, follow previous back toward the start,
// collecting each vertex, then reverse into start -> finish order
while(queue.length) {
let current = queue.shift()
if(!current) return [] // The finish vertex was never reached
result.push(current)
if(current !== start) queue.push(previous[current])
}
return result.reverse()
}
return getResult()
}
}
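The same algorithm can be written as a compact, self-contained sketch. The graph below is illustrative (shaped like the adjacency list WeightedGraph builds), and a simple linear scan stands in for the binary heap priority queue:

```javascript
// Illustrative weighted adjacency list (undirected graph).
const graph = {
  A: [{ name: 'B', weight: 4 }, { name: 'C', weight: 2 }],
  B: [{ name: 'A', weight: 4 }, { name: 'E', weight: 3 }],
  C: [{ name: 'A', weight: 2 }, { name: 'D', weight: 2 }, { name: 'F', weight: 4 }],
  D: [{ name: 'C', weight: 2 }, { name: 'E', weight: 3 }, { name: 'F', weight: 1 }],
  E: [{ name: 'B', weight: 3 }, { name: 'D', weight: 3 }, { name: 'F', weight: 1 }],
  F: [{ name: 'C', weight: 4 }, { name: 'D', weight: 1 }, { name: 'E', weight: 1 }]
}

const dijkstra = (list, start, finish) => {
  const distances = {}
  const previous = {}
  const unvisited = new Set(Object.keys(list))
  for (const vertex of unvisited) {
    distances[vertex] = vertex === start ? 0 : Infinity
    previous[vertex] = null
  }
  while (unvisited.size) {
    // Grab the unvisited vertex with the smallest known distance (O(n) scan)
    const current = [...unvisited].reduce((a, b) => (distances[a] <= distances[b] ? a : b))
    if (current === finish || distances[current] === Infinity) break
    unvisited.delete(current)
    for (const edge of list[current]) {
      const candidate = distances[current] + edge.weight
      if (candidate < distances[edge.name]) {
        distances[edge.name] = candidate // New best distance to this neighbor
        previous[edge.name] = current    // Remember how we got there
      }
    }
  }
  // Walk previous back from finish to start, then flip into path order
  const path = []
  for (let vertex = finish; vertex; vertex = previous[vertex]) path.push(vertex)
  return path.reverse()
}

console.log(dijkstra(graph, 'A', 'E')) // ['A', 'C', 'D', 'F', 'E']
```

The heap-based queue only changes how quickly the next closest vertex is found; the relaxation logic (compare a candidate distance, update distances and previous) is identical.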
Dynamic programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions.
It was named by Richard Bellman in the 1940s in the context of optimizing military scheduling, so "dynamic programming" originally referred to planning military activities, not computer programming. It describes a multi-stage decision process that may or may not vary much over time. Bellman said the name "sounded impressive".
A problem is said to have overlapping subproblems if it can be broken down into subproblems which are reused several times.
The Fibonacci sequence is an example of this. For each value, we need to calculate preceding values:
const fib = (n) => {
if(n <= 2) return 1
return fib(n - 1) + fib(n - 2)
}
Fibonacci does have overlapping subproblems. If we pass in 5, the first recursive calls are fib(4) and fib(3). fib(4) then calls fib(3) and fib(2), while fib(3) calls fib(2) and fib(1). The subproblems overlap: fib(3), fib(2), and fib(1) each end up being computed more than once.
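Because the subproblems overlap, storing each answer the first time we compute it (memoization) turns the exponential recursion into linear work. A minimal sketch:

```javascript
// Naive version for comparison: recomputes fib(3), fib(2), ... many times over.
const fib = (n) => {
  if (n <= 2) return 1
  return fib(n - 1) + fib(n - 2)
}

// Memoized version: each fib(k) is computed once, then looked up.
const fibMemo = (n, memo = {}) => {
  if (n <= 2) return 1
  if (memo[n]) return memo[n]
  memo[n] = fibMemo(n - 1, memo) + fibMemo(n - 2, memo)
  return memo[n]
}

console.log(fibMemo(50)) // 12586269025 (the naive version would take minutes)
```

The memo object is the "storing their solutions" part of the definition above: it trades O(n) extra space for the repeated work.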
Merge Sort does not have overlapping subproblems. We are sorting different pieces every time; there is no logical overlap between the problem sets.
A problem is said to have optimal substructure if an optimal solution can be constructed from optimal solutions of its subproblems.