Skip to content

Instantly share code, notes, and snippets.

@sunil-bagde
Last active November 6, 2024 15:20
Show Gist options
  • Save sunil-bagde/dd4a4c57ca7124fbb755fdc772e1c49e to your computer and use it in GitHub Desktop.
Save sunil-bagde/dd4a4c57ca7124fbb755fdc772e1c49e to your computer and use it in GitHub Desktop.
Grokking Algorithm

greedy algorithms | 8

In this chapter

  • Learn how to tackle the impossible NP-complete problem
  • Learn to indetify such problem
  • To find an approximate solution to an NP-complete problem quickly
  • You learn about the greedy strategy, a very simple problem-solving strategy

The classroom scheduling problem

Class    Start    End

art      9am      10am
eng      9:30am   10:0am
math     10am     11am
cs       10:30am  11:30am
music    11am     12pm

you want hold many classes as possible, hwo do ypu pcik what set claases to hold, you get bigest set of class possible

  1. First, pick the class that finishes the earliest.
  2. Then, pick the next class that starts after the first one ends and also finishes the earliest.

Keep doing this, and you’ll end up with the answer!

  1. Art ends the soonest, at 10:00 a.m.
Class    Start    End

art      9am      10am        ✔
eng      9:30am   10:0am
math     10am     11am
cs       10:30am  11:30am
music    11am     12pm
  1. the next class that starts after 10:00 a.m. and ends the soonest. Math works
Class    Start    End

art      9am      10am        ✔
eng      9:30am   10:0am
math     10am     11am.       ✔
cs       10:30am  11:30am
music    11am     12pm
  1. English conflicts with arts. but Math works, finally CS conflicts with math, Music works
Class    Start    End

art      9am      10am        ✔
eng      9:30am   10:0am
math     10am     11am.       ✔
cs       10:30am  11:30am
music    11am     12pm        ✔

The knapsack problem

Suppose you’re a greedy thief.

there are all these items you can steal.

you can only take what you can fit in your knapsack.

The knapsack can hold 35 pounds.

to maximize the value of the items you put in your knapsack.

  1. Pick the most expensive thing that will fit in your knapsack.
  2. Pick the next most expensive thing that will fit in your knapsack. And so on.
STEREO      LAPTOP      GUITAR

$3000       $2000       $1500

30lbs       20lbs       15lbs

Your knapsack can hold 35 pounds of items.

The stereo system (35lbs) is the most expensive, so you steal that. Now you don’t have space for anything else. VALUE $3000

If you’d picked the laptop andthe guitar instead, you could have had $3,500 worth of loot!

15 lbs   -----
GUITAR

20lbs    _____
LATOP

         VALUE $3500

The set-covering problem

aims to cover all required elements with the fewest possible sets from a given collection.

e.g

you starting radion show

you want to reach all 50 states

ou have to decide what station to play on yo reach those all listeners

we are trying to minmize number of station

there is overlap

  1. List every possible subset of stations. This is called the power set. There are 2^n possible subsets.
SET 1       SET 2       STE 3

KONE        KTWENTY     KHUNDRED
KTHREE      KTWENTY     KMIL
KFIVE       KFIFTH      KTHOUSAND
etc...      etc...      etc...
  1. From these, pick the set with the smallest number of stations that covers all 50 states.

to calculate every possible subset of stations. It takes O(2^n) time, because there are 2^n stations


NO. of station          TIME TAKEN

5                          3.2 sec
10                         1024 sec
32                         13.6 years
100                        4*10(^21) years

Approximation algorithms

Greedy algorithms to the rescue!

  1. Pick the station that covers the most states that haven’t been covered yet. It’s OK if the station covers some states that have been covered already.
  2. Repeat until all the states are covered.

This is called an approximation algorithm

Approximation algorithms are judged by

  • How fast they are
  • How close they are to the optimal solution

keys are station names, and the values are the states they cover.

kone station covers Idaho, Nevada, and Utah

const fruits = new Set(["avocado", "tomato", "banana"]);
const vegetables = new Set(["beets", "carrots", "tomato"]);

// Set union joining two or more things into one
const union = new Set([...fruits, ...vegetables]);
console.log(union); // Set { 'avocado', 'tomato', 'banana', 'beets', 'carrots' }

// Set intersection  a set that contains the items that exist in both set x (fruits), and set y (vegetables) 
const intersection = new Set([...fruits].filter(item => vegetables.has(item)));
console.log(intersection); // Set { 'tomato' }

// Set difference
const difference = new Set([...fruits].filter(item => !vegetables.has(item)));
console.log(difference); // Set { 'avocado', 'banana' 

Code for setup

let statesNeeded = new Set(['mt', 'wa', 'or', 'id', 'nv', 'ut', 'ca', 'az']);

const stations = {
  kone: new Set(['id', 'nv', 'ut']),
  ktwo: new Set(['wa', 'id', 'mt']),
  kthree: new Set(['or', 'nv', 'ca']),
  kfour: new Set(['nv', 'ut']),
  kfive: new Set(['ca', 'az']),
};

const finalStations = new Set();

while (statesNeeded.size > 0) {
  let bestStation = null;
  let statesCovered = new Set();

  for (const [station, statesForStation] of Object.entries(stations)) {
    // Find states covered by this station
    const covered = new Set([...statesNeeded].filter(state => statesForStation.has(state)));

    // If this station covers more states than the current best station, select it
    if (covered.size > statesCovered.size) {
      bestStation = station;
      statesCovered = covered;
    }
  }

  // Add the best station to the list of final stations
  if (bestStation) {
    finalStations.add(bestStation);
    // Remove the covered states from the list of needed states
    statesNeeded = new Set([...statesNeeded].filter(state => !statesCovered.has(state)));
  }
}

console.log(finalStations); // Output: Set { 'ktwo', 'kthree', 'kone', 'kfive' }

NP-complete problems

you calculate every possible solution and pick the smallest/shortest one.

e.g The traveling-salesperson problem and the set-covering problem

Traveling Salesperson

1 start city _ 1 route = 1 total route 2 start cities _ 1 route = 2 total routes 3 start cities _ 2 route = 6 total routes 4 start cities _ 6 route = 24 total routes 5 start cities _ 24 route = 120 total routes 6 start cities _ 120 route = 720 total routes

10 start cities * 362,880 route = 3,628,800 total routes

A good approximation algorithm would be to pick an arbitrary starting city and pick the closest non-visited city next.

NP-complete characteristics

  • Your algorithm runs quickly with a handful of items but really slows down with more items
  • All combinations of X usually point to an NP-complete problem
  • Do you have to calculate “every possible version of X because you can’t break it down into smaller sub-problems? Might be NP-complete.
  • If your problem involves a sequence
  • If your problem involves a set
  • Can you restate your problem as the set-covering problem

dynamic programming | 9

In this chapter

  • dynamic programming, a technique to solve a hard problem by breaking it up into subproblems and solving those subproblems first.

  • you learn to how to come up with a dynamic programming solution to a new problem.

The Knapsack Problem

You're a thief with Knapsack capacity of 4lbs. There are 3 items to steal:

  • Stereo $3000 4lbs
  • Laptop $2000 3lbs
  • Guitar $1500 1lbs

The simplest algorithm would be to calculate every possible set of goods. However this doesn't scale, O(2n). The optimal solution can be calculated using dynamic programming.

Dynamic programming

Dynamic programming starts by solving subproblems and builds up to solving the big problem.

Every dynamic-programming algorithm starts with a grid. Here’s a grid for the knapsack problem.

         1     2     3     4
         -------------------
GUITAR   |    |      |     |
         --------------------
STEREO   |    |      |     |
         ___________________
LAPTOP   |    |      |     |
         ___________________

The rows of the grid are the items, and the columns are knapsack weights from 1 lb to 4 lb.

You need all of those columns because they will help you calculate the values of the sub-knapsacks.

The grid starts out empty. You’re going to fill in each cell of the grid. Once the grid is filled in, you’ll have your answer to this problem!

The guitar row

The first cell has a knapsack of capacity 1 lb. The guitar is also 1 lb, which means it fits into the knapsack!

a knapsack of capacity 2 lb. Well, the guitar will definitely fit in there!

this is the firstrow, so you have only the guitar to choose from. You’re pretending that the other two items aren’t available to steal right now

                     1lbs   2lbs   3lbs   4lbs
Guitar $1500 lb      $1500  $1500  $1500  $1500
Stereo $3000 4lbs
Laptop $2000 3lbs

our best guess for thife should steal guitar

The stereo row

Let’s start with the first cell, a knapsack of capacity 1 lb. The current max value you can fit into a knapsack of 1 lb is $1,500.

                     1lbs   2lbs   3lbs   4lbs
Guitar $1500 lb      $1500  $1500  $1500  $1500
Stereo $3000 4lbs    $1500  $1500  $1500  $3000
Laptop $2000 3lbs

You just updated your estimate! If you have a 4 lb knapsack, you can fit at least $3,000 worth of goods in it. You can see from the grid that you’re incrementally updating your estimate.

The laptop row

The laptop weighs 3 lb, won’t fit into a 1 lb or a 2 lb knapsack.

                     1lbs   2lbs   3lbs   4lbs
Guitar $1500 lb      $1500  $1500  $1500  $1500
Stereo $3000 4lbs    $1500  $1500  $1500  $3000
Laptop               $1500  $1500  $2000      $

At 3 lb, the old estimate was $1,500. But you can choose the laptop instead, and that’s worth $2,000. So the new max estimate is $2,000!

you can fit the guitar into that 1 lb space, and that’s worth $1,500.

                     1lbs   2lbs   3lbs   4lbs
Guitar $1500 lb      $1500  $1500  $1500  $1500
Stereo $3000 4lbs    $1500  $1500  $1500  $3000
Laptop               $1500  $1500  $2000  $1500
   $3000 vs ($2000 + $1500)

It’s better to take the laptop + the guitar for $3,500.

formula for calculating call value

   CELL[i][j] = max of 1. the previous max value
                       2. value of current item  + value of remaining space
                       CELL[i-1][j-item-weight]

You can use this formula with every cell in this grid, and you should end up with the same grid I did. Remember how I talked about solving subproblems? You combined the solutions to two subproblems to solve the bigger problem.

Knapsack problem FAQ

What happens if you add an item?

Do you have to recalculate everything to account for this new item? Nope.

dynamic programming keeps progressively building on your estimate.

Question: Would the value of a column ever go down? Is this possible? NO

What happens if you change the order of the rows?

The answer doesn’t change. The order of the rows doesn’t matter

Can you fill in the grid column-wise instead of row-wise?

For this problem, it doesn’t make a difference. It could make a difference for other problems.

What happens if you add a smaller item?

Suppose you can steal a necklace. It weighs 0.5 lb, and it’s worth $1,000.

Because of the necklace, you have to account for finer granularity, so the grid has to change.

Can you steal fractions of an item?

Answer: You can’t. With the dynamic-programming solution, you either take the item or not.

Optimizing your travel itinerary

ATTRACTION                 TIME        RATING

WESTMINISTER ABBEY         1/2 day     7
GLOBAL THEATER             1/2 day     6
NATIONAL GALLERY           1 day       9
BRITISH MUSEUM             2 day       9
ST PAUL'S CATHEDRAL        1/2 day     8
                        1/2      1     1.5   2
WESTMINISTER ABBEY      7w       7w     7w   7w
GLOBAL THEATER          7w       13wg  13wg  13wg
NATIONAL GALLERY        7w       13w   16wn  22wgn
BRITISH MUSEUM          7w       13wg  16wn  22wgn
ST PAUL'S CATHEDRAL     8s       15ws  21wgs 24wgs

Handling items that depend on each other

Dynamic programming only works when each subproblem is discrete—when it doesn’t depend on other subproblems.

That means there’s no way to account for Paris using the dynamic-programming algorithm.

Is it possible that the solution will require more than two sub-knapsacks?

  • you’ll never have more than two sub-knapsacks.
  • sub-knapsacks to have their own sub-knapsacks.

Is it possible that the best solution doesn’t fill the knapsack completely?

Yes. Suppose you could also steal a diamond This is a big diamond: it weighs 3.5 pounds. It’s worth a million dollars,

Longest common substring

HISH -> FISH

formula looks in pseudocode:

if word_a[i] == word_b[j] // The letters match.
   cell[i][j] = cell[i-1][j-1] + 1
else // The letters don’t match.
 cell[i][j] = 0
   H  I  S  H
F  0  0  0  0
I  0  1  0  0
S  0  0  2  0
H  0  0  0  3
Computer scientist sometimes joke about using the Feynman algorithm.

The Feynman algorithm is named after the famous physicist Richard Feynman, and
it works like this:


1. Write down the problem.
2. Think real hard.
3. Write down the solution.

Code example

// Example usage:

function longestCommonSubstring(str1, str2) {
   let str1Len = str1.length;
   let str2Len = str2.length;

   // Making the grid

   const grid = Array.from({ length: str1Len + 1 }, () =>
      Array(str2Len + 1).fill(0),
   );

   let maxLen = 0;
   let endIndex = 0;

   for (let i = 1; i <= str1Len; i++) {
      for (let j = 1; j <= str2Len; j++) {
         if (str1[i - 1] == str2[j - 1]) {
            // Filling in the grid
            grid[i][j] = grid[i - 1][j - 1] + 1;

            if (grid[i][j] > maxLen) {
               maxLen = grid[i][j];
               endIndex = i;
            }
         }
      }
   }

   return str1.slice(endIndex - maxLen, endIndex);
}

const str1 = "hish";
const str2 = "fish";
console.log(longestCommonSubstring(str1, str2)); // Output: "ish

dynamic programming ever really used? Yes:

  1. Biologists use the longest common subsequence to find similarities in DNA strands.

  2. git diff

  3. string similarity. Levenshtein distance measures how similar two strings are, and it uses dynamic programming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment