@sjain07
Last active June 13, 2019 07:10
#this is an image of size 140x140. We will assume it is grayscale (i.e. only one channel; it would have been 140x140x3 for RGB)
image = readImage()
#we will break the image into 7 columns and 7 rows and process each of the 49 cells independently
NoOfCells = 7
#we will try to predict whether the image shows a cat, dog, wolf or cow (the order used in the class vectors below). Therefore the number of classes is 4
NoOfClasses = 4
#step is the number of pixels to move when going from one cell to the next. Since the image has 7 cells per side, step will be 140/7 = 20
step = height(image) // NoOfCells
#stores the class probabilities for each of the 49 cells; each cell has 4 values, the probabilities of the cell belonging to each of the 4 classes
#prediction_class_array[i,j] is a vector of size 4 which would look like [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow]
prediction_class_array = new_array(size(NoOfCells,NoOfCells,NoOfClasses))
#stores 2 bounding box suggestions for each of the 49 cells; each bounding box has x, y, w, h and c predictions. (x,y) are the coordinates of the center of the box, (w,h) are its width and height and c is its confidence
#the array therefore has size 7x7x2x5
predictions_bounding_box_array = new_array(size(NoOfCells,NoOfCells,2,5))
#an empty list to which we will append the final predictions
final_predictions = []
#minimum confidence level we require to make a prediction
threshold = 0.7
for i in range(NoOfCells):
    for j in range(NoOfCells):
        #we will get each "cell" of size 20x20: 140 (image height) / 7 (no of rows) = 20 (step)
        #cell (i,j) starts at pixel (i*step, j*step) and is of size (step, step)
        cell = image[i*step:(i+1)*step, j*step:(j+1)*step]
        #we will first predict, for each cell, the probability of it being one of cat, dog, wolf, cow
        #prediction_class_array[i,j] is a vector of size 4 which would look like [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow]
        #sum(prediction_class_array[i,j]) = 1
        #this gives us our prediction of what each of the 49 cells contains
        #class_predictor is a neural network that has 9 convolutional layers that make a final prediction
        prediction_class_array[i,j] = class_predictor(cell)
        #predictions_bounding_box_array[i,j] holds the 2 bounding boxes predicted for this cell
        #size(predictions_bounding_box_array[i,j]) is [2,5]
        #predictions_bounding_box_array[i,j,0] is bounding box 1, predictions_bounding_box_array[i,j,1] is bounding box 2
        #each bounding box has 5 values [x,y,w,h,c]
        #x, y are the coordinates of the center of the bounding box, which lies within the cell (values ranging between 0-20 in this case)
        #w, h are the width and height of the bounding box; the box can extend outside the cell, so they are in the range [0-140]
        #c is the confidence, i.e. the predicted overlap with the actual bounding box
        predictions_bounding_box_array[i,j] = bounding_box_predictor(cell)
        #predictions_bounding_box_array[i,j,0,4] is the confidence value for the first bounding box prediction
        #best_bounding_box is the index (0 or 1) of the box with the higher confidence
        best_bounding_box = 0 if predictions_bounding_box_array[i,j,0,4] > predictions_bounding_box_array[i,j,1,4] else 1
        #we will get the class which has the highest probability. For [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow], 0.5 is the highest probability, corresponding to cat at position 0, so index_of_max_value will return 0
        predicted_class = index_of_max_value(prediction_class_array[i,j])
        #we will check if the box confidence times the class probability is above the threshold (could be something like 0.7)
        if predictions_bounding_box_array[i,j,best_bounding_box,4] * max_value(prediction_class_array[i,j]) > threshold:
            #the prediction holds the x,y coordinates of the box center, its width and height, and the predicted class
            prediction = [predictions_bounding_box_array[i,j,best_bounding_box,0:4], predicted_class]
            final_predictions.append(prediction)
print(final_predictions)
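
The pseudocode above can be turned into a small runnable sketch. The version below is only an illustration under stated assumptions: `class_predictor` and `bounding_box_predictor` are hypothetical random stand-ins for the two neural networks (no trained model is involved), the image is a plain list of lists rather than a real image, and the names `detect`, `NUM_CELLS`, `BOXES_PER_CELL` are introduced here for the example. It follows the same steps: cut the image into 7x7 cells, predict class probabilities and 2 boxes per cell, pick the more confident box, and keep the prediction if box confidence times class probability exceeds the threshold.

```python
import random

NUM_CELLS = 7        # 7 x 7 grid, as in the pseudocode
NUM_CLASSES = 4      # cat, dog, wolf, cow
BOXES_PER_CELL = 2
THRESHOLD = 0.7


def class_predictor(cell, rng):
    # Hypothetical stand-in for the class network:
    # random scores normalised so that they sum to 1.
    scores = [rng.random() + 1e-9 for _ in range(NUM_CLASSES)]
    total = sum(scores)
    return [s / total for s in scores]


def bounding_box_predictor(cell, rng):
    # Hypothetical stand-in for the box network:
    # returns BOXES_PER_CELL boxes, each [x, y, w, h, c].
    step = len(cell)
    image_size = step * NUM_CELLS
    return [
        [rng.uniform(0, step), rng.uniform(0, step),              # center, inside the cell
         rng.uniform(0, image_size), rng.uniform(0, image_size),  # size, up to the image
         rng.random()]                                            # confidence in [0, 1)
        for _ in range(BOXES_PER_CELL)
    ]


def detect(image, threshold=THRESHOLD, seed=0):
    rng = random.Random(seed)
    step = len(image) // NUM_CELLS
    final_predictions = []
    for i in range(NUM_CELLS):
        for j in range(NUM_CELLS):
            # Cell (i, j) covers rows i*step..(i+1)*step and columns j*step..(j+1)*step.
            rows = image[i * step:(i + 1) * step]
            cell = [row[j * step:(j + 1) * step] for row in rows]
            class_probs = class_predictor(cell, rng)
            boxes = bounding_box_predictor(cell, rng)
            # Index of the box with the higher confidence.
            best = max(range(BOXES_PER_CELL), key=lambda b: boxes[b][4])
            # Index of the most probable class.
            predicted_class = max(range(NUM_CLASSES), key=lambda k: class_probs[k])
            score = boxes[best][4] * class_probs[predicted_class]
            if score > threshold:
                final_predictions.append((boxes[best][0:4], predicted_class, score))
    return final_predictions


image = [[0] * 140 for _ in range(140)]
# A negative threshold keeps the best box of every cell, so this prints 49.
print(len(detect(image, threshold=-1.0)))
```

With random stand-ins the default threshold of 0.7 will reject most cells; swapping in real predictors with the same return shapes is the only change the loop itself would need.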