Last active
June 13, 2019 07:10
import numpy as np

# readImage, class_predictor and bounding_box_predictor are placeholders for an image loader and the two trained networks; they are not defined here.

# This is an image of size 140x140. We assume it is black and white (i.e. only one channel; it would have been 140x140x3 for RGB).
image = readImage()
# We break the image into 7 columns and 7 rows and process each of the 49 cells independently.
NoOfCells = 7
# We try to predict whether an image is a dog, cat, cow or wolf. Therefore the number of classes is 4.
NoOfClasses = 4
# step is the size of the step to take when moving across the image. Since the image has 7 cells per side, step is 140/7 = 20.
step = image.shape[0] // NoOfCells
# Stores the class scores for each of the 49 cells. Each cell has 4 values, which are the probabilities of the cell belonging to each of the 4 classes.
# prediction_class_array[i, j] is a vector of size 4 which would look like [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow]
prediction_class_array = np.zeros((NoOfCells, NoOfCells, NoOfClasses))
# Stores 2 bounding box suggestions for each of the 49 cells. Each bounding box has x, y, w, h and c predictions: (x, y) are the coordinates of the center of the box, (w, h) are its width and height, and c is its confidence.
predictions_bounding_box_array = np.zeros((NoOfCells, NoOfCells, 2, 5))
# A blank list to which we add the final predictions.
final_predictions = []
# Minimum confidence level we require to make a prediction.
threshold = 0.7
for i in range(NoOfCells):
    for j in range(NoOfCells):
        # Get each cell of size 20x20: 140 (image height) / 7 (number of rows) = 20 (step, the size of each cell).
        # Each cell is of size (step, step).
        cell = image[i * step:(i + 1) * step, j * step:(j + 1) * step]
        # First predict, for each cell, the probability of it being one of cat, dog, wolf, cow.
        # prediction_class_array[i, j] is a vector of size 4 which would look like [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow]
        # sum(prediction_class_array[i, j]) = 1
        # This gives us our prediction of what each of the 49 cells contains.
        # class_predictor is a neural network with 9 convolutional layers that makes the final class prediction.
        prediction_class_array[i, j] = class_predictor(cell)
        # predictions_bounding_box_array[i, j] holds the 2 bounding boxes predicted for the cell.
        # The shape of predictions_bounding_box_array[i, j] is (2, 5).
        # predictions_bounding_box_array[i, j, 0] is bounding box 1, predictions_bounding_box_array[i, j, 1] is bounding box 2.
        # Each bounding box has 5 values: [x, y, w, h, c].
        # x, y are the coordinates of the center of the bounding box; they lie within the cell (values between 0 and 20 in this case).
        # w, h are the width and height of the bounding box; they can extend outside the cell and are in the range [0, 140].
        # c is the confidence that the box overlaps an actual bounding box that should be predicted.
        predictions_bounding_box_array[i, j] = bounding_box_predictor(cell)
        # predictions_bounding_box_array[i, j, 0, 4] is the confidence value for the first bounding box prediction.
        best_bounding_box = 0 if predictions_bounding_box_array[i, j, 0, 4] > predictions_bounding_box_array[i, j, 1, 4] else 1
        # Get the class with the highest probability. For [0.5 #cat, 0.3 #dog, 0.1 #wolf, 0.1 #cow], 0.5 is the highest probability and corresponds to cat at position 0, so np.argmax returns 0.
        predicted_class = np.argmax(prediction_class_array[i, j])
        # Check whether the box confidence times the class probability is above the threshold (0.7 here).
        if predictions_bounding_box_array[i, j, best_bounding_box, 4] * np.max(prediction_class_array[i, j]) > threshold:
            # The prediction holds the x, y coordinates of the box, its width and height, and the predicted class.
            prediction = [predictions_bounding_box_array[i, j, best_bounding_box, 0:4], predicted_class]
            final_predictions.append(prediction)
print(final_predictions)
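
The per-cell selection step can be tried without any trained network. What follows is a minimal sketch in plain Python, assuming hand-written class probabilities and box proposals in place of the outputs of class_predictor and bounding_box_predictor. The helper names select_prediction and to_image_coordinates are hypothetical and not part of the gist. The second helper also rests on two assumptions the gist does not state: that (x, y) is measured from the cell's top-left corner, and that i indexes rows while j indexes columns.

```python
STEP = 20        # cell size in pixels: 140 / 7
THRESHOLD = 0.7  # minimum score required to keep a prediction


def select_prediction(class_probs, boxes, threshold=THRESHOLD):
    """Return (box, class_index, score) for one cell, or None if below threshold.

    class_probs: list of class probabilities for the cell.
    boxes: list of [x, y, w, h, c] box proposals for the cell.
    """
    # Pick the proposal with the highest confidence c (index 4).
    best_box = max(boxes, key=lambda box: box[4])
    # Pick the index of the most probable class.
    best_class = max(range(len(class_probs)), key=lambda k: class_probs[k])
    # Score is the box confidence multiplied by the class probability.
    score = best_box[4] * class_probs[best_class]
    if score > threshold:
        return best_box[:4], best_class, score
    return None


def to_image_coordinates(box, i, j, step=STEP):
    """Shift a cell-relative box center into image coordinates.

    Assumption: i is the row index, j is the column index, and (x, y)
    is measured from the top-left corner of the cell.
    """
    x, y, w, h = box
    return [j * step + x, i * step + y, w, h]


# Hypothetical values for the cell at row 2, column 3.
class_probs = [0.9, 0.05, 0.03, 0.02]  # cat, dog, wolf, cow
boxes = [[10, 12, 60, 40, 0.85], [5, 5, 30, 30, 0.40]]

result = select_prediction(class_probs, boxes)
if result is not None:
    box, class_index, score = result
    print(to_image_coordinates(box, i=2, j=3), class_index, round(score, 3))
```

Because the score is a product of two values that are each at most 1, a threshold of 0.7 only passes cells where both the box confidence and the class probability are high; the example vector [0.5, 0.3, 0.1, 0.1] from the comments could never pass it, since its best class probability is 0.5.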