#############################################################################
# Abstract: This code trains a model to predict the next character based on
#           the previous ones.
#
# More details:
# 1. Each character is represented as a vector of size 256. It is all
#    zeros except at the index where that character stands in the ASCII table.
# 2. The code divides the text into chunks of input_size.
# 3. The teaching labels of all the characters in a chunk are the next
#    characters, so we just shift the chunk to the right and assign the
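
The encoding and chunking described in points 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the original code: the names `one_hot` and `make_chunks` are hypothetical, and dropping an incomplete final chunk is an assumption, since the header comment does not say how the tail of the text is handled.

```python
VOCAB_SIZE = 256  # one slot per character code, as stated in the header comment


def one_hot(ch):
    """Return a 256-long list that is all zeros except at index ord(ch)."""
    vec = [0] * VOCAB_SIZE
    vec[ord(ch)] = 1
    return vec


def make_chunks(text, input_size):
    """Split text into (inputs, labels) pairs, each input_size characters long.

    The labels are the inputs moved one position ahead in the text, so the
    label for every character is the character that follows it.
    Characters that do not fill a whole chunk are dropped (an assumption).
    """
    pairs = []
    # Stop early enough that every input character has a next character.
    for start in range(0, len(text) - input_size, input_size):
        inputs = text[start:start + input_size]
        labels = text[start + 1:start + input_size + 1]
        pairs.append((inputs, labels))
    return pairs


if __name__ == "__main__":
    for inputs, labels in make_chunks("hello world", 5):
        print(repr(inputs), "->", repr(labels))
```

Each input and label string would then be mapped through `one_hot` character by character before being fed to the model.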
#####################################################################################
# This code trains a model that predicts which digit from 0 to 9 is drawn on the
# MNIST pictures of 20x20 pixels. There are 60000 training examples and 10000
# testing examples.
# The methods used: Mini-batching, Weight Decay, Momentum, Dropout,
#                   Xavier Initialization
# Number of hidden layers: 3 (adjustable)
# Sizes of mini-batches and hidden layers are easily adjustable
# Non-linear functions: Sigmoid, Funny Tanh
# Piecewise-linear: ReLU
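
For orientation, the activation functions and the initialization named in this header might look like the sketch below. It is not taken from the original code. It assumes that "Funny Tanh" means the scaled tanh 1.7159 * tanh(2x/3) recommended by LeCun, and that "Xavier" means the uniform Glorot scheme; the function names and the layer sizes in the usage lines are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def funny_tanh(x):
    """Scaled tanh, 1.7159 * tanh(2x/3); assumed meaning of 'Funny Tanh'."""
    return 1.7159 * math.tanh(2.0 * x / 3.0)


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def xavier_uniform(fan_in, fan_out, rng=random):
    """Return a fan_in x fan_out weight matrix drawn uniformly from
    [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


if __name__ == "__main__":
    # Hypothetical sizes: 20x20 = 400 input pixels, 100 hidden units.
    weights = xavier_uniform(400, 100)
    print(len(weights), len(weights[0]))
    print(sigmoid(0.0), funny_tanh(1.0), relu(-2.0))
```

The scaling constants in `funny_tanh` are chosen so that the function maps 1 to approximately 1 and -1 to approximately -1, which keeps unit-variance inputs in the roughly linear part of the curve.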