cd /path/to/demo/data/
# Download train.zip from Kaggle:
# https://www.kaggle.com/c/dogs-vs-cats/data
ls -lh train.zip
-rw-r--r--@ 1 user group 543M Jun 12 10:39 train.zip
unzip -qq train.zip

cd /path/to/kaggle/data/
ls train/dogs | head
dog.1000.jpg
dog.1001.jpg
dog.1002.jpg
dog.1003.jpg
dog.1004.jpg
dog.1005.jpg
dog.1006.jpg

cd data
rm train validation  # symbolic links to _demo dirs
mkdir -p train_orig/dogs train_orig/cats \
         validation_orig/dogs validation_orig/cats
mv train_orig/dog*[12][345]*.jpg validation_orig/dogs
mv train_orig/cat*[12][345]*.jpg validation_orig/cats
mv train_orig/dog*.jpg train_orig/dogs
mv train_orig/cat*.jpg train_orig/cats
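
To see which files the `[12][345]` glob in the `mv` commands above sends to the validation set, the pattern can be tried out in Python with `fnmatch`. This is only a sketch: the helper `split_names` and the sample range 1000 to 1029 are made up for illustration, and the file names are assumed to follow the `dog.<n>.jpg` form shown in the listing.

```python
from fnmatch import fnmatchcase

# Same glob as the mv command, without the directory prefix.
VALIDATION_PATTERN = "dog*[12][345]*.jpg"


def split_names(names, pattern=VALIDATION_PATTERN):
    """Return (train, validation) lists; validation holds names matching the glob."""
    validation = [n for n in names if fnmatchcase(n, pattern)]
    train = [n for n in names if not fnmatchcase(n, pattern)]
    return train, validation


# Hypothetical sample of file names, not the real directory contents.
names = [f"dog.{n}.jpg" for n in range(1000, 1030)]
train, validation = split_names(names)
print(validation)
```

A name matches whenever a 1 or 2 is immediately followed by a 3, 4, or 5 anywhere after `dog`, so the split is deterministic and depends only on the file number.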

# 'features' is the file path to a bcolz array on disk
bc = bcolz.open(features)[:]  # load the whole array into memory
# begin epoch loop
while True:
    ...
    df = df.sample(frac=1)  # shuffle all rows
    bc = bc[df.index.values]  # reorder features by the shuffled index
    ...

...
i, j = 0, batch_size
for _ in range(nbatches):
    sub = df.iloc[i:j]
    X2 = bc[i:j]
    ...
    # Calculate X and Y appropriately
    ...
    yield [X, X2], Y
    i, j = j, j + batch_size  # advance both ends of the batch window

df_mini_batch
13  95555756  dog  grey   /path/to/imgs/756/55/blah_95555756.png
5   5467756   cat  black  /path/to/imgs/756/67/blah_5467756.png
1   1161756   cat  black  /path/to/imgs/756/61/blah_1161756.png
7   31255756  cat  grey   /path/to/imgs/756/55/blah_31255756.png

while True:
    ...
    df = df.sample(frac=1)  # shuffle all rows
    ...
    i, j = 0, batch_size
    for _ in range(nbatches):
        sub = df.iloc[i:j]
        idx = sub.index.values
        X2 = bcolz.open(bcolz_dir)[idx]
        ...
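
The difference between the two indexing strategies in the snippets above (reordering an in-memory copy every epoch versus looking rows up by `sub.index.values` in the untouched on-disk array) can be checked with a small pure-Python simulation. This is a sketch under stated assumptions: plain lists stand in for the DataFrame index and the bcolz array, the index is assumed to be the default 0..n-1, and `count_misaligned` is a made-up helper, not part of bcolz or pandas.

```python
import random


def count_misaligned(n_rows, epochs, reorder_in_memory, seed=0):
    """Return, per epoch, how many feature rows no longer match their label."""
    rng = random.Random(seed)
    labels = list(range(n_rows))    # stand-in for df.index (assumed 0..n-1)
    on_disk = list(range(n_rows))   # on_disk[k] == k: the feature row for label k
    in_memory = list(on_disk)       # stand-in for bcolz.open(features)[:]
    counts = []
    for _ in range(epochs):
        rng.shuffle(labels)         # stand-in for df.sample(frac=1)
        if reorder_in_memory:
            # Like bc = bc[df.index.values]: indexes an array that
            # earlier epochs have already permuted.
            in_memory = [in_memory[k] for k in labels]
            feats = in_memory
        else:
            # Like bcolz.open(bcolz_dir)[idx]: always indexes the
            # original, unpermuted order.
            feats = [on_disk[k] for k in labels]
        counts.append(sum(f != k for f, k in zip(feats, labels)))
    return counts


reordered = count_misaligned(100, 3, reorder_in_memory=True)
by_label = count_misaligned(100, 3, reorder_in_memory=False)
print(reordered, by_label)
```

Under these assumptions the in-memory reorder is aligned only in the first epoch, because later epochs apply label-based positions to an already shuffled array, while the lookup by label stays aligned in every epoch.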