
@Feynman27
Created March 23, 2017 14:32
import numpy as np
from keras.applications import InceptionV3
from keras.layers import Input, LSTM, Dense, TimeDistributed
from keras.models import Model


def main():
    ## Vision model: InceptionV3 without its classification head.
    ## Note: include_top must be the boolean False -- the string 'False'
    ## is truthy, which keeps the top layers and breaks this setup.
    cnn = InceptionV3(weights='imagenet',
                      include_top=False,
                      pooling='avg')
    # VGG19 also works as the frame encoder:
    # from keras.applications import VGG19
    # cnn = VGG19(weights='imagenet', include_top=False, pooling='avg')
    cnn.trainable = False

    H = W = 299  # InceptionV3's native input size
    C = 3
    video_input = Input(shape=(None, H, W, C), name='video_input')
    # Apply the CNN to every frame: the output is a sequence of feature vectors
    encoded_frame_sequence = TimeDistributed(cnn)(video_input)
    # Summarize the frame sequence into a single vector
    encoded_video = LSTM(256)(encoded_frame_sequence)
    output = Dense(256, activation='relu')(encoded_video)
    video_model = Model(inputs=[video_input], outputs=output)
    video_model.summary()
    video_model.compile(optimizer='adam', loss='mean_squared_error')

    # Dummy data: one clip of 50 random frames, one 256-d target per clip
    n_samples = 1
    n_frames = 50
    frame_sequence = np.random.randint(0, 256, size=(n_samples, n_frames, H, W, C))
    y = np.random.random(size=(n_samples, 256))
    print(frame_sequence.shape)
    video_model.fit(frame_sequence, y, validation_split=0.0, shuffle=False, batch_size=1)


if __name__ == '__main__':
    main()
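For reference, the same wiring can be exercised without downloading ImageNet weights. The sketch below swaps InceptionV3 for a tiny stand-in CNN (hypothetical, built with `tf.keras`) just to show the shapes flowing through `TimeDistributed` and the LSTM:

```python
import numpy as np
from tensorflow.keras.layers import (Input, Conv2D, GlobalAveragePooling2D,
                                     TimeDistributed, LSTM, Dense)
from tensorflow.keras.models import Model

# Tiny stand-in frame encoder (instead of InceptionV3): one frame -> one vector
frame_in = Input(shape=(32, 32, 3))
x = Conv2D(8, 3, activation='relu')(frame_in)
x = GlobalAveragePooling2D()(x)
frame_encoder = Model(frame_in, x)

# Video model: encode every frame, then summarize the sequence with an LSTM
video_in = Input(shape=(None, 32, 32, 3))
seq = TimeDistributed(frame_encoder)(video_in)   # (batch, frames, 8)
vec = LSTM(16)(seq)                              # (batch, 16)
out = Dense(4)(vec)                              # (batch, 4)
video_model = Model(video_in, out)

frames = np.random.rand(2, 5, 32, 32, 3).astype('float32')  # 2 clips of 5 frames
features = video_model.predict(frames)
print(features.shape)  # (2, 4)
```

The key point is that `TimeDistributed` maps a per-frame model over the time axis, so the LSTM sees one feature vector per frame regardless of clip length.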
@Wazaki-Ou

Hey Feynman,

I know this post is quite old, but I'm trying to use your code to perform action classification on videos (with VGG as the CNN). However, I'm running into some issues using sequences of image data. Do you think you could help me with that?
This is how I read images from my dataset. I made sure each sequence of 6 images represents a specific action.

train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(224, 224),
                                                         classes=['Bark', 'Bitting', 'Engage', 'Hidden', 'Jump',
                                                                  'Stand', 'Walk'], batch_size=36, shuffle=False)
valid_batches = ImageDataGenerator().flow_from_directory(valid_path, target_size=(224, 224),
                                                         classes=['Bark', 'Bitting', 'Engage', 'Hidden', 'Jump',
                                                                  'Stand', 'Walk'], batch_size=18, shuffle=False)
test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224, 224),
                                                        classes=['Bark', 'Bitting', 'Engage', 'Hidden', 'Jump',
                                                                 'Stand','Walk'], batch_size=30, shuffle=False)

I would like to know how I can replace the randomly generated NumPy arrays with this real data. I've been trying for a while, but I can't seem to get the shape right. I normally use this to get the next batch:

imgs, labels = next(train_batches)

But then the shape doesn't conform to the 5D one the model expects. I tried reshaping, but it doesn't seem to work. Any ideas, please?
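One possible way to group the generator's 4D batches into the 5D tensor is sketched below, assuming `shuffle=False` keeps each 6-frame action contiguous within a batch and that the sequence label is taken from its first frame (both assumptions; the arrays here are random stand-ins for a real `next(train_batches)` result):

```python
import numpy as np

# Stand-in for one generator batch: 36 images of 224x224x3 and one-hot labels
# over the 7 classes (hypothetical data in place of next(train_batches))
imgs = np.random.rand(36, 224, 224, 3).astype('float32')
labels = np.eye(7)[np.random.randint(0, 7, size=36)]

seq_len = 6
n_seqs = imgs.shape[0] // seq_len  # batch_size=36 -> 6 sequences per batch

# Group consecutive frames into sequences: (36, H, W, C) -> (6, 6, H, W, C)
img_seqs = imgs.reshape(n_seqs, seq_len, 224, 224, 3)
# One label per sequence, taken from each sequence's first frame
seq_labels = labels[::seq_len]

print(img_seqs.shape)    # (6, 6, 224, 224, 3)
print(seq_labels.shape)  # (6, 7)
```

This only works if the batch size is a multiple of the sequence length and the generator preserves frame order, which is why `shuffle=False` matters here.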

@milinddeore

I am interested in the same problem, too. Feynman, can you please share your implementation for training on video? Thanks!

@aezco

aezco commented Jul 16, 2019

Did you guys solve it? I am also interested in classification with batches like @Wazaki-Ou.
