Created
March 23, 2017 14:32
Minimal example to reproduce https://github.com/fchollet/keras/issues/5934
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense
from keras.layers import TimeDistributed
from keras.models import Model, Sequential
from keras.applications import InceptionV3, VGG19

def main():
    ## Define the vision model.
    ## Inception (currently doesn't work; see the linked issue).
    cnn = InceptionV3(weights='imagenet',
                      include_top=False,  # must be the bool False, not the string 'False'
                      pooling='avg')
    # Works:
    # cnn = VGG19(weights='imagenet',
    #             include_top=False, pooling='avg')
    cnn.trainable = False

    H = W = 229  # frame height/width in pixels
    C = 3        # RGB channels
    video_input = Input(shape=(None, H, W, C), name='video_input')
    # Apply the frozen CNN to every frame: the output is a sequence of feature vectors.
    encoded_frame_sequence = TimeDistributed(cnn)(video_input)
    # Collapse the frame sequence into a single vector.
    encoded_video = LSTM(256)(encoded_frame_sequence)
    output = Dense(256, activation='relu')(encoded_video)
    video_model = Model(inputs=[video_input], outputs=output)
    print(video_model.summary())
    video_model.compile(optimizer='adam', loss='mean_squared_error')

    #features = np.empty((0,1000))
    n_samples = 1
    n_frames = 50
    frame_sequence = np.random.randint(0, 256, size=(n_samples, n_frames, H, W, C))
    y = np.random.random(size=(256,))
    y = np.reshape(y, (-1, 256))
    print(frame_sequence.shape)
    video_model.fit(frame_sequence, y, validation_split=0.0, shuffle=False, batch_size=1)

if __name__ == '__main__':
    main()
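For reference, TimeDistributed is essentially a reshape wrapped around the inner layer: it folds the time axis into the batch axis, applies the layer once, and unfolds the result. A NumPy-only sketch of that mechanism, where `encode_frame_batch` is a hypothetical stand-in for the frozen CNN (global average pooling over each frame):

```python
import numpy as np

def encode_frame_batch(frames):
    # frames: (batch, H, W, C) -> (batch, features) via global average pooling
    return frames.mean(axis=(1, 2))

n_samples, n_frames, H, W, C = 2, 5, 8, 8, 3
video = np.random.rand(n_samples, n_frames, H, W, C)

# TimeDistributed folds the time axis into the batch axis,
# applies the wrapped layer once, then unfolds the time axis again.
flat = video.reshape(n_samples * n_frames, H, W, C)
features = encode_frame_batch(flat)
encoded_sequence = features.reshape(n_samples, n_frames, -1)

print(encoded_sequence.shape)  # (2, 5, 3)
```

This is why the model above accepts a 5-D input `(samples, frames, H, W, C)` even though the CNN itself only sees 4-D batches of images.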
I am also interested in the same problem. Feynman, could you please share your implementation for training on video? Thanks!
Did you guys solve it? I am also interested in classification with batches, like @Wazaki-Ou.
Hey Feynman,
I know this post is quite old, but I'm trying to use your code to perform action classification on videos (with VGG as the CNN). However, I'm running into some issues using sequences of image data. Do you think you could help me with that?
This is how I read images from my dataset. I made sure each sequence of 6 images represents a specific action.
I would like to know how I can replace the randomly generated NumPy arrays with this real data. I've been trying for a while, but I can't seem to get the shape right. I normally use this to get the next batch:
imgs, labels = next(train_batches)
But then the shape doesn't conform to the 5-D one the model expects. I tried reshaping, but it doesn't seem to work. Any idea, please?
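A minimal sketch of one way to build the 5-D tensor, assuming the generator yields a flat 4-D batch (`imgs`) with one label per frame (`labels`), and that every consecutive run of 6 frames belongs to the same action sequence; the 6-frame grouping and the random placeholder data are assumptions for illustration:

```python
import numpy as np

n_frames = 6        # assumed frames per action sequence
H, W, C = 224, 224, 3

# Stand-ins for imgs, labels = next(train_batches):
# 12 frames = 2 complete sequences, with one label per frame.
imgs = np.random.rand(12, H, W, C)
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

n_sequences = imgs.shape[0] // n_frames
# Group consecutive frames into the 5-D tensor the model expects:
# (n_sequences, n_frames, H, W, C)
x = imgs.reshape(n_sequences, n_frames, H, W, C)
# One label per sequence, e.g. the label of its first frame:
y = labels.reshape(n_sequences, n_frames)[:, 0]

print(x.shape, y.shape)  # (2, 6, 224, 224, 3) (2,)
```

Note this only works if the batch size coming out of the generator is a multiple of the sequence length and frames arrive in order; otherwise you'd need to buffer frames until a full sequence is available before reshaping.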