First create an image sequence from a video with:
ffmpeg -i path/to/video.mp4 -r 30 path/to/output/folder/%06d.png
Here -r specifies the frequency at which frames are saved (in Hz, i.e. 30 == 30fps) and %06d.png produces filenames zero-padded to six digits (000001.png, 000002.png, etc.).
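Note that ffmpeg won't create the output directory for you, so a minimal end-to-end sketch (all paths are placeholders) is:
mkdir -p path/to/output/folder
ffmpeg -i path/to/video.mp4 -r 30 path/to/output/folder/%06d.png
ls path/to/output/folder | wc -l   # rough sanity check: ~30 x video length in seconds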
Next, the images must be scaled and cropped. For my original case, I need to generate new images from 512x512 inputs, so I will crop the largest possible (720x720) square out of the direct center of a 1280x720 video and then scale it to 512x512. mogrify, unlike convert, edits images in place without creating copies :)
mogrify -crop 720x720+280+0 -scale 512x512 path/to/input/folder/*.png
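The +280+0 offset is just centering arithmetic ((1280 - 720) / 2 = 280 horizontally, 0 vertically); a generic bash version of the same command, with W, H, and S standing in for your own dimensions, is:
W=1280; H=720; S=720                         # source size and square crop size
X=$(( (W - S) / 2 )); Y=$(( (H - S) / 2 ))   # centered offsets: 280 and 0 here
mogrify -crop ${S}x${S}+${X}+${Y} -scale 512x512 path/to/input/folder/*.png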
To use the Pix2Pix test.lua script to generate images from an input image, input images are expected to be concatenated together, forming a 2:1 aspect ratio. (I.e., 512x512 image A and 512x512 image B must be combined into a single 1024x512 image AB in order to be used by test.lua.) However, because we are generating (validation, not training/testing), we don't actually have A's matched pair B (or vice versa), so we must simply add empty whitespace to the right side of each image before we can feed it into test.lua with:
mogrify -extent 1024x512+0+0 path/to/input/folder/*.png
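To confirm the padding worked, ImageMagick's identify prints the new dimensions (the filename is a placeholder for any one frame):
identify -format "%wx%h\n" path/to/input/folder/000001.png   # expect 1024x512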
Finally, for these images to be used by test.lua, they must be in a folder D inside of a folder C, where DATA_ROOT=C and phase=D (i.e. test.lua reads images from DATA_ROOT/phase). E.g., with DATA_ROOT=path/to/data and phase=val, the padded frames go in path/to/data/val.
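Concretely, a hypothetical layout and invocation (the folder names and EXPERIMENT_NAME are placeholders; the flags follow the pix2pix README's test.lua usage):
path/to/data/         <- DATA_ROOT
path/to/data/val/     <- phase folder holding the 1024x512 PNGs
DATA_ROOT=path/to/data phase=val name=EXPERIMENT_NAME th test.lua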
Once you've generated your images with test.lua, they should be output to pix2pix/results/EXPERIMENT_NAME/latest_net_G_val/images/output.
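The ffmpeg command below expects a zero-padded %06d.png sequence; if test.lua's output filenames don't already follow that pattern, a bash renumbering pass (a sketch; it assumes the existing names already sort in frame order, since the glob sorts lexicographically) is:
i=1
for f in pix2pix/results/EXPERIMENT_NAME/latest_net_G_val/images/output/*.png; do
  mv "$f" "$(dirname "$f")/$(printf '%06d.png' "$i")"
  i=$((i + 1))
done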
To create an MP4 video of these images, run:
ffmpeg -r 30 -f image2 -s 1280x720 -i path/to/input/images/%06d.png -vcodec libx264 -crf 15 -pix_fmt yuv420p output.mp4
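Note that -s 1280x720 asks for a 1280x720 video even though the generated frames are 512x512; to keep the video at the frames' native square size, simply drop that flag:
ffmpeg -r 30 -f image2 -i path/to/input/images/%06d.png -vcodec libx264 -crf 15 -pix_fmt yuv420p output.mp4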