@m2-farzan
Last active September 26, 2021 08:13
Extract slides from a presentation video by sampling frames at a fixed interval and comparing the PSNR between consecutive sampled frames against a threshold
#!/bin/bash
set -e
# Usage: ./extract-slides.sh mylecture.mp4
# Config variables
SIMILARITY_THRESHOLD=25 # [dB] A slide change is assumed iff the average PSNR between two consecutive sampled frames falls below this value
FRAME_PERIOD=30 # [Seconds] Duration between sampling times
CONVERT_TO_PDF=0 # Set to 1 to make pdf. Might cause errors if imagemagick is not configured properly. See https://askubuntu.com/questions/1081895/trouble-with-batch-conversion-of-png-to-pdf-using-convert
PDF_FILE="${1%.*}-slides.pdf" # Only used if CONVERT_TO_PDF is 1
OUTPUT_PATH="${1%.*}-slides" # Only used if CONVERT_TO_PDF is 0
FRAMES_TEMP_DIRECTORY="/tmp/extract-slides/frames/${1%.*}"
SLIDES_TEMP_DIRECTORY="/tmp/extract-slides/slides/${1%.*}"
# Get directories ready
mkdir -p "$FRAMES_TEMP_DIRECTORY"
mkdir -p "$SLIDES_TEMP_DIRECTORY"
rm -f "$FRAMES_TEMP_DIRECTORY"/*.jpg
rm -f "$SLIDES_TEMP_DIRECTORY"/*.jpg
# Slice the video into frames
ffmpeg -i "$1" -vf "fps=1/$FRAME_PERIOD" "$FRAMES_TEMP_DIRECTORY/%04d.jpg"
# Detect slides
last_frame=NONE
current_slide_number=1
IFS=$'\n'
for frame in "$FRAMES_TEMP_DIRECTORY"/*.jpg # Glob instead of parsing ls output
do
    [ -e "$frame" ] || continue # Skip the literal pattern if no frames were extracted
    current_slide_name=$(printf "%04d.jpg" "$current_slide_number")
    if [ "$last_frame" == "NONE" ]
    then
        # The first frame always starts the first slide
        cp "$frame" "$SLIDES_TEMP_DIRECTORY/$current_slide_name"
    else
        # Integer part of the average PSNR between this frame and the previous one ("inf" if they are identical)
        similarity=$(ffmpeg -i "$frame" -i "$last_frame" -filter_complex "psnr" -f null /dev/null 2>&1 | grep PSNR | grep -Po "average:\K(\d+|inf)")
        if [ "$similarity" != "inf" ] && [ "$similarity" -lt "$SIMILARITY_THRESHOLD" ]
        then
            # New slide
            current_slide_number=$((current_slide_number + 1))
            current_slide_name=$(printf "%04d.jpg" "$current_slide_number")
            cp "$frame" "$SLIDES_TEMP_DIRECTORY/$current_slide_name"
        else
            # Old slide, unchanged or with small updates: overwrite it with the latest version
            cp "$frame" "$SLIDES_TEMP_DIRECTORY/$current_slide_name"
        fi
    fi
    last_frame="$frame"
done
# Convert to pdf
if [ "$CONVERT_TO_PDF" == "1" ]
then
    # Glob expands in sorted order, so pages follow slide numbering
    convert "$SLIDES_TEMP_DIRECTORY"/*.jpg "$PDF_FILE"
else
    mkdir -p "$OUTPUT_PATH"
    rm -f "$OUTPUT_PATH"/*.jpg
    cp "$SLIDES_TEMP_DIRECTORY"/* "$OUTPUT_PATH/"
fi
# Clear temp files
rm -rf "$FRAMES_TEMP_DIRECTORY"
rm -rf "$SLIDES_TEMP_DIRECTORY"
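
The slide-change decision in the script comes down to two steps: reading the average PSNR from ffmpeg's log output, then comparing it with the threshold. Below is a minimal sketch of those two steps as standalone functions, so they can be tried without a video. The function names `parse_average_psnr` and `is_slide_change` are hypothetical and not part of the script, the sample log line is illustrative rather than captured output, and `sed -E` is swapped in for `grep -P` on the assumption that PCRE support may be missing on some systems.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): read ffmpeg psnr log text
# on stdin and print the integer part of the "average:" field, or "inf" when
# the two compared frames are identical.
parse_average_psnr() {
    sed -nE 's/.*PSNR.*average:([0-9]+|inf).*/\1/p' | head -n 1
}

# Hypothetical helper: exit status 0 when the similarity indicates a slide
# change, i.e. it is finite and strictly below the threshold.
is_slide_change() {
    local similarity="$1" threshold="$2"
    [ "$similarity" != "inf" ] && [ "$similarity" -lt "$threshold" ]
}

# Illustrative log line in the shape the psnr filter prints (values made up).
sample='[Parsed_psnr_0 @ 0x0] PSNR y:31.2 u:40.0 v:41.0 average:33.456789 min:33.456789 max:33.456789'
similarity=$(printf '%s\n' "$sample" | parse_average_psnr)

if is_slide_change "$similarity" 25
then
    echo "new slide"
else
    echo "same slide"
fi
```

Keeping the comparison in its own function makes it easier to experiment with SIMILARITY_THRESHOLD: a lower value treats more frame pairs as the same slide, a higher value splits slides more eagerly.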