Jin Zhe jin-zhe

THIS IS A WORK IN PROGRESS

An overview of recent action recognition datasets and their detection classes

Action: Atomic low-level movement such as standing up, sitting down, walking, talking, etc.
Activity/event: A higher-level occurrence than an action, such as dining, playing, or dancing
Trimmed video: A short video clip containing the event/action/activity of interest
Untrimmed video: A video clip of arbitrary length potentially containing durations without activities of interest
Localization: Locating an instance of an event/action/activity within a video, either spatially or temporally
Spatial localization: Locating the region/area of an instance of action/activity within a video

	"""
	Intended usage scenario:
	You have a directory of pdfs, each comprising sequential image scans of
	human-annotated documents (e.g. written questionnaires/forms/exams) where
	every document shares the same number of pages. Each pdf may contain a
	different number of such scanned documents. You want to split all these pdfs
	up into smaller pdfs at fixed page index intervals such that each smaller
	pdf corresponds to a single scanned document. In addition, you want to
	place them under a specific output directory while ensuring no filename
	collisions.