@Glidias
Last active July 31, 2025 04:03

IFV2V VACE File-based Task Batch Management Workflow parameters

The purpose of this ComfyUI workflow (Image/Frame(s)/Video To Video process) is to manage batch generation across multiple file items found in a folder, in mixed image/video file formats (including the PSD file format, where layers define frames), through a unified VACE workflow. (Note: this can be swapped out for other types of workflows, but VACE was used because it is the most versatile and fits all use cases under the V2V/I2V/FLF2V umbrella.)

Required assets:

The source frames asset

This sets up VACE's control video. Various file formats are supported (see the last section below). These files are all located in the main source folder specified in the workflow, so you can switch to any other folder as needed.

  • .src. (filename suffix) for an image/video/folder-of-images source asset. Any file with this filename indicator will be detected as a batch item to be processed, named as {stem}.src.{file-extension}.
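
The naming convention above can be sketched in Python as follows. This is only an illustration of how a {stem} might be pulled out of a `{stem}.src.{file-extension}` name; the function name `source_stem` is hypothetical, and the sketch assumes a single-part file extension (it does not cover the folder-of-images case, whose exact naming is not spelled out here).

```python
from pathlib import Path
from typing import Optional

SRC_MARKER = ".src."  # the filename indicator described above


def source_stem(filename: str) -> Optional[str]:
    """Return the {stem} of a `{stem}.src.{ext}` name, or None otherwise.

    Hypothetical helper: assumes a single-part extension such as `mp4`.
    """
    name = Path(filename).name
    stem, marker, ext = name.rpartition(SRC_MARKER)
    # Reject names without the marker, with an empty stem or extension,
    # or with a multi-part extension (an assumption of this sketch).
    if not marker or not stem or not ext or "." in ext:
        return None
    return stem
```

For example, `source_stem("clip01.src.mp4")` yields `"clip01"`, while a plain `notes.txt` is not treated as a batch item.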

index.json asset

Define a path to an index.json file as specified in the workflow. This is simply a convention for defining the JSON settings of every task item in one global file. It is a JSON object of key-value pairs where each key is a {stem} and each value is an object holding that item's JSON parameters: "{stem}": {...}.

A default fallback {stem} can be defined in the workflow in the event no JSON spec is found in relation to the source file stem, allowing you to set a global fallback default for specific tasks.
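
A minimal Python sketch of that lookup is shown below, assuming the fallback replaces (rather than merges with) a missing entry, as the description suggests. The stems `default` and `clip01`, the parameter values, and the function name `resolve_params` are all made up for illustration; the actual workflow nodes may resolve this differently.

```python
import json
from typing import Any, Dict

# Hypothetical index.json content: each key is a {stem}.
index: Dict[str, Dict[str, Any]] = json.loads("""
{
  "default": {"positive": "", "fps": 16},
  "clip01":  {"positive": "a red kite over the sea", "cnet": 0}
}
""")


def resolve_params(
    index: Dict[str, Dict[str, Any]], stem: str, fallback_stem: str = ""
) -> Dict[str, Any]:
    """Return the parameters for `stem`, else those of the fallback stem."""
    if stem in index:
        return index[stem]
    # No entry for this stem: use the fallback entry if one exists.
    return index.get(fallback_stem, {})
```

With this sketch, `resolve_params(index, "unknown", "default")` returns the `default` entry, and an unknown stem with no fallback yields an empty parameter set.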

Optional assets:

Any {stem} assets are located in the same place as the respective source assets, which is the main source folder specified in the workflow.

  • {stem}.txt: For defining JSON parameters in a .txt file alongside the source asset file for the respective item, rather than using the global index.json stem definitions.
    • NOTE: You can also turn on "Enable Global TXT search" in the workflow (and specify a root folder path) to lift the restriction that the {stem}.txt file must always be in the same location as the source asset file. This allows you to keep TXT JSON files in separate folder locations. With this option enabled, the workflow recursively searches that root folder for .txt files as a final global fallback, matching \{stem}.txt or /{stem}.txt (you can customise the slash depending on your OS). If multiple matches are found, it picks the first matching one, but the list order is not guaranteed (ie. it depends on your OS settings) unless you customise the workflow yourself to sort the list. In practice this is the first match returned by the OS file-system walker, ie. the closest depth-first search match, which can be considered predictable enough for most users.
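
The lookup order described above (local {stem}.txt first, then the recursive global search) could be sketched in Python like this. It is not the workflow's actual node logic: the function name `find_stem_txt` is hypothetical, and the sketch deliberately sorts the matches so that the pick is deterministic, which is the customisation the note says the stock workflow does not do.

```python
from pathlib import Path
from typing import Optional


def find_stem_txt(
    stem: str, source_dir: Path, search_root: Optional[Path] = None
) -> Optional[Path]:
    """Locate {stem}.txt next to the source asset, else under search_root."""
    local = source_dir / f"{stem}.txt"
    if local.is_file():
        return local
    if search_root is None:
        # Global TXT search disabled: nothing else to try.
        return None
    wanted = f"{stem}.txt"
    # Compare names exactly instead of globbing on the stem, so stems
    # containing glob characters such as [ or * are still matched literally.
    matches = [
        p for p in search_root.rglob("*.txt") if p.is_file() and p.name == wanted
    ]
    # Sorting is an assumption of this sketch to make the result repeatable.
    matches.sort(key=lambda p: p.as_posix())
    return matches[0] if matches else None
```

Sorting by path string is only one possible tie-break; sorting by folder depth or modification time would work equally well if that suits your folder layout better.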

Masking conventions:

You can define the preferred default pseudo-mask flat color (a single grayscale value) within the workflow itself. White is used, but some might prefer or need a mid-gray tone (like 128 or 127). I'm not sure which is best, or whether it really makes a difference in some cases. (TODO: JSON parameter per item for this)

Intrinsic masking (from source images):

Any transparent regions in the alpha channel (if present) of source video/image frames are always treated as an enforced foreground mask, painted with the default pseudo-mask flat colour, that always sits on top in the control video. Such alpha-transparent areas are merged into VACE's control mask for inpainting.

Control Masking:

(Optional) Masked areas under the assets listed below may have controlnet layer coloring applied in the control video, depending on controlnet settings. The resulting mask is merged into VACE's control mask for inpainting.

  • {stem}.mask. (filename suffix) for image/video/folder of images masking asset.

Extra masking:

(Optional) Extra masking that lies behind the Control Masking (if any) and is painted with the default pseudo-mask colour in the control video. This is good for erasing areas (especially backgrounds) or defining blanked-out external masks without editing the source image files. It ignores any controlnet settings by default. The resulting mask is merged into VACE's control mask for inpainting.

  • "extramask": (Optional) (JSON parameter) Absolute file/folder path to the extra masking asset.

Other optional JSON parameters per item:

  • "positive": The positive prompt to use/add to the existing workflow.
  • "negative": The negative prompt to prepend to the existing workflow. (Typically the WAN defaults, and any custom defaults of your own, can be added on your end in the workflow itself.)
  • "reference_image": (Optional) Absolute file path to image file asset for VACE reference image. Leave it as a blank string "" to automatically use the first frame of the source asset as the VACE reference image. Leave the field out to ensure no reference image is used.
  • Cnet (controlnet) settings: The workflow has a default cnet setting for region-based masking with mask assets, but you can override it with a cnet parameter.
    • "cnet": 0 Do not use any controlnet presets. Always replace any masked areas with the pseudo-mask flat color instead.
    • "cnet": >= 1 (Use controlnet preset N for masked areas or entire masked control frames.) See the workflow's control masking group node switch for options, or customise it accordingly. If this value is explicitly specified in the JSON parameters but no extramask or mask assets are found/specified, it is assumed that controlnet coloring presets are to be applied across the entire set of source frames, and the entire set of source frames will be masked for restyling replacement.
    • "cnet": -1 (Do not replace any masked areas with controlnet presets or process them for output.) The source video/frames are assumed to be already baked with the intended controlnet colorings. If this value is explicitly specified in the JSON parameters but no extramask or mask assets are found/specified, it is also assumed that the entire set of source frames will be masked for restyling replacement.
    • For V2V extension of video, make sure you leave the "cnet" JSON setting undefined (ie. unspecified) or zero, so that the original source video remains untouched even when no mask assets are specified/found for the VACE control mask.
    • "cnet_replace_black_bg": (TODO/CONSIDER) When masking cnet portions with specific assets, set this to false to avoid replacing any fully black portions with the default pseudo-mask flat color. By default, the workflow has this set to true, but only if mask or extra mask assets are being used. Consider changing the default to false, or removing this setting entirely if possible. Which default is better? Do black portions in semi-masked inpainting areas with controlnet coloring risk being unintentionally rendered rather than replaced? The issue is that some cnet colorings, like Canny edges, use white/gray over black and don't have this applied.
  • "fps": Forced fps setting for loaded videos, whether it be source/mask/control videos. If left undefined, uses the workflow default.
  • IF2V Frame length control: The total frame length must be a multiple of 4 plus 1 (ie. 4n + 1, such as 81 or 97).
    • "frames": (Optional) Keyframe spec settings as an array of string-literal numbers, defining how many extra blank frames to add after each given source keyframe. Including this spec indicates that some form of I2V/FLF2V (IF2V) process is intended, potentially with multiple keyframes.
    • eg. I2V: ["80"] for 80 extra blank frames after the first keyframe for a total of 81 frames.
    • eg. FLF2V: ["79", "0"] for 79 extra blank frames after the first keyframe and zero extra frames after the last keyframe for a total of 81 frames.
    • eg. IF2V: ["63", "32"] for 63 extra blank frames after the first keyframe and 32 extra blank frames after the last keyframe for a total of 97 frames.
  • V2V Frame length control: The total resulting video frame length after these adjustments must be a multiple of 4 plus 1 (ie. 4n + 1).
    • "load_cap": If no mask/extra mask assets are found, this caps the frame count of the source video at a non-zero amount. Negative values are supported, to truncate relative to the end of the image frame sequence. Useful for V2V cutting of undesirable video portions, alongside video extension.
    • "skip_index": If no mask/extra mask assets are found, this skips (shifts away) an initial length of source video frames from the start. Negative values are supported, to set the starting source index relative to the end of the image frame sequence. Useful for V2V video extension by omitting the initial starting frames.
    • "extra_control_video": Absolute path to an extra control video file asset (it is assumed all controlnet colors are already baked into this video and no further processing is required). Use this to append extra control video frames after the initial set of source control frames.
    • "extra_frames": Use this in V2V cases to extend the entire video sequence with the stipulated amount of extra blank frames after the generated source control frames (eg. "extra_frames": 64). This is ignored if "frames": is being used, but it is still used as extra blank frames to add after "extra_control_video" (if found).
      • This can also be used to set up T2V video generation cases of varying lengths, by using a single blank transparent webp/png image as the starting blank source image frame along with an extra_frames count that is a multiple of 4.
    • "extra_load_cap": If loading an extra control video via "extra_control_video", this caps the frame count of the extra control video at a non-zero amount. Negative values are supported, to truncate relative to the end of the video frame sequence.
    • "extra_skip_index": If loading an extra control video via "extra_control_video", this skips (shifts away) an initial length of frames from the start of the extra control video. Negative values are supported, to set the starting source index relative to the end of the video frame sequence.
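
The frame-length arithmetic for the "frames" spec can be checked with a small Python sketch. It assumes, as the examples above imply, that each entry contributes one keyframe plus its stated number of blank frames; the helper names `total_frames_from_spec` and `is_valid_length` are hypothetical and not part of the workflow.

```python
from typing import List


def total_frames_from_spec(frames: List[str]) -> int:
    """Total frames for a "frames" spec: one keyframe plus N blanks per entry."""
    return sum(1 + int(n) for n in frames)


def is_valid_length(total: int) -> bool:
    """True when the total is a multiple of 4 plus 1 (4n + 1)."""
    return total >= 1 and (total - 1) % 4 == 0


# The three specs quoted in the list above.
for spec in (["80"], ["79", "0"], ["63", "32"]):
    total = total_frames_from_spec(spec)
    print(spec, total, is_valid_length(total))
```

The same `is_valid_length` check applies to V2V totals after "load_cap", "skip_index", and "extra_frames" adjustments; for the T2V case, one blank source frame plus an "extra_frames" count that is a multiple of 4 always satisfies it.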

Outpainting JSON settings:

TODO/CONSIDER: Currently this isn't essential, since manually expanding the source images/videos via separate workflows/programs would suffice. It would be a good possibility to have, though.

File extension formats' conventions for assets:

  • psd: Photoshop file format
    • Each visible layer is treated as a single source frame, arranged in bottom-up order.
    • (Optional) Each layer named 'mask', hidden as opposed to the visible layers, is treated as the respective Control Masking/.mask. image frame, arranged in bottom-up order.
    • (Optional) Each layer named 'extramask', hidden as opposed to the visible layers, is treated as the respective Extra masking/"extramask": image frame, arranged in bottom-up order.
    • If the first visible layer has a numeric layer name (eg. 80), it is assumed that all visible layers use numeric layer names to define the respective "frames": keyframe settings. This lets you define IF2V/FLF2V/I2V frame durations entirely within the single PSD file, which can be more convenient to keep track of than external JSON.
  • webp|png|jpg|jpeg: Image file formats
    • To define image-based frames. The alpha transparency channel in image frames is converted to a mask directly (ie. the transparent regions imply masking areas for inpainting, so the mask is the inverted alpha channel). When used as a mask asset under a .mask. file or an "extramask": JSON-specified file path, the color channels are for personal reference only and are ignored by the batching engine. Masking with jpg/jpeg is not supported here, because the image-based frame convention requires an alpha channel.
  • Folder of images (ie. no file extension): Folder containing image-based frames, which are arranged in alphabetical order.
  • webm|mp4|avi|mov|gif|apng: Video file formats
    • To define frames in a video container. When used as a mask asset under a .mask. file or an "extramask": JSON-specified file path, always use a lossless black-and-white video (since most video formats typically have no alpha channel), so that the black and white colors in the video frames can be converted to a mask directly (ie. the white regions imply masking areas for inpainting, and the black regions imply areas that won't be masked).
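
The two mask conventions above (inverted alpha for images, white-means-masked for videos) can be illustrated with a tiny Python sketch on one row of 8-bit pixel values. The function names are hypothetical, and whether the workflow keeps soft values or thresholds them to a hard mask is not stated here, so this sketch simply keeps the 0.0 to 1.0 range.

```python
from typing import List


def mask_from_alpha(alpha_row: List[int]) -> List[float]:
    """Image convention: transparent (alpha 0) means masked (1.0)."""
    return [1.0 - a / 255.0 for a in alpha_row]


def mask_from_luma(gray_row: List[int]) -> List[float]:
    """Video convention: white (255) means masked (1.0), black means unmasked."""
    return [g / 255.0 for g in gray_row]
```

So a fully transparent image pixel and a pure white video pixel both end up as a fully masked inpainting area.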

Glidias commented Jul 30, 2025

Examples: advanced mixed image/video controlnet hybrid compositions

Reference image with pre-baked Controlnet motion control (single image source)

  • "reference_image": "", "extra_control_video": "...", "skip_index": 1
  • may add "extra_frames" to freely extend video

Reference image with source Controlnet motion control (single video source)

  • "reference_image": "...specify actual path to image....". Use "cnet": -1 if the source video is already pre-baked with controlnet colors; otherwise, explicitly specify a positive "cnet" value to ensure one of the controlnet coloring presets is used for the source video.
  • may add "extra_frames" to freely extend video

Starting image frame with pre-baked Controlnet motion control over other frames (single image/video source)

  • "extra_control_video": "..."
  • may use "extra_skip_index" of 1 or more if needed, depending on the situation, with a respective "extra_load_cap": -1 if the loaded video was itself generated from the initial starting image frame and you wish to regenerate the later frames while recycling the previously generated motion.
  • Use "skip_index": 1 and "load_cap": 1 if loading the first frame from a video source.
  • may also use "reference_image": "" if you want to re-emphasise the first frame as the reference image
  • may add "extra_frames" to freely extend video

Starting video with pre-baked Controlnet motion control over other frames (single video source)

  • "extra_control_video": "..."
  • may also use "reference_image": "" if you want to re-emphasise the first frame as the reference image
  • may add "extra_frames" to freely extend video

IF2V keyframes with pre-baked Controlnet motion control video extension (single set of images source)

  • "frames": ["31", "0"], "extra_control_video": "..."
  • may add "extra_frames" to freely extend video

IF2V keyframes with source Controlnet restyling for whole keyframes (single set of images source)

  • "frames": ["79", "0"], "cnet": 2 (ie. your preferred >=1 cnet preset)
  • Instead of zero extra frames at the end, you may use a positive value to freely extend the video length by that amount.

Actual mixed image/video compositions within the source frames are not supported. You may be better off arranging your own video animation sequence entirely, with the transparency channel and duplicated frames kept (eg. setting up an animated PNG sequence). Krita is quite good at this.
