Benchmarking

Video Transforms

Transforming videos can be more complex, so we provide the following helper functions for building video transforms.

perception.benchmarking.video_transforms.get_black_frame_padding_transform(duration_s=0, duration_pct=0)

Get a transform that adds black frames at the start and end of a video.

Parameters:
  • duration_s – The duration of the black frames in seconds.

  • duration_pct – The duration of the black frames as a percentage of video duration. If both duration_s and duration_pct are provided, the maximum value is used.
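
The rule for combining duration_s and duration_pct can be sketched in plain Python. This is an illustration of the documented "maximum value" behavior, not the library's implementation: the helper name padding_duration_s and its video_duration_s argument are hypothetical, and it assumes duration_pct is expressed as a fraction of the video duration (0.25 meaning 25%), which the text above does not confirm.

```python
def padding_duration_s(video_duration_s, duration_s=0, duration_pct=0):
    """Return the assumed length, in seconds, of each black-frame pad.

    Hypothetical helper for illustration only; not part of perception.
    """
    # Assumption: duration_pct is a fraction of the duration (0.25 == 25%).
    from_pct = duration_pct * video_duration_s
    # Documented rule: when both values are provided, the larger one is used.
    return max(duration_s, from_pct)


# A 40-second video with duration_s=2 and duration_pct=0.25:
# the percentage-based value (10 s) is larger, so it is used.
print(padding_duration_s(40.0, duration_s=2, duration_pct=0.25))
```

Under these assumptions, a fixed duration_s acts as a floor for short videos, while duration_pct dominates for long ones.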

perception.benchmarking.video_transforms.get_simple_transform(width=-1, height=-1, pad=None, codec=None, clip_pct=None, clip_s=None, sar=None, fps=None, output_ext=None)

Get a transform that resizes a video to a specific size and re-encodes it.

Parameters:
  • width (str | int) – The target width (-1 to maintain aspect ratio)

  • height (str | int) – The target height (-1 to maintain aspect ratio)

  • pad (str | None) – An ffmpeg pad argument provided as a string.

  • codec (str | None) – The codec for encoding the video.

  • fps – The new frame rate for the video.

  • clip_pct (tuple[float, float] | None) – The video start and end in percentages of video duration.

  • clip_s (tuple[float, float] | None) – The video start and end in seconds (used over clip_pct if both are provided).

  • sar – The sample aspect ratio to apply to all videos, so that they share a common value (e.g., set this to ‘1/1’ to give every video square pixels).

  • output_ext – The extension to use when re-encoding (used to select video format). It should include the leading ‘.’.
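
The precedence between clip_s and clip_pct can be sketched as follows. This is a minimal illustration of the documented rule, not the library's code: resolve_clip and video_duration_s are hypothetical names, and the sketch assumes clip_pct values are fractions of the video duration (0.25 meaning 25%), which is an assumption rather than something stated above.

```python
def resolve_clip(video_duration_s, clip_pct=None, clip_s=None):
    """Return (start_s, end_s) for the clip, or None if no clip is requested.

    Hypothetical helper for illustration only; not part of perception.
    """
    if clip_s is not None:
        # Documented rule: clip_s is used over clip_pct if both are provided.
        return (float(clip_s[0]), float(clip_s[1]))
    if clip_pct is not None:
        # Assumption: percentages are fractions of the duration (0.25 == 25%).
        start, end = clip_pct
        return (start * video_duration_s, end * video_duration_s)
    return None


# Middle half of an 80-second video, expressed as fractions.
print(resolve_clip(80.0, clip_pct=(0.25, 0.75)))
# When both are given, the clip in seconds wins.
print(resolve_clip(80.0, clip_pct=(0.25, 0.75), clip_s=(5, 15)))
```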

perception.benchmarking.video_transforms.get_slideshow_transform(frame_input_rate, frame_output_rate, max_frames=None, offset=0)

Get a slideshow transform to create slideshows from videos.

Parameters:
  • frame_input_rate – The rate at which frames will be sampled from the source video (e.g., a rate of 1 means we collect one frame per second of the input video).

  • frame_output_rate – The rate at which the sampled frames are played in the slideshow (e.g., a rate of 0.5 means each frame will appear for 2 seconds).

  • max_frames – The maximum number of frames to write.

  • offset – The number of seconds to wait before beginning the slideshow.
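
The arithmetic behind the two rates can be sketched in plain Python. This is a rough estimate under stated assumptions, not the library's implementation: estimate_slideshow and source_duration_s are hypothetical names, and the sketch assumes that offset skips the first offset seconds of the source video before sampling begins, which is only one possible reading of the parameter description.

```python
import math


def estimate_slideshow(source_duration_s, frame_input_rate, frame_output_rate,
                       max_frames=None, offset=0):
    """Estimate (number of frames, slideshow duration in seconds).

    Hypothetical helper for illustration only; not part of perception.
    """
    # Assumption: sampling starts `offset` seconds into the source video.
    usable_s = max(source_duration_s - offset, 0)
    # frame_input_rate frames are sampled per second of source video.
    n_frames = math.floor(usable_s * frame_input_rate)
    if max_frames is not None:
        # max_frames caps how many frames are written.
        n_frames = min(n_frames, max_frames)
    # Each frame is shown for 1 / frame_output_rate seconds.
    return n_frames, n_frames / frame_output_rate


# One frame per second from a 10-second video, each shown for 2 seconds.
print(estimate_slideshow(10, frame_input_rate=1, frame_output_rate=0.5))
```

With frame_input_rate=1 and frame_output_rate=0.5, the estimated slideshow is twice as long as the sampled portion of the source, since each sampled frame is held for 2 seconds.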