VFX artist heydoughogan has introduced a ComfyUI workflow that automatically generates utility passes (mattes, depth maps, and normal maps) from video and still images, replacing routine rotoscoping with neural-network segmentation.


What Happened

The workflow combines RMBG models for background removal, SAM3 for text- and point-prompted object segmentation, and specialized tools for face segmentation. It also generates depth maps and normal maps automatically, and adapts to the resolution and frame rate of the source material.
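
To make the relationship between the depth and normal passes concrete, here is a minimal sketch of one classical way to estimate normals from a depth map using finite differences. This is an assumption for illustration only: the source does not say how the workflow computes normals, and it may well use a dedicated neural estimator. The function name `normals_from_depth` and the `strength` parameter are hypothetical.

```python
import math


def normals_from_depth(depth, strength=1.0):
    """Estimate unit normals from a depth map given as a list of rows of floats.

    Illustrative only (not the workflow's actual method). Uses central
    differences in the interior and one-sided differences at the borders.
    `strength` is a hypothetical factor that exaggerates or softens slopes.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clamp neighbour indices so border pixels use one-sided differences.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            # The surface z = d(x, y) has the unnormalized normal (-dz/dx, -dz/dy, 1).
            nx, ny, nz = -dzdx * strength, -dzdy * strength, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            row.append((nx / length, ny / length, nz / length))
        normals.append(row)
    return normals
```

A flat depth map yields normals pointing straight at the camera, (0, 0, 1), while a depth ramp tilts them away from the direction of increasing depth. Sign and axis conventions for normal passes differ between tools, so the output would need remapping to match a given compositing package.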

Context

Creating technical layers such as mattes, depth maps, and normal maps has traditionally required a great deal of time and manual work from rotoscoping specialists. Running state-of-the-art segmentation models within a single pipeline makes these tasks semi-automatic.

Why It Matters for the Industry

The tool makes high-quality compositing more widely accessible and could significantly speed up VFX production pipelines. Automating the preparation of technical layers also paves the way for specialized AI agents in visual effects and could change industry standards for asset preparation.

Why It Matters for Users

Videographers, CG artists, and small production studios can now obtain high-quality masks and depth maps for compositing and relighting quickly and cheaply. These tasks previously required entire teams of specialists or lengthy manual work.
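
As an illustration of what a generated mask is used for downstream, the sketch below applies the standard matte blend, out = fg * a + bg * (1 - a), to place a foreground over a new background. It is a minimal pure-Python example under stated assumptions, not part of the workflow itself: images are nested lists of RGB tuples in the 0 to 1 range, the matte is a list of rows of floats, and the function name `composite_over` is hypothetical.

```python
def composite_over(fg, bg, matte):
    """Blend a foreground over a background using a single-channel matte.

    fg, bg: lists of rows of (r, g, b) tuples with values in [0, 1].
    matte:  list of rows of floats; 1.0 keeps the foreground, 0.0 the background.
    """
    out = []
    for fg_row, bg_row, a_row in zip(fg, bg, matte):
        out_row = []
        for f, b, a in zip(fg_row, bg_row, a_row):
            a = min(max(a, 0.0), 1.0)  # clamp matte values to the valid range
            out_row.append(tuple(fc * a + bc * (1.0 - a) for fc, bc in zip(f, b)))
        out.append(out_row)
    return out


# Usage: a 1x3 red foreground over a blue background with a soft-edged matte.
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
result = composite_over([[red, red, red]], [[blue, blue, blue]], [[1.0, 0.0, 0.5]])
```

Soft matte values between 0 and 1 give blended edge pixels, which is why matte quality along hair and motion-blurred edges matters so much in compositing.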

What Is Not Yet Known / Limitations

Production use of the workflow in its current form is limited by the lack of an API, the need for powerful local GPU hardware, and the absence of precise data on processing latency.

Author

Look at AI, Editorial Team