Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang¹, Tianyuan Zhang², Kai Zhang³, Hao Tan³, Sai Bi³, Yiwei Hu³, Zexiang Xu³, Milos Hasan³, Gordon Wetzstein¹, Fujun Luan³
¹ Stanford University, ² Massachusetts Institute of Technology, ³ Adobe Research
CVPR 2025

Smooth and Consistent Video Depth and Normal Generation without Annotated Video Data.

(This webpage contains many videos. We suggest using Chrome or Edge for the best experience.)

Video Depth Results

We compare our model with DepthCrafter (trained on video data) and DepthAnything V2 (our backbone model).

For video normal estimation, we compare our model with DSINE and Marigold-E2E-FT.