MR. Video: “MapReduce” is the Principle for Long Video Understanding

Map for dense short clip perception and Reduce for joint aggregation

Info

Comments

A simple solution: first generate a short caption for each clip (Map), then merge the duplicate captions across clips (Reduce).
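To make the two steps concrete, here is a minimal Python sketch of the idea as I read it. It is not the paper's implementation: `map_clips`, `reduce_captions`, and `toy_caption` are hypothetical names, `toy_caption` is a stand-in for a real captioning model, and exact-match deduplication is a simplification of whatever aggregation the authors actually use.

```python
from typing import Callable, List, Sequence


def map_clips(clips: Sequence[str], caption_fn: Callable[[str], str]) -> List[str]:
    """Map step: caption every short clip independently of the others."""
    return [caption_fn(clip) for clip in clips]


def reduce_captions(captions: Sequence[str]) -> List[str]:
    """Reduce step: drop duplicate captions, keeping first-seen order.

    Captions are compared after lowercasing and collapsing whitespace.
    This exact-match rule is an assumption made for illustration.
    """
    seen = set()
    merged = []
    for caption in captions:
        key = " ".join(caption.lower().split())
        if key not in seen:
            seen.add(key)
            merged.append(caption)
    return merged


def toy_caption(clip: str) -> str:
    """Hypothetical stand-in for a captioning model: uses the clip name prefix."""
    return f"A scene showing {clip.split('_')[0]}"


if __name__ == "__main__":
    clips = ["kitchen_001", "kitchen_002", "street_003"]
    captions = map_clips(clips, toy_caption)
    for line in reduce_captions(captions):
        print(line)
```

Because the Map step treats each clip independently, it can run in parallel over the whole video; only the Reduce step needs to see all captions together.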

Last updated: 2025-05-03