Daily Publication: MR. Video: “MapReduce” is the Principle for Long Video Understanding. Map for dense short-clip perception and Reduce for joint aggregation.
Daily Publication: Φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation. Foresight sampling for better efficiency and accuracy.
Daily Publication: Memory-enhanced Retrieval Augmentation for Long Video Understanding. Train a memory model for long video understanding.
Daily Publication: VACE: Video Tasks within an All-in-one Framework for Creation and Editing. A unified model for video generation and editing.
Daily Publication: Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models. Train an LLM to build a multi-agent system.