
Categories
2025
Φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Memory-enhanced Retrieval Augmentation for Long Video Understanding

VACE: Video Tasks within an All-in-one Framework for Creation and Editing

Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models

MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems

Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

LeCun's talk on Advanced AI

OPTISHEAR: Towards Efficient and Adaptive Pruning of Large Language Models via Evolutionary Optimization

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Knowledge Bridger: Towards Training-Free Missing Modality Completion
