Daily Publication | Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics | Evaluation benchmark accepted by ICLR
Daily Publication | Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas | Spatial reasoning in VLMs using attention visualization
Daily Publication | OPTISHEAR: Towards Efficient and Adaptive Pruning of Large Language Models via Evolutionary Optimization | Adaptive pruning
Daily Publication | Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models | A good story-telling paper
Daily Publication | Knowledge Bridger: Towards Training-Free Missing Modality Completion | Uses graph theory to bridge modalities