Info
- Title: Seed1.5-VL Technical Report
- Group: ByteDance
- Keywords: VLM, foundation model
- Venue: arXiv
Comments
- Dynamic frame rate for video (preset for different tasks)
- Dynamic resolution for image (rule-based dynamic token budget allocation)
- Timestamp tokens included in the video input
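
To make the three points above concrete, here is a minimal Python sketch of how they might fit together. It is not the report's implementation. The preset values, the token levels, the timestamp string format, and all function names (`sample_timestamps`, `tokens_per_frame`, `build_video_sequence`) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: every name and number below is an assumption,
# not taken from the Seed1.5-VL report.

# Hypothetical per-task frame-rate presets (frames per second).
FPS_PRESETS = {"general": 1.0, "temporal": 2.0, "motion": 5.0}

# Hypothetical per-frame token levels, from highest to lowest resolution.
TOKEN_LEVELS = (640, 512, 384, 256, 160, 128)


def sample_timestamps(duration_s, task):
    """Return frame timestamps (seconds) sampled at the task's preset rate."""
    fps = FPS_PRESETS[task]
    n_frames = int(duration_s * fps)
    return [i / fps for i in range(n_frames)]


def tokens_per_frame(n_frames, total_budget):
    """Rule-based allocation: pick the largest level that fits the budget."""
    for level in TOKEN_LEVELS:
        if n_frames * level <= total_budget:
            return level
    # Nothing fits; a real system would have to drop frames here.
    return TOKEN_LEVELS[-1]


def build_video_sequence(duration_s, task, total_budget):
    """Interleave a timestamp token before each frame placeholder."""
    timestamps = sample_timestamps(duration_s, task)
    per_frame = tokens_per_frame(len(timestamps), total_budget)
    sequence = []
    for t in timestamps:
        sequence.append(f"[{t:.1f} second]")       # timestamp token (format assumed)
        sequence.append(("frame", t, per_frame))   # placeholder for visual tokens
    return sequence
```

For example, `build_video_sequence(3, "temporal", 2000)` samples six frames at 2 fps and assigns each the largest token level that keeps the total within the budget. The same budget rule could be applied to a single image by treating it as one frame.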