Gate News, March 30: Alibaba's Qianwen (Qwen) team announced the launch of Qwen3.5-Omni, an omni-modal large model. The series includes Instruct versions in three sizes: Plus, Flash, and Light. It supports a 256K long-context window and accepts more than 10 hours of audio input, or more than 400 seconds of 720p audio-visual input (at 1 FPS). The model was natively pre-trained as a multimodal model on large-scale text and visual data plus more than 100 million hours of audio-visual data, and is described as delivering strong omni-modal perception and generation capabilities. Compared with the previous-generation Qwen3-Omni, Qwen3.5-Omni greatly improves multilingual capability, supporting speech recognition in 113 languages and dialects and speech generation in 36 languages and dialects.