Alibaba publishes a patent related to training video generation models

K-LinePoet · 2026-04-10T07:02:03+00:00

Alibaba recently filed a patent related to video generation methods and model training, aiming to improve the matching between target objects and audio in videos by extracting visual and audio features, thereby enhancing video quality.

According to the Qichacha app, a patent application from Alibaba (China) Co., Ltd. titled “Video Generation Method, Training Method for Video Generation Model, and Task Platform” was recently published.

According to the patent abstract, the embodiment provides a video generation method, a training method for a video generation model, and a task platform. The video generation method comprises: obtaining a reference image and reference audio, where the reference image contains at least the visual information of a reference object; extracting visual features of the reference object from the visual information in the reference image, and extracting audio features from the reference audio; predicting, from the interaction characteristics between the visual features and the audio features, reference action information of a target object under the influence of the audio features, where the target object is derived from the reference object; and generating a video of the target object from the reference action information and the reference audio. The abstract states that this method improves how well the target object's visual information matches the corresponding audio in the video, which in turn improves the video's presentation.
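The four steps in the abstract can be pictured as a small data-flow sketch. The abstract does not disclose any model architecture, so everything below is an illustrative assumption: the function names are hypothetical, the feature extractors are toy statistics rather than learned encoders, the "interaction" step is modeled as simple cross-attention (each audio frame attends over the visual tokens), and the target object is taken to be the reference object itself.

```python
import math
from typing import Dict, List

Vector = List[float]


def extract_visual_features(image: List[List[float]]) -> List[Vector]:
    """Toy stand-in for a visual encoder: one token (mean, max) per image row."""
    return [[sum(row) / len(row), max(row)] for row in image]


def extract_audio_features(audio: Vector, frame_size: int) -> List[Vector]:
    """Toy stand-in for an audio encoder: (mean |amplitude|, peak) per frame."""
    frames = [audio[i:i + frame_size] for i in range(0, len(audio), frame_size)]
    return [[sum(abs(s) for s in f) / len(f), max(abs(s) for s in f)] for f in frames]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_actions(visual: List[Vector], audio_feats: List[Vector]) -> List[Vector]:
    """Assumed interaction step: each audio frame attends over the visual tokens.

    The result is one "action" vector per audio frame, a weighted mix of
    visual tokens whose weights depend on the audio features.
    """
    dim = len(visual[0])
    actions = []
    for a in audio_feats:
        scores = [sum(ai * vi for ai, vi in zip(a, v)) for v in visual]
        weights = softmax(scores)
        actions.append(
            [sum(w * v[d] for w, v in zip(weights, visual)) for d in range(dim)]
        )
    return actions


def generate_video(actions: List[Vector], audio: Vector, frame_size: int) -> List[Dict]:
    """Pair each action vector with its audio slice.

    A real system would render pixels here; this sketch only keeps the
    per-frame alignment between predicted action and audio.
    """
    return [
        {"action": act, "audio": audio[i * frame_size:(i + 1) * frame_size]}
        for i, act in enumerate(actions)
    ]


def run_pipeline(image: List[List[float]], audio: Vector, frame_size: int) -> List[Dict]:
    """Chain the four steps: obtain inputs, extract features, predict actions, generate."""
    visual = extract_visual_features(image)
    audio_feats = extract_audio_features(audio, frame_size)
    actions = predict_actions(visual, audio_feats)
    return generate_video(actions, audio, frame_size)
```

Calling `run_pipeline([[0.0, 1.0], [0.5, 0.5]], [0.1, -0.2, 0.3, -0.4], frame_size=2)` yields two output frames, each carrying an action vector and the audio slice it was predicted from. The point of the sketch is only the ordering of the steps: the action prediction depends on both modalities, and the audio is reused at generation time so that motion and sound stay aligned.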
