
AI system turns a song into a complete music video

02.02.26 | Queen Mary University of London

AutoMV, a collaboration led by researchers at Queen Mary University of London, is the first open-source AI system capable of generating complete music videos directly from full-length songs.

To date, generative AI has struggled to create video based on music. While recent video models can produce visually impressive short clips, they tend to be less successful with long-form storytelling, musical alignment and character consistency.

Now, a new system uses songs as the basis for generating complete music videos.

AutoMV, developed by Yinghao Ma, a PhD student at Queen Mary University of London, under the supervision of Dr Emmanouil Benetos, addresses these shortcomings with a multi-agent AI system designed specifically for full-length music video production.

AutoMV works like a virtual film production team. First, it analyses a song’s musical structure, beats, and time-aligned lyrics. Then, a set of specialised AI agents — taking on roles such as screenwriter, director, and editor — collaborate to plan scenes, maintain character identity, and generate images and video clips. A final quality-control “verifier” agent checks for coherence and consistency, regenerating content where needed.
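
The pipeline described above can be pictured as a loop over song sections, with each agent handling one stage and a verifier deciding whether a clip needs to be regenerated. The Python sketch below is a minimal, purely illustrative outline of that idea; the data structures, function names, and single-retry logic are assumptions made for clarity and are not taken from AutoMV's released code.

```python
from dataclasses import dataclass, field

# Illustrative data structures only; not AutoMV's actual implementation.

@dataclass
class SongAnalysis:
    sections: list   # e.g. ["intro", "verse 1", "chorus", ...]
    beats: list      # beat timestamps in seconds
    lyrics: list     # (timestamp, lyric line) pairs

@dataclass
class Scene:
    section: str
    description: str
    characters: list = field(default_factory=list)

def analyse_song(audio_path: str) -> SongAnalysis:
    """Stand-in for music analysis: structure, beats, time-aligned lyrics."""
    return SongAnalysis(
        sections=["intro", "verse 1", "chorus", "verse 2", "chorus", "outro"],
        beats=[i * 0.5 for i in range(480)],
        lyrics=[(12.0, "first line"), (16.0, "second line")],
    )

def screenwriter_agent(analysis: SongAnalysis) -> list:
    """Plan one scene per song section, reusing the same character."""
    return [Scene(section=s, description=f"Scene for {s}",
                  characters=["lead singer"])
            for s in analysis.sections]

def director_agent(scene: Scene) -> str:
    """Turn a scene plan into a video-generation prompt (placeholder)."""
    return f"{scene.description}, featuring {', '.join(scene.characters)}"

def verifier_agent(clip_prompt: str, scene: Scene) -> bool:
    """Quality check: here, simply confirm the recurring character appears."""
    return all(c in clip_prompt for c in scene.characters)

def produce_music_video(audio_path: str) -> list:
    analysis = analyse_song(audio_path)
    prompts = []
    for scene in screenwriter_agent(analysis):
        prompt = director_agent(scene)
        # If the verifier rejects the clip, regenerate it (one retry here).
        if not verifier_agent(prompt, scene):
            prompt = director_agent(scene)
        prompts.append(prompt)
    return prompts

if __name__ == "__main__":
    for p in produce_music_video("song.wav"):
        print(p)
```

In the real system, the director stage would call a video-generation model and the verifier would inspect the rendered clip; the sketch only shows how the agent roles hand work to one another.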

This approach allows AutoMV to produce music videos that follow a song from beginning to end, maintaining narrative flow and visual identity throughout. Human expert evaluations show that AutoMV significantly outperforms commercial AI video generation tools already on the market, narrowing the gap between AI-generated videos and professionally produced music videos.

By lowering the cost of music video production from tens of thousands of pounds to roughly the cost of an API call (or digital instruction), AutoMV has the potential to empower independent musicians, educators, and creators who previously lacked access to professional video production. As an open-source project, it also supports transparent, reproducible research and encourages community collaboration.

Yinghao Ma, a PhD student in the Centre for Digital Music at Queen Mary who was recently awarded a Google fellowship, said: “Producing a full music video that follows a whole song has been difficult for AI systems. With AutoMV, we demonstrate that this can be done in a structured and coherent way. I am particularly pleased that this work makes music video creation more accessible to independent artists and enables them to share their work on YouTube.”

Developed through a collaboration between Queen Mary researchers and long-standing partners of nearly 20 years at Beijing University of Posts and Telecommunications, as well as partners at Nanjing University, Hong Kong University of Science and Technology, and the University of Manchester, AutoMV brings together expertise in music information retrieval, multimodal AI, and creative computing. The work was led by Queen Mary’s Yinghao Ma, with supervision from Dr Emmanouil Benetos, Dr Changjae Oh, and Chaoran Zhu from the university’s Centre for Intelligent Sensing.

The team is actively inviting researchers and students to contribute to the codebase, extend the benchmark, and explore future directions for long-form, multimodal AI systems.

Contact Information

Katy Taylor-Gooby
Queen Mary University of London
k.taylor-gooby@qmul.ac.uk

How to Cite This Article

APA:
Queen Mary University of London. (2026, February 2). AI system turns a song into a complete music video. Brightsurf News. https://www.brightsurf.com/news/1GRM49W8/ai-system-turns-a-song-into-a-complete-music-video.html
MLA:
"AI system turns a song into a complete music video." Brightsurf News, 2 Feb. 2026, https://www.brightsurf.com/news/1GRM49W8/ai-system-turns-a-song-into-a-complete-music-video.html.