Sing2Song:
An Accompaniment Generation System Based on Solo Singing

Choi Sen Ho 1*, Chap Issac Fung 1*, Huicheng Zhang 1*, Yulun Wu 1*,
Yueqiao Zhang 1, Zhili Tan 1, Qiuqiang Kong 2, Lui Siu Hang 1, Yaolong Ju 3#

1 The Central Media Technology Institute, Huawei, China
2 The Chinese University of Hong Kong, Hong Kong SAR, China
3 Great Bay University, Dongguan, China
* Equal Contribution, # Corresponding Author

Project Overview:

In this project, we propose an enhanced pipeline addressing three key challenges:

Our framework combines the roles of improvisational accompanist, composer, and audio engineer, integrating:

This pipeline turns users' singing into studio-quality musical productions while preserving their unique expressive characteristics, a significant step toward personalized music generation with no limit on input length.

We demonstrate the effectiveness of our system through qualitative and quantitative evaluations, showing superior alignment, expressiveness, and stylistic consistency compared to state-of-the-art baselines.

Pipeline Diagram

Studio-Quality Singing Input Examples

Our system delivers precise Music Information Retrieval (MIR) on studio-quality vocal recordings and generates professional-grade MIDI accompaniments, ensuring high fidelity and musical coherence.

Our system also supports a wide range of musical styles, from pop and rock to R&B and traditional Chinese, showcasing its versatility and adaptability to different genres.

Audio columns: Vocal Input, Ballad, R&B, Funk, Chinese Traditional, DJ

Title | Artist
Air Traffic | Clara Berry And Wooldog
PunchDruck | Grants
Take a Step | Meaxic
Fire | Night Panther



Audio columns: Vocal Input, Ballad, R&B, Rock, Chinese Traditional

Title | Artist
Vermont | The Districts
Spacestation | Strand Of Oaks
Bounty | Steven Clark
Curfews | Snowmine

Amateur Singing Vocal Input Examples

Our system also handles amateur singing with performance variations, generating musically coherent accompaniments that enhance the overall listening experience.
The vocals below were recorded by amateur singers on smartphones in real-world conditions, demonstrating the system's robustness across different singing styles.

Audio columns: Vocal Input, Output (*Random Style)

Title | Title in Chinese
Around the Winter | 大约在冬季
Tales of the Red Cliff | 醉赤壁
Flower in a Mirror | 镜中花
Still in Love with You | 依然爱你


The system also accepts diverse kinds of input. The amateur recordings below include not only sung vocals but also hip-hop rap and traditional Chinese instrumental input.

Audio columns: Vocal Input, Ballad, R&B, Funk, DJ, Pure Piano

Title | Title in Chinese
The Rain in Qingming | 清明雨上
Pyrus Reblossom | 梨花又开放
Babe (*Rap) | --



Audio columns: Vocal Input, Ballad, R&B, Funk, DJ, Pure Piano

Title | Title in Chinese
Sky | 海阔天空
Dizi Solo: Trip to Gusu (*Instrumental) | 笛子独奏:姑苏行
Not a Hero | 不谓侠

Visit the GitHub repository for more details about this project.