Yun-Han Lan, Wen-Yi Hsiao, Hao-Chung Cheng, Yi-Hsuan Yang, "MusiConGen: Rhythm and chord control for Transformer-based text-to-music generation," in Proc. Int. Society for Music Information Retrieval Conf. (ISMIR), 2024.
We introduce MusiConGen, a Transformer-based text-to-music model finetuned from the pretrained MusicGen-melody framework. It applies temporal conditioning to enhance control over rhythm and chords.
The following samples are generated by our proposed MusiConGen model. We condition on chord and rhythm controls in symbolic representation, paired with text descriptions in 5 genres. The chords and BPMs are taken from the labeled data of RWC-pop-100. The chords of the generated samples are estimated by the BTC chord recognition model, as described in the paper.
Reference Chords
Generated Sample's Chords
description | A laid-back blues shuffle with a relaxed tempo, warm guitar tones, and a comfortable groove, perfect for a slow dance or a night in. Instruments: electric guitar, bass, drums. | A smooth acid jazz track with a laid-back groove, silky electric piano, and a cool bass, providing a modern take on jazz. Instruments: electric piano, bass, drums. | A classic rock n' roll tune with catchy guitar riffs, driving drums, and a pulsating bass line, reminiscent of the golden era of rock. Instruments: electric guitar, bass, drums. | A high-energy funk tune with slap bass, rhythmic guitar riffs, and a tight horn section, guaranteed to get you grooving. Instruments: bass, guitar, trumpet, saxophone, drums. | A heavy metal onslaught with double kick drum madness, aggressive guitar riffs, and an unrelenting bass, embodying the spirit of metal. Instruments: electric guitar, bass guitar, drums. |
Sample 001 | |||||
Sample 002 | |||||
Sample 003 | |||||
Sample 004 | |||||
Sample 005 |
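The comparison between the reference chords and the chords estimated from a generated sample can be illustrated with a small sketch. It assumes each chord track is a list of `(start_sec, end_sec, label)` segments and counts the fraction of fixed-size frames whose labels match exactly. The segment format, the function names, and the 0.1 s hop are all hypothetical choices made for illustration; this is not the evaluation code or the exact metric used in the paper.

```python
from typing import List, Tuple

# Hypothetical segment format: (start_sec, end_sec, chord_label)
Segment = Tuple[float, float, str]


def chord_at(segments: List[Segment], t: float) -> str:
    """Return the chord label active at time t, or "N" (no chord) if none."""
    for start, end, label in segments:
        if start <= t < end:
            return label
    return "N"


def framewise_agreement(reference: List[Segment],
                        estimated: List[Segment],
                        hop: float = 0.1) -> float:
    """Fraction of frames whose reference and estimated labels match exactly.

    Frames are sampled at their centers over the duration of the reference.
    """
    if not reference:
        return 0.0
    duration = max(end for _, end, _ in reference)
    n_frames = int(round(duration / hop))
    if n_frames == 0:
        return 0.0
    matches = sum(
        chord_at(reference, (i + 0.5) * hop) == chord_at(estimated, (i + 0.5) * hop)
        for i in range(n_frames)
    )
    return matches / n_frames


# Toy example: the estimate deviates from the reference in the last second.
reference = [(0.0, 2.0, "C:maj"), (2.0, 4.0, "G:maj")]
estimated = [(0.0, 2.0, "C:maj"), (2.0, 3.0, "G:maj"), (3.0, 4.0, "A:min")]
print(framewise_agreement(reference, estimated))
```

Exact label matching is the strictest possible criterion. A real evaluation would more likely compare chords at a coarser vocabulary (for example, root and major/minor quality only).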
The following samples are generated by the baseline model, the finetuned baseline model, and our proposed MusiConGen model. We use the same set of chord and rhythm controls for all models to compare the finetuning methods.
Reference Chords
Generated Sample's Chords
models | Reference audio | Baseline model | Finetuned baseline model | Our proposed model |
Sample 001 | ||||
Sample 002 | ||||
Sample 003 | ||||
Sample 004 | ||||
Sample 005 |
The template design of this demo page is inspired by Coco-Mulla from Music X Lab.