MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation

Yun-Han Lan, Wen-Yi Hsiao, Hao-Chung Cheng, Yi-Hsuan Yang, "MusiConGen: Rhythm and chord control for Transformer-based text-to-music generation," in Proc. Int. Society for Music Information Retrieval Conf. (ISMIR), 2024.
We introduce MusiConGen, a Transformer-based text-to-music model finetuned from the pretrained MusicGen-melody framework. It applies temporal conditioning to enhance control over rhythm and chords.

[paper]
[code]

Chord, BPM and text conditions

The following samples are generated by our proposed MusiConGen model. We use chord and rhythm controls in symbolic representation, together with text descriptions covering 5 genres, to generate these samples. The chords and BPMs are taken from the labeled data of RWC-pop-100. The chords of the generated samples are estimated by the BTC chord recognition model, as described in the paper.
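As a rough illustration of how reference chords and the chords estimated from a generated sample could be compared, the sketch below computes a simple frame-wise agreement score between two chord label sequences. The names `ChordSegment`, `label_at`, and `frame_agreement`, the `C:maj`-style labels, and the 0.1 s hop are all assumptions made for this example. It is not the BTC model's API and not necessarily the metric used in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChordSegment:
    """One symbolic chord annotation (hypothetical structure for this sketch)."""
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    label: str    # chord label, e.g. "C:maj" (label format is an assumption)


def label_at(segments: List[ChordSegment], t: float) -> str:
    """Return the chord label active at time t, or "N" (no chord) if none."""
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg.label
    return "N"


def frame_agreement(reference: List[ChordSegment],
                    estimated: List[ChordSegment],
                    duration: float,
                    hop: float = 0.1) -> float:
    """Fraction of fixed-size frames whose reference and estimated labels match.

    Labels are sampled at the center of each frame to avoid boundary ties.
    """
    n_frames = int(round(duration / hop))
    if n_frames == 0:
        return 0.0
    matches = sum(
        label_at(reference, (i + 0.5) * hop) == label_at(estimated, (i + 0.5) * hop)
        for i in range(n_frames)
    )
    return matches / n_frames


# Toy data (made up for illustration, not taken from RWC-pop-100).
reference = [ChordSegment(0.0, 2.0, "C:maj"), ChordSegment(2.0, 4.0, "G:maj")]
estimated = [
    ChordSegment(0.0, 2.0, "C:maj"),
    ChordSegment(2.0, 3.0, "A:min"),
    ChordSegment(3.0, 4.0, "G:maj"),
]
score = frame_agreement(reference, estimated, duration=4.0)
print(f"frame-wise chord agreement: {score:.2f}")
```

In this toy case the estimated chords disagree with the reference for one of the four seconds, so three quarters of the frames match.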


Reference Chords

Generated Sample's Chords

Chord (rows) / text description (columns)
A laid-back blues shuffle with a relaxed tempo, warm guitar tones, and a comfortable groove, perfect for a slow dance or a night in. Instruments: electric guitar, bass, drums.
A smooth acid jazz track with a laid-back groove, silky electric piano, and a cool bass, providing a modern take on jazz. Instruments: electric piano, bass, drums.
A classic rock n' roll tune with catchy guitar riffs, driving drums, and a pulsating bass line, reminiscent of the golden era of rock. Instruments: electric guitar, bass, drums.
A high-energy funk tune with slap bass, rhythmic guitar riffs, and a tight horn section, guaranteed to get you grooving. Instruments: bass, guitar, trumpet, saxophone, drums.
A heavy metal onslaught with double kick drum madness, aggressive guitar riffs, and an unrelenting bass, embodying the spirit of metal. Instruments: electric guitar, bass guitar, drums.
Sample 001
Sample 002
Sample 003
Sample 004
Sample 005

Finetuning mechanism comparison

The following samples are generated by the baseline model, the finetuned baseline model, and our proposed MusiConGen model. We use the same set of chord and rhythm controls for all three models to compare the finetuning methods.


Reference Chords

Generated Sample's Chords

Chord (rows) / model (columns)
Reference audio | Baseline model | Finetuned baseline model | Our proposed model
Sample 001
Sample 002
Sample 003
Sample 004
Sample 005

Acknowledgement

The template design of this demo page is inspired by Coco-Mulla from Music X Lab.