ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis

Demo Page
Anonymous Authors
[Paper] | [Code Repo]


Abstract

Artificial Intelligence (AI) for music generation is undergoing rapid development, with recent symbolic models leveraging sophisticated deep learning and diffusion-model algorithms. One drawback of existing models is that they lack structural cohesion, particularly with respect to harmonic–melodic structure. Furthermore, such models are largely “black-box” in nature and are not musically interpretable. This paper addresses these limitations via a novel generative framework that incorporates concepts of Schenkerian analysis (SchA) in concert with diffusion modeling. This framework, which we call ProGress (Prolongation-enhanced DiGress), adapts state-of-the-art deep models for discrete diffusion (in particular, the DiGress model of Vignac et al., 2023) for interpretable and structured music generation.

Concretely, our contributions include:

  1. Novel adaptations of the DiGress model for music generation;
  2. A novel SchA-inspired phrase fusion methodology; and
  3. A framework allowing users to control various aspects of the generation process to create coherent musical compositions.

Results from human experiments suggest that ProGress outperforms existing state-of-the-art methods.
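
For readers unfamiliar with discrete graph diffusion, the sketch below illustrates the categorical forward-noising step at the heart of DiGress-style models: each node category (and, analogously, each edge category) is corrupted through a cumulative transition matrix Q̄_t, and a denoising network is trained to recover the clean categories. This is a minimal, illustrative sketch only; the pitch-class node vocabulary, the edge-type comment, and all names here are our own assumptions for exposition and are not taken from the ProGress implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumption: note nodes carry one of 12 pitch classes.
# (ProGress's actual node/edge vocabularies may differ.)
NUM_PITCH = 12

def uniform_transition(num_classes: int, beta: float) -> np.ndarray:
    """DiGress-style uniform transition matrix: keep the current category
    with probability (1 - beta), otherwise resample uniformly."""
    keep = (1.0 - beta) * np.eye(num_classes)
    resample = beta * np.ones((num_classes, num_classes)) / num_classes
    return keep + resample

def noise_one_hot(x: np.ndarray, q_bar: np.ndarray) -> np.ndarray:
    """Sample z_t ~ Cat(x @ Q̄_t) independently for each one-hot row.
    The same operation is applied to edge-type one-hots in DiGress."""
    probs = x @ q_bar
    samples = np.array([rng.choice(probs.shape[1], p=row) for row in probs])
    return np.eye(probs.shape[1])[samples]

# A tiny four-note "phrase": one-hot pitch classes for C, E, G, B.
X0 = np.eye(NUM_PITCH)[[0, 4, 7, 11]]

# Cumulative noise at step t, here a single uniform matrix with beta = 0.5.
Q_bar = uniform_transition(NUM_PITCH, beta=0.5)
Xt = noise_one_hot(X0, Q_bar)
print(Xt.argmax(axis=1))  # corrupted pitch classes at step t
```

In DiGress, a graph transformer learns to predict the clean node and edge categories from such corrupted graphs; a music-specific adaptation like ProGress would supply the musical node/edge vocabularies and structural constraints on top of this representation.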

ProGress Survey Examples


A Little Tango


We used our model, trained on Bach, to generate music with tango rhythms. Simply changing the timbre produced some nice results!