[2021-11-19] Dr. Yi-Hsuan Yang, Taiwan AI Labs, "Automatic Music Generation with Transformers"

Post date: 2021-09-29
The 11/19 talk will be held online; Room R103 will not be open that day. Please join on time via the Cisco Webex platform.

Automatic Music Generation with Transformers
Date: 2021-11-19 2:20pm-3:30pm
Speaker: Dr. Yi-Hsuan Yang, Taiwan AI Labs
Hosted by: Prof. Lung-Pan Cheng


In this talk, I will first give a brief overview of recent deep learning-based approaches to automatic music generation. I will then present our own research, which employs self-attention-based architectures, a.k.a. Transformers, for music generation. A naive approach would treat music as a sequence of text-like tokens, but our research shows that Transformers generate higher-quality music when music is not treated simply as text. In particular, our "Pop Music Transformer" model, published at ACM Multimedia 2020, employs a novel beat-based representation that informs self-attention models of the bar-beat metrical structure present in music. This greatly improves the rhythmic structure of the generated music. A more recent model, the "Compound Word Transformer" published at AAAI 2021, exploits the fact that a musical note carries multiple attributes, such as pitch, duration, and velocity. Instead of predicting the tokens for these attributes one by one at inference time, the Compound Word Transformer predicts them jointly, greatly reducing the sequence length needed to model a full-length song and making it easier to model the dependencies among the attributes.
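To make the sequence-length argument concrete, the following is a minimal sketch (all names and the exact token layout are illustrative, not the authors' actual code) contrasting a flat token-per-attribute encoding, with explicit bar/beat markers as in a beat-based representation, against a compound-word encoding that groups one note's attributes into a single step.

```python
# Hypothetical note list: (bar, beat position, pitch, duration in beats, velocity).
notes = [
    (1, 1, 60, 1.0, 80),
    (1, 2, 64, 1.0, 72),
    (1, 3, 67, 2.0, 90),
]

# Flat encoding: every attribute becomes its own token, so each note
# costs several sequence positions, plus bar/beat marker tokens that
# expose the metrical structure to the self-attention model.
flat = []
last_bar = None
for bar, pos, pitch, dur, vel in notes:
    if bar != last_bar:
        flat.append(("Bar", bar))
        last_bar = bar
    flat.append(("Beat", pos))
    flat.append(("Pitch", pitch))
    flat.append(("Duration", dur))
    flat.append(("Velocity", vel))

# Compound-word encoding: the same attributes are grouped into a single
# step, so the model can predict them jointly and the sequence shrinks
# by roughly the number of attributes per note.
compound = [("Note", bar, pos, pitch, dur, vel)
            for bar, pos, pitch, dur, vel in notes]

print(len(flat), len(compound))  # the compound sequence is much shorter
```

For a full-length song with thousands of notes, this per-note grouping is what makes the reduction in sequence length substantial.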
Dr. Yi-Hsuan Yang is currently the Chief Music Scientist at Taiwan AI Labs, where he leads a team working on music AI technologies. He also holds a position as an Associate Research Fellow at the Research Center for IT Innovation, Academia Sinica. He received his Ph.D. degree in Communication Engineering from National Taiwan University in 2010. His main research activity lies at the crossroads of music information retrieval and machine learning, in recent years focusing on GAN- and Transformer-based automatic music generation. His team developed well-known music AI models such as MidiNet, MuseGAN, and the Pop Music Transformer.

Dr. Yang was a recipient of the 2011 IEEE Signal Processing Society Young Author Best Paper Award, the 2014 Ta-You Wu Memorial Research Award of the Ministry of Science and Technology, Taiwan, the 2015 Young Scholars' Creativity Award from the Foundation for the Advancement of Outstanding Scholarship, and the 2019 Multimedia Rising Stars Award from the IEEE International Conference on Multimedia & Expo. He is the author of the book Music Emotion Recognition (CRC Press, 2011). He was a Technical Program Co-Chair of the International Society for Music Information Retrieval Conference (ISMIR) in 2014, and served as an Associate Editor for the IEEE Transactions on Multimedia and the IEEE Transactions on Affective Computing, both from 2016 to 2019. Dr. Yang is a senior member of the IEEE.
Last modified: 2021-11-17 4:49 PM
