ALGORITHMIC
MUSIC
GENERATION
Padmaja V Bhagwat
3rd year, B.Tech, IT, NITK Surathkal. 01
PyCon India 2016, 25th September 2016
Unnati Data Labs (www.unnati.xyz)
Acknowledgement:
My team members
1. Chandana NT
2. Anirudh Sriram
3. Subramaniam Thirunavakkarasu
Under the guidance of
1. Nischal HP
2. Bargava Subramanian
3. Amit Kapoor
4. Raghotam Sripadraj 02
What are
Artificial
Neural
Networks?
 It is a biologically-inspired programming paradigm
that enables a computer to learn from
observational data.
 It resembles the human brain in two main ways:
1. A neural network acquires knowledge through
learning.
2. A neural network’s knowledge is stored in the
interconnection strengths, known as synaptic
weights.
03
Using AI to
generate
music.
Can you
tell the
difference?
sample #1
sample #2
04
How to
make a
computer
generate
music?
Step 1: Converting MP3 files to NumPy tensors
Step 2: Training the model
Step 3: Generating the music
05
Uncompress
the music:
MP3 to
monaural
WAV using
LAME
import os
from shlex import quote

cmd = 'lame -a -m m {0} {1}'.format(
    quote(filename), quote(filename_tmp))
os.system(cmd)
06
Converting
to a NumPy
array.
The samples are divided into blocks of equal
size, zero-padding the last block.
07
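The blocking step above can be sketched in NumPy. This is a minimal illustration rather than the project's actual code; the block size and toy input are placeholders:

```python
import numpy as np

def to_blocks(samples, block_size):
    """Split a 1-D sample array into equal-size blocks,
    zero-padding the final block if it comes up short."""
    blocks = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        if len(block) < block_size:
            block = np.pad(block, (0, block_size - len(block)))
        blocks.append(block)
    return np.stack(blocks)

blocks = to_blocks(np.arange(10, dtype=np.float32), block_size=4)
# blocks.shape == (3, 4); the last block is [8, 9, 0, 0]
```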
Convert
from time to
frequency
domain
Convert from the time domain to its
corresponding frequency domain using a
Discrete Fourier Transform.
08
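A sketch of this transform using NumPy's FFT. The block length and the trick of concatenating real and imaginary parts (so the model sees only real-valued vectors) are assumptions here, not necessarily the exact project code:

```python
import numpy as np

block_size = 4096  # assumed block length
t = np.arange(block_size) / 44100.0
block = np.sin(2 * np.pi * 440 * t).astype(np.float32)  # toy 440 Hz tone

# Discrete Fourier Transform: time domain -> frequency domain.
spectrum = np.fft.fft(block)

# The spectrum is complex; split it into real and imaginary parts
# so the network can be trained on real-valued vectors.
features = np.concatenate((np.real(spectrum), np.imag(spectrum)))
# features has 2 * block_size real values
```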
Can we
use
traditional
RNN?
Drawback: they cannot retain information for
long periods of time.
RNNs have loops.
 This allows information to persist.
 Each neuron accepts input from the
previous layer as well as from the previous
neuron in the same layer.
09
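The loop described above can be written as a single recurrence: the hidden state at time t depends on the current input and the previous hidden state. A minimal vanilla-RNN step, with made-up toy dimensions:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: the same weights are reused at every
    time step, and h_prev is how information persists across steps."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Toy sizes: 3 input features, 2 hidden units.
W_xh, W_hh, b_h = np.zeros((3, 2)), np.zeros((2, 2)), np.zeros(2)
h = rnn_step(np.ones(3), np.zeros(2), W_xh, W_hh, b_h)
# with zero weights, h is the zero vector
```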
Can we
make it
remember
for longer
period of
time?
Yes, we can! By using
LSTM networks.
10
What are
LSTM
Networks?
Long Short-Term Memory
A special type of RNN,
capable of learning long-term dependencies.
Well-suited to learning from experience to
classify, process and predict time series when
there are very long time lags of unknown size
between important events.
Explicitly designed to avoid the long-term
dependency problem.
11
Let’s
understand
its
architecture.
12
This is how it looks!
A separate vector, called the
cell state, is dedicated to
remembering information.
Advantage: the number of
parameters it needs to
learn is smaller than in
traditional networks.
13
Step 1:  The first step in an LSTM is to decide what
information to throw away from the cell
state.
 This decision is made by a sigmoid layer called the
“forget gate layer”.
14
Step 2:  Decide what new information we’re going to store
in the cell state.
 This has two parts:
1. A sigmoid layer called the “input gate layer”
decides which values we’ll update.
2. A tanh layer creates a vector of new candidate
values, C̃t, that could be added to the state.
15
Step 3:
 Update the old cell state Ct−1 to the new cell state Ct.
16
Step 4:  Decide what we’re going to output.
 A sigmoid layer decides which parts of the cell state
to output.
 The updated cell state vector is passed through a
tanh layer and multiplied by the output of the
sigmoid gate.
17
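Steps 1–4 above correspond to the standard LSTM equations (as presented in the Colah blog post cited in the references):

```latex
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate (Step 1)} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate (Step 2)} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate values (Step 2)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell-state update (Step 3)} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate (Step 4)} \\
h_t &= o_t \odot \tanh(C_t) && \text{new hidden state (Step 4)}
```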
Training
the
model:
 Input data is fed to the input of the network,
which then fires all the corresponding neurons.
 The LSTM generates a sequence of notes which is
compared against the expected output, and the
errors are back-propagated, adjusting the
parameters learnt by the LSTM.
After shifting
18
The target vector used for computing the loss function is the same as the input, but shifted by one block.
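The shift-by-one-block target can be illustrated with a toy array (the shapes here are made up; each row stands in for one FFT block):

```python
import numpy as np

# Toy corpus: 5 blocks of 3 features each.
data = np.arange(15, dtype=np.float32).reshape(5, 3)

# The input is every block but the last; the target is the same
# sequence shifted forward by one block, so at each step the
# network is trained to predict the *next* block.
X = data[:-1]
y = data[1:]
# X[i] is block i, y[i] is block i+1
```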
Training
the
model:
 The optimizer used is RMSProp and the loss
function is mean squared error.
 The learnt parameters are stored in a file that is
then used in the generation phase.
19
model = network_utils.create_lstm_network(
    num_frequency_dimensions=freq_space_dims,
    num_hidden_dimensions=hidden_dims)  # hidden_dims = 1024
Generation
phase:
 Step 1 - Given A = [X0, X1, ..., Xn], generate Xn+1.
 Step 2 - Append Xn+1 to A.
 Step 3 - Repeat Steps 1 and 2 as many times as needed,
depending on the length of the music piece the user
wishes to generate.
# We take the first chunk of the training data as the seed sequence
seed_len = 1
seed_seq = seed_generator.generate_copy_seed_sequence(
    seed_length=seed_len, training_data=X_train)

# This defines how long the final song is.
# Total song length in samples = max_seq_len * example_len
max_seq_len = 10
output = sequence_generator.generate_from_seed(
    model=model, seed=seed_seq, sequence_length=max_seq_len,
    data_variance=X_var, data_mean=X_mean)
20
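The three-step generate-append loop can be sketched generically; the predictor below is a toy stand-in for the trained LSTM, not the project's model:

```python
import numpy as np

def generate(predict_next, seed, n_steps):
    """Repeatedly predict the next block from the sequence so far
    and append it, as in Steps 1-3 above."""
    seq = list(seed)
    for _ in range(n_steps):
        seq.append(predict_next(np.array(seq)))
    return seq

# Toy predictor: the "next block" is simply the last value plus one.
out = generate(lambda s: s[-1] + 1, seed=[0.0], n_steps=3)
# out == [0.0, 1.0, 2.0, 3.0]
```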
Challenges!  The music that is generated has to be unique, without
infringing copyright.
 It has to sound good, and what sounds good is very
subjective.
 Unlike a traditional math problem, this cannot be solved
by a set of formulae.
 It takes a lot of time to train the model to make it
generate good music.
21
How long
does it
take?
Well, it requires time…
 After about 100 iterations over 20
different music files.
 After 2000 iterations over 20
different music files.
22
Python
Libraries
used:
• Keras version 0.1.0 with Theano as the backend.
• NumPy and SciPy for various mathematical
computations on tensors.
• Matplotlib for visualizing the input.
• LAME and SoX to convert MP3 files into other
formats such as WAV.
23
The entire code is on GitHub.
https://github.com/unnati-xyz/music-generation
24
Key notes
for using
this code to
generate
your own
music:
Step 1: Convert the given MP3 files into NumPy
tensors.
python convert_directory.py  # creates
YourMusicLibraryNP_x.npy, YourMusicLibraryNP_y.npy
Step 2: Train the model.
python train.py  # creates the LSTM model
Step 3: Generate the music.
python generate.py  # the generated music is stored
in a file named generated_song.wav
25
References:
 http://colah.github.io/posts/2015-08-Understanding-LSTMs/
 http://karpathy.github.io/2015/05/21/rnn-effectiveness/
 https://cs224d.stanford.edu/reports/NayebiAran.pdf
 https://www.quora.com/What-is-machine-learning-in-laymans-terms-1
 http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
Fun at UNNATI! 
THANK YOU! 
CC-Attribution-ShareAlike
https://creativecommons.org/licenses/by-sa/4.0/
www.linkedin.com/in/padmajavb