Train and evaluate two Support Vector Regression (SVR) models, one predicting arousal and one predicting valence. Use Essentia's MusicExtractor to compute summarized descriptor values (no frames). Use Scikit-learn for the Support Vector Regression. It will be useful to pre-process all features first, standardizing them to zero mean and unit variance.
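For reference, a minimal sketch of the feature-extraction step could look as follows; the directory layout, file pattern, and output folder are assumptions about how the dataset is organized, not part of the assignment.

```python
# A minimal sketch (paths and file pattern are assumptions): analyze every track
# with MusicExtractor, keeping only mean/stdev statistics, and save the
# aggregated descriptors of each track as a JSON file.
import glob
import os

from essentia.standard import MusicExtractor, YamlOutput

AUDIO_DIR = 'Essentia_mood_task/audio'   # hypothetical dataset location
JSON_DIR = 'descriptors'                 # output folder for per-track JSON files
os.makedirs(JSON_DIR, exist_ok=True)

extractor = MusicExtractor(lowlevelStats=['mean', 'stdev'],
                           rhythmStats=['mean', 'stdev'],
                           tonalStats=['mean', 'stdev'])

for audio_file in sorted(glob.glob(os.path.join(AUDIO_DIR, '*.mp3'))):
    features, _frames = extractor(audio_file)   # the second pool holds frame-level values
    track_id = os.path.splitext(os.path.basename(audio_file))[0]
    YamlOutput(filename=os.path.join(JSON_DIR, track_id + '.json'),
               format='json')(features)
```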
Write a report in the form of a Python notebook. The notebook should:
- Analyze all audio in the dataset using MusicExtractor and store the aggregated analysis results in JSON files
- Load descriptors computed for all files, and load arousal/valence annotations
- You can first convert the JSON files to CSV if you like
- Use lowlevel.*, rhythm.*, and tonal.* descriptors
- For summarized descriptors, only consider *.mean and *.stdev
- Ignore non-numerical descriptors
- Standardize all features
- Split dataset into two subsets:
- tracks 1-1700 for training
- tracks 1701-... for testing
- Train the models on the training subset
- Predict arousal/valence for the testing subset and evaluate the predictions using regression metrics (see the sketches after this list)
Report and discuss evaluation results.
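A minimal loading sketch is given below. It assumes the per-track JSON files produced above are named after the track id (e.g. 1.json) and that the annotations come as a CSV file with track_id, arousal and valence columns; both names are hypothetical.

```python
# A minimal loading sketch; 'descriptors/*.json' and 'annotations.csv' are
# hypothetical names, and the track id is assumed to be encoded in the file name.
import glob
import json
import os

import pandas as pd


def load_descriptors(json_file):
    """Flatten one MusicExtractor JSON file into numerical lowlevel/rhythm/tonal
    *.mean and *.stdev values, ignoring non-numerical descriptors."""
    with open(json_file) as f:
        data = json.load(f)
    row = {}
    for section in ('lowlevel', 'rhythm', 'tonal'):
        for name, value in data.get(section, {}).items():
            if not isinstance(value, dict):          # summarized stats live in sub-dicts
                continue
            for stat in ('mean', 'stdev'):
                v = value.get(stat)
                if isinstance(v, (int, float)):
                    row['%s.%s.%s' % (section, name, stat)] = v
                elif isinstance(v, list) and all(isinstance(x, (int, float)) for x in v):
                    # Vector descriptors (e.g. MFCC) become one column per bin.
                    for i, x in enumerate(v):
                        row['%s.%s.%s.%d' % (section, name, stat, i)] = x
    return row


json_files = sorted(glob.glob('descriptors/*.json'))
track_ids = [int(os.path.splitext(os.path.basename(f))[0]) for f in json_files]

features = pd.DataFrame([load_descriptors(f) for f in json_files], index=track_ids)
features = features.dropna(axis=1)    # drop descriptors that are missing for some tracks

# Hypothetical annotation file with columns: track_id, arousal, valence.
annotations = pd.read_csv('annotations.csv', index_col='track_id')
data = features.join(annotations, how='inner')
```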
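Standardization, the fixed train/test split and the two SVR models could then be sketched as follows; hyper-parameters are left at scikit-learn defaults and would normally be tuned on the training subset.

```python
# A minimal modelling sketch operating on the hypothetical `data` DataFrame built above.
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

train = data.loc[data.index <= 1700]    # tracks 1-1700
test = data.loc[data.index > 1700]      # remaining tracks

feature_cols = [c for c in data.columns if c not in ('arousal', 'valence')]

# Fit the scaler on the training subset only, then apply it to both subsets.
scaler = StandardScaler().fit(train[feature_cols])
X_train = scaler.transform(train[feature_cols])
X_test = scaler.transform(test[feature_cols])

for target in ('arousal', 'valence'):
    model = SVR()                       # one SVR per target; default RBF kernel
    model.fit(X_train, train[target])
    predictions = model.predict(X_test)
    print(target,
          'MAE = %.3f' % mean_absolute_error(test[target], predictions),
          'R2 = %.3f' % r2_score(test[target], predictions))
```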
Deliverable: a Python notebook with code and report. Include text explaining all your steps, decisions, and a discussion of the results (submissions with code only and no explanation are not enough!)
Dataset: https://www.dropbox.com/s/tlxihautrx2hqd0/Essentia_mood_task.zip?dl=0
Alternative link for download: http://essentia.upf.edu/documentation/tmp/Essentia_mood_task.zip
MIR Course Docker image including Essentia python wrapper: https://github.com/MTG/MIRCourse
- List of music features computed by MusicExtractor: http://essentia.upf.edu/documentation/streaming_extractor_music.html
- http://scikit-learn.org/
- http://scikit-learn.org/stable/modules/svm.html#regression
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR.fit
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR.predict
- http://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling
- http://scikit-learn.org/stable/modules/model_evaluation.html#regression-metrics
- http://essentia.upf.edu/documentation/FAQ.html#converting-descriptor-files-to-csv
Train and evaluate a Support Vector Machine (SVM) classifier for genre prediction. Use Essentia's MusicExtractor to compute summarized descriptor values (no frames). Use Scikit-learn for the SVM classifier. It will be useful to pre-process all features first, standardizing them to zero mean and unit variance.
Write a report in the form of a Python notebook. The notebook should:
- Analyze all audio in the dataset using MusicExtractor and store the aggregated analysis results in JSON files
- Load descriptors computed for all files, and load genre annotations (all audio files are inside folders whose names correspond to genres)
- You can first convert the JSON files to CSV if you like
- Use lowlevel.*, rhythm.*, and tonal.* descriptors
- For summarized descriptors, only consider *.mean and *.stdev
- Ignore non-numerical descriptors
- Standardize all features
- Split the dataset into two balanced subsets: 80% for training, 20% for testing (make sure that the testing subset contains an equal number of tracks from each genre)
- Train the classifier on the training subset
- Predict genre for the testing subset and evaluate the predictions: compute classification accuracy (% of correct predictions) and the confusion matrix (see the sketches after this list)
Report and discuss evaluation results.
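The feature-extraction step is the same as in the previous task; the only difference is that the ground truth is read from the directory layout. A small sketch for collecting (file, genre) pairs, assuming one sub-folder per genre under a hypothetical dataset directory:

```python
# A minimal sketch; the dataset path is a placeholder.
import glob
import os

AUDIO_DIR = 'Essentia_genre_task/audio'          # hypothetical dataset location

files, genres = [], []
for genre_dir in sorted(glob.glob(os.path.join(AUDIO_DIR, '*'))):
    if not os.path.isdir(genre_dir):
        continue
    for audio_file in sorted(glob.glob(os.path.join(genre_dir, '*.*'))):
        files.append(audio_file)
        genres.append(os.path.basename(genre_dir))   # folder name = genre label
```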
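With a feature matrix built as in the regression task and the genre list from the sketch above, the balanced split, the classifier and the evaluation could look as follows. Note that stratified sampling keeps the class proportions of the full dataset, so if the dataset holds the same number of tracks per genre, the testing subset ends up with an equal number of tracks for each genre.

```python
# A minimal sketch; `features` (a DataFrame of numerical descriptors) and
# `genres` (one label per row) are the hypothetical variables built earlier.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = features.values
y = np.asarray(genres)

# 80/20 split; stratify=y keeps the genre proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Fit the scaler on the training subset only and apply it to both subsets.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

clf = SVC()                                   # default RBF kernel; C/gamma could be tuned
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

labels = sorted(set(y))
print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred, labels=labels))
```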
Deliverable: a Python notebook with code and report. Include text explaining all your steps, decisions, and a discussion of the results (submissions with code only and no explanation are not enough!)
Dataset: https://www.dropbox.com/s/gljckh3nff55r9c/Essentia_genre_task.zip?dl=0
Alternative link for download: http://essentia.upf.edu/documentation/tmp/Essentia_genre_task.zip
MIR Course Docker image including Essentia python wrapper: https://github.com/MTG/MIRCourse
- List of music features computed by MusicExtractor: http://essentia.upf.edu/documentation/streaming_extractor_music.html
- http://scikit-learn.org/
- http://scikit-learn.org/stable/modules/svm.html#classification
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC.fit
- http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC.predict
- http://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling
- http://scikit-learn.org/stable/modules/model_evaluation.html#accuracy-score
- http://scikit-learn.org/stable/modules/model_evaluation.html#confusion-matrix
- http://essentia.upf.edu/documentation/FAQ.html#converting-descriptor-files-to-csv
Analyze 20 varied music tracks from your collection. Split the tracks into segments spanning multiple beats (segments of 1 beat, 2 beats, 4 beats, and 8 beats). Compute average MFCC mean/stdev values for each segment using Essentia and cluster the segments using Scikit-learn (discard the 0th MFCC coefficient, as it is correlated with loudness, not timbre). Decide which segmentation (number of beats per segment) is appropriate. For each cluster, generate an audio file with the corresponding segments. The number of beats in a segment and the number of clusters to consider should be configurable in the script.
Write a Python script using Essentia that:
- Loads audio
- Estimates beat positions
- Cuts audio into segments (Slicer algorithm)
- Computes MFCCs for the frames in each segment and summarizes them to mean/stdev values (using PoolAggregator)
- Clusters segments using Scikit-learn (k-means clustering)
- Writes audio files with segments from each cluster (using AudioWriter); a sketch of the full pipeline follows this list
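A minimal sketch of the whole pipeline for a single track is given below. File names, frame sizes and the two configurable parameters are placeholders; because the audio is loaded in mono with MonoLoader, MonoWriter (the mono counterpart of AudioWriter) is used to write the cluster files.

```python
# A minimal sketch for one track; file names and parameter values are placeholders.
import numpy as np

import essentia
from essentia.standard import (MonoLoader, RhythmExtractor2013, Slicer,
                               FrameGenerator, Windowing, Spectrum, MFCC,
                               PoolAggregator, MonoWriter)
from sklearn.cluster import KMeans

SAMPLE_RATE = 44100
BEATS_PER_SEGMENT = 4        # configurable: 1, 2, 4 or 8 beats per segment
N_CLUSTERS = 4               # configurable number of k-means clusters

audio = MonoLoader(filename='track.mp3', sampleRate=SAMPLE_RATE)()

# Beat positions in seconds.
bpm, beats, _, _, _ = RhythmExtractor2013(method='multifeature')(audio)

# Segment boundaries: every BEATS_PER_SEGMENT-th beat (trailing beats that do
# not fill a whole segment are dropped).
boundaries = list(beats[::BEATS_PER_SEGMENT])
segments = Slicer(startTimes=boundaries[:-1], endTimes=boundaries[1:],
                  timeUnits='seconds', sampleRate=SAMPLE_RATE)(audio)

# Per-segment MFCC summary (mean/stdev), discarding the 0th coefficient.
window, spectrum, mfcc = Windowing(type='hann'), Spectrum(), MFCC(numberCoefficients=13)
segment_features = []
for segment in segments:
    pool = essentia.Pool()
    for frame in FrameGenerator(segment, frameSize=2048, hopSize=1024):
        _, coeffs = mfcc(spectrum(window(frame)))
        pool.add('mfcc', coeffs[1:])                 # drop the loudness-related 0th MFCC
    aggregated = PoolAggregator(defaultStats=['mean', 'stdev'])(pool)
    segment_features.append(np.concatenate([aggregated['mfcc.mean'],
                                             aggregated['mfcc.stdev']]))

# Cluster the segments and write one audio file per cluster.
labels = KMeans(n_clusters=N_CLUSTERS, random_state=0).fit_predict(np.vstack(segment_features))
for cluster in range(N_CLUSTERS):
    indices = np.where(labels == cluster)[0]
    if len(indices) == 0:
        continue
    cluster_audio = essentia.array(np.concatenate([segments[i] for i in indices]))
    MonoWriter(filename='cluster_%d.wav' % cluster, sampleRate=SAMPLE_RATE)(cluster_audio)
```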
Deliverables: a Python notebook with code and report, plus the audio files with segments for each cluster. Include text explaining all your steps, decisions, and a discussion of the results (submissions with code only and no explanation are not enough!)
MIR Course Docker image including Essentia python wrapper: https://github.com/MTG/MIRCourse