@thewh1teagle
Created December 5, 2024 03:54
Emotion detection in audio
'''
Emotion recognition using the emotion2vec representation model.
rec_result only contains {'feats'}:
granularity="utterance": {'feats': [*768]}
granularity="frame":     {'feats': [T*768]}

Run with: python main.py
'''
from funasr import AutoModel
import json
from collections import OrderedDict
# Load the finetuned emotion recognition model
model = AutoModel(model="iic/emotion2vec_base_finetuned")
mapper = ["angry", "disgusted", "fearful", "happy", "neutral", "other", "sad", "surprised", "unknown"]
wav_file = "audio.wav"
rec_result = model.generate(wav_file, granularity="utterance")
scores = rec_result[0]['scores']
# Prepare the result mapping with emotions and their probabilities
result = {emotion: float(prob) for emotion, prob in zip(mapper, scores)}
# Sort the result in descending order of probability
sorted_result = OrderedDict(sorted(result.items(), key=lambda item: item[1], reverse=True))
print(json.dumps(sorted_result, indent=4))
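The label/score pairing and sorting above can be exercised without loading the model. A minimal sketch with made-up scores (the `scores` values below are placeholders for illustration, not real model output):

```python
import json

# emotion2vec's nine output classes, in model order.
mapper = ["angry", "disgusted", "fearful", "happy", "neutral",
          "other", "sad", "surprised", "unknown"]

# Placeholder probabilities; a real run yields one score per label.
scores = [0.01, 0.02, 0.03, 0.60, 0.20, 0.01, 0.05, 0.05, 0.03]

# Pair labels with probabilities and sort in descending order.
result = dict(sorted(zip(mapper, scores), key=lambda kv: kv[1], reverse=True))
print(json.dumps(result, indent=4))

# The first key is the predicted emotion.
top_emotion = next(iter(result))
```

Since Python 3.7, plain dicts preserve insertion order, so the `OrderedDict` in the script above works but is no longer required for this kind of sorted output.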