Last active
October 13, 2023 09:55
Transform SpecAugment on the Mel Spectrogram
from torchaudio import transforms

# ----------------------------
# Augment the spectrogram by masking out some sections of it in both the frequency
# dimension (i.e. horizontal bars) and the time dimension (vertical bars) to prevent
# overfitting and to help the model generalise better. The masked sections are
# replaced with the mean value of the spectrogram.
# Note: this is a static method of an enclosing class that is not shown here.
# ----------------------------
@staticmethod
def spectro_augment(spec, max_mask_pct=0.1, n_freq_masks=1, n_time_masks=1):
    # spec has shape (channels, n_mels, n_steps)
    _, n_mels, n_steps = spec.shape
    mask_value = spec.mean().item()  # plain float, the documented type of mask_value
    aug_spec = spec

    # Apply n_freq_masks horizontal bars, each at most max_mask_pct of the Mel bins
    freq_mask_param = int(max_mask_pct * n_mels)
    for _ in range(n_freq_masks):
        aug_spec = transforms.FrequencyMasking(freq_mask_param)(aug_spec, mask_value)

    # Apply n_time_masks vertical bars, each at most max_mask_pct of the time steps
    time_mask_param = int(max_mask_pct * n_steps)
    for _ in range(n_time_masks):
        aug_spec = transforms.TimeMasking(time_mask_param)(aug_spec, mask_value)

    return aug_spec
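
To make the effect of the masking concrete, here is a minimal NumPy sketch of the same idea. It is not torchaudio's implementation: the helper `mask_band` and the random test spectrogram are hypothetical and used only for illustration. Each call replaces one randomly placed band along the chosen axis with the mean value.

```python
import numpy as np


def mask_band(spec, axis, max_width, mask_value, rng):
    """Return a copy of `spec` with one random band along `axis` set to `mask_value`.

    Hypothetical helper for illustration only; torchaudio samples its
    masks differently.
    """
    size = spec.shape[axis]
    # Band width is drawn from 0..max_width (inclusive), capped at the axis size.
    width = min(int(rng.integers(0, max_width + 1)), size)
    # Choose a start position so that the whole band fits inside the axis.
    start = int(rng.integers(0, size - width + 1))
    out = spec.copy()
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    out[tuple(index)] = mask_value
    return out


rng = np.random.default_rng(0)
spec = rng.normal(size=(1, 64, 100))  # assumed shape: (channels, n_mels, n_steps)
max_mask_pct = 0.1
mask_value = spec.mean()

# One frequency mask (axis 1, horizontal bar), then one time mask (axis 2, vertical bar).
aug = mask_band(spec, 1, int(max_mask_pct * 64), mask_value, rng)
aug = mask_band(aug, 2, int(max_mask_pct * 100), mask_value, rng)

changed = aug != spec
print("fraction of masked cells:", changed.mean())
```

With `max_mask_pct=0.1`, each band covers at most about 10% of its axis, so most of the spectrogram is left untouched and only the masked cells take the mean value.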