
@gangliao
Created April 17, 2019 11:35
- protocol directory contains protocol files in ASCII format for training and development.
protocol/ASVspoof2017_train.trn.txt: contains a file list to be used to train human and spoofed speech detectors.
protocol/ASVspoof2017_dev.trl.txt: contains a development trial list to validate the spoofing detector.
The two files have the same format; the meaning of each column is as follows:
1st column: unique file ID
2nd column: speech type identifier: genuine means the file is original speech; spoof means the file was created with a replay attack.
3rd column: speaker ID
4th column: RedDots common phrase ID
5th column: Environment ID ('-' for genuine speech)
6th column: Playback device ID ('-' for genuine speech)
7th column: Recording device ID ('-' for genuine speech)
The file for each training sample can be located at ASVspoof2017_train/unique_file_ID.wav
The file for each development trial sample can be located at ASVspoof2017_dev/unique_file_ID.wav
The following are the details of the RedDots common phrase IDs (4th column):
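The seven-column layout above can be read with a few lines of Python. This is a minimal sketch, not an official loader: it assumes the columns are separated by whitespace, and the names `Trial`, `parse_protocol_line`, and `wav_path` are hypothetical helpers introduced here for illustration. Because the source does not say whether the first column already carries the `.wav` extension, the sketch accepts both forms.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Trial:
    """One row of a train/dev protocol file (hypothetical container)."""
    file_id: str
    label: str                     # "genuine" or "spoof"
    speaker_id: str
    phrase_id: str                 # S01 .. S10
    environment_id: Optional[str]  # None when the column is '-'
    playback_id: Optional[str]     # None when the column is '-'
    recording_id: Optional[str]    # None when the column is '-'


def _dash_to_none(value: str) -> Optional[str]:
    # The protocol uses '-' for fields that do not apply to genuine speech.
    return None if value == "-" else value


def parse_protocol_line(line: str) -> Trial:
    # Assumption: columns are whitespace-separated.
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    file_id, label, speaker, phrase, env, playback, recorder = fields
    if label not in ("genuine", "spoof"):
        raise ValueError(f"unexpected speech type: {label!r}")
    return Trial(file_id, label, speaker, phrase,
                 _dash_to_none(env), _dash_to_none(playback),
                 _dash_to_none(recorder))


def wav_path(trial: Trial, audio_dir: str) -> Path:
    # Append ".wav" only if the file ID does not already end with it.
    name = trial.file_id
    if not name.endswith(".wav"):
        name += ".wav"
    return Path(audio_dir) / name
```

For example, `wav_path(parse_protocol_line(row), "ASVspoof2017_train")` gives the location of a training file, and the same call with `"ASVspoof2017_dev"` gives the location of a development trial.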
S01: 'My voice is my password'
S02: 'OK Google'
S03: 'Only lawyers love millionaires'
S04: 'Artificial intelligence is for real'
S05: 'Birthday parties have cupcakes and ice cream'
S06: 'Actions speak louder than words'
S07: 'There is no such thing as a free lunch'
S08: 'A watched pot never boils'
S09: 'Jealousy has twenty-twenty vision'
S10: 'Necessity is the mother of invention'
Here are the details of the IDs for recording environments, playback devices, and recording devices.
Recording Environment (5th column):
E1: 'Balcony'
E2: 'Bedroom'
E3: 'Cantine'
E4: 'Home'
E5: 'Office'
E6: 'Open Lab Space'
Playback Device (6th column):
P1: 'ACER "Ferrari ONE" netbook'
P2: 'All-in-one PC speakers'
P3: 'BQ Aquaris M5 smartphone'
P4: 'Beyerdynamic DT 770 PRO headphones connected to desktop'
P5: 'Creative A60 connected to laptop'
P6: 'DELL (SSD) notebook+EdirolUA25+XXX'
P7: 'Dell laptop with internal speakers'
P8: 'Dynaudio BM5A Speaker connected to laptop'
P9: 'HP Laptop speakers'
P10: 'High Quality GENELEC Studio Monitors Speakers'
P11: 'MacBook pro internal speakers'
P12: 'PC with portable speaker(Altec lansing Orbit USB iML227)'
P13: 'Samsung GT-I9100'
P14: 'Samsung GT-P6200'
P15: 'VIFA M10MD-39-08 Speaker connected to laptop'
Recording Device (7th column):
R1: 'AKG C562CM + Marantz PMD670'
R2: 'BQ Aquaris M5 smartphone. Software: Smart voice recorder'
R3: 'Desktop Computer with headset and arecord'
R4: 'H6 Handy Recorder'
R5: 'Logitech C920 connected to Dell (SSD) notebook'
R6: 'Nokia Lumia'
R7: 'Rode NT2 microphone connected to laptop'
R8: 'Rode smartlav+ microphone connected to laptop'
R9: 'Samsung GT-I9100'
R10: 'Samsung GT-P6200'
R11: 'Samsung Galaxy 7s'
R12: 'Samsung Trend 2'
R13: 'Samsung Trend 3'
R14: 'ZoomHD1'
R15: 'iPhone 5c'
R16: 'iphone4'
Participants can use the information in the 3rd to 7th columns to tune their systems. However, final performance is based only on the 2nd column (classification of each file as genuine or spoof).
Participants can then evaluate their own models using the evaluation set.
1. ASVspoof2017_eval directory contains audio files used for evaluation (E_*.wav). The waveforms in this directory are in the standard RIFF/WAVE format, sampled at 16 kHz and stored in 16-bit format.
2. protocol directory contains protocol and key files in ASCII format for evaluation.
protocol/ASVspoof2017_eval_v2.trl.txt: contains a list of audio files to be used for the evaluation purpose.
protocol/ASVspoof2017_eval_v2_key.trl.txt: contains ground-truth labels of the evaluation data.
The meaning of each column of the protocol file is listed as follows:
1st column: unique file ID
2nd column: RedDots common phrase ID
The meaning of each column of the key file is listed as follows:
1st column: unique file ID
2nd column: ground-truth labels (genuine or spoof)
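Since the evaluation protocol file and the key file share the unique file ID as their first column, they can be joined on it to get the phrase ID and the ground-truth label for each file. The following is a rough sketch under stated assumptions: columns are whitespace-separated, only the first two columns of each file are used (any further columns are ignored), and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def read_first_two_columns(lines: Iterable[str]) -> Dict[str, str]:
    """Map the 1st column (file ID) to the 2nd column of each non-empty line."""
    table: Dict[str, str] = {}
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue           # skip blank lines
        if len(fields) < 2:
            raise ValueError(f"expected at least 2 columns: {line!r}")
        table[fields[0]] = fields[1]
    return table


def merge_protocol_and_key(
    protocol_lines: Iterable[str], key_lines: Iterable[str]
) -> List[Tuple[str, str, str]]:
    """Return (file_id, phrase_id, label) tuples in protocol order."""
    phrases = read_first_two_columns(protocol_lines)
    labels = read_first_two_columns(key_lines)
    missing = sorted(set(phrases) - set(labels))
    if missing:
        raise KeyError(f"no ground-truth label for: {missing}")
    return [(fid, phrases[fid], labels[fid]) for fid in phrases]
```

In practice the two arguments would be open file handles for protocol/ASVspoof2017_eval_v2.trl.txt and protocol/ASVspoof2017_eval_v2_key.trl.txt; any iterable of lines works.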
The following are the details of the RedDots common phrase IDs (2nd column of the protocol file):
S01: 'My voice is my password'
S02: 'OK Google'
S03: 'Only lawyers love millionaires'
S04: 'Artificial intelligence is for real'
S05: 'Birthday parties have cupcakes and ice cream'
S06: 'Actions speak louder than words'
S07: 'There is no such thing as a free lunch'
S08: 'A watched pot never boils'
S09: 'Jealousy has twenty-twenty vision'
S10: 'Necessity is the mother of invention'
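Before feature extraction, it can be worth confirming that an audio file matches the stated format (RIFF/WAVE, 16 kHz, 16-bit). The sketch below uses only Python's standard `wave` module; the helper names and the two constants are illustrative, not part of the dataset tooling.

```python
import wave

EXPECTED_RATE_HZ = 16000         # 16 kHz, as stated for the evaluation audio
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples are 2 bytes wide


def describe_wav(source):
    """Read header fields from a WAV file.

    `source` is a file name (str) or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return {
            "rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "frames": wav.getnframes(),
        }


def matches_expected_format(info) -> bool:
    """True if the header reports 16 kHz and 16-bit samples."""
    return (info["rate_hz"] == EXPECTED_RATE_HZ
            and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES)
```

Calling `matches_expected_format(describe_wav(path))` on each E_*.wav file flags any file whose header disagrees with the description. The duration in seconds is `frames / rate_hz`.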