shangeth / download_common_voice_16.py
Created March 6, 2024 02:27
This Python script automates downloading and extracting .tar files from the Common Voice dataset on Hugging Face, using a Hugging Face token for authorization. It creates directories based on set types (e.g., "test"), downloads specified .tar files, extracts their contents, and cleans up by removing the .tar files post-extraction. Ideal for res…
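The extract-then-clean-up step mentioned in the description can be sketched on its own, without any network access. This is a minimal illustration using only the standard library; `extract_and_remove` is a hypothetical helper name chosen here, not a function from the script, and it assumes the archive has already been downloaded to `tar_path`.

```python
import os
import tarfile


def extract_and_remove(tar_path, dest_dir):
    """Extract a .tar archive into dest_dir, then delete the archive.

    Hypothetical helper for illustration; assumes tar_path already exists
    on disk and comes from a trusted source.
    """
    # Create the target directory (e.g. one per set type) if it is missing.
    os.makedirs(dest_dir, exist_ok=True)

    # Unpack every member of the archive into the target directory.
    with tarfile.open(tar_path) as tar:
        tar.extractall(path=dest_dir)

    # Remove the archive once its contents are on disk to free space.
    os.remove(tar_path)
```

Calling `extract_and_remove("sample.tar", "test")` would leave the archive's contents under `test/` and delete `sample.tar`; both names are placeholders.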
import requests
import os
import tarfile
# Hugging Face access token, sent as a Bearer token with each request (replace the placeholder with your own)
hf_token = "<HF_TOKEN_HERE>"
headers = {"Authorization": f"Bearer {hf_token}"}
# Directory to save and extract files