Created
March 17, 2024 02:24
hf datasets create a Dataset from a list of dicts
from datasets import Dataset

# Your initial list of dictionaries
data = [
    {"id": 1, "text": "Hello world!", "label": 0},
    {"id": 2, "text": "How are you?", "label": 1},
    # Add more dictionaries as needed
]

# Convert the list of dictionaries to a dictionary of lists.
# Keys are taken from the first item, so every dict must share those keys.
data_dict = {key: [dic[key] for dic in data] for key in data[0]}

# Convert the dictionary of lists into a Hugging Face dataset
dataset = Dataset.from_dict(data_dict)

# Show the dataset to verify its structure
print(dataset)
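
The comprehension in the snippet reads its keys from `data[0]` only. It raises `KeyError` if a later dict lacks one of those keys, and it silently drops keys that appear only in later dicts. Below is a minimal sketch of a more tolerant conversion, assuming missing values should become `None`. The names `to_columns` and `fill` are hypothetical helpers for illustration and are not part of the `datasets` library.

```python
def to_columns(rows, fill=None):
    """Convert a list of dicts to a dict of lists, tolerating missing keys."""
    # Collect keys in first-seen order across all rows, not just rows[0]
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    # Build one column per key; rows lacking a key get `fill`
    return {key: [row.get(key, fill) for row in rows] for key in keys}


rows = [
    {"id": 1, "text": "Hello world!", "label": 0},
    {"id": 2, "text": "How are you?"},  # no "label" key
]
columns = to_columns(rows)
print(columns)
```

The resulting `columns` dict can be passed to `Dataset.from_dict` in the same way as `data_dict`. Recent versions of `datasets` also provide `Dataset.from_list`, which accepts the list of dicts directly. Whether it is available depends on the installed version.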