@manisnesan
Last active October 30, 2021 18:14
Awesome Benchmarks

Benchmarks

Natural Language Understanding Systems

  • GLUE: The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems. It comprises nine sentence- or sentence-pair language understanding tasks.

Multitask Challenge for NLP

  • decaNLP: The Natural Language Decathlon, a benchmark for studying general NLP models that can perform a variety of complex natural language tasks. By requiring a single system to perform ten disparate natural language tasks, decaNLP offers a unique setting for multitask, transfer, and continual learning.

Approximate Nearest Neighbor (ANN)

  • Nearest Neighbors Benchmarks - A collection of datasets pre-split into train/test sets, with ground-truth data for the top 100 neighbors. Example datasets: Fashion-MNIST, MNIST, Last.fm. Evaluated algorithms: Vespa, OpenSearch KNN, pyNNDescent, FAISS, hnswlib (nmslib), Annoy, Milvus.
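
Ground-truth neighbor lists like these are typically used to score an approximate index by recall: the fraction of the true nearest neighbors that the approximate search returns. The sketch below illustrates the idea in pure Python on small random data. It is not the benchmark's own code; the helper names (`exact_top_k`, `approx_top_k`, `recall_at_k`), the random data, and the "search only half the points" stand-in for an approximate index are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(train, query, k):
    """Indices of the k training points closest to query, found by brute force."""
    order = sorted(range(len(train)), key=lambda i: euclidean(train[i], query))
    return order[:k]


def recall_at_k(approx_ids, true_ids, k):
    """Fraction of the true top-k neighbors that the approximate result contains."""
    return len(set(approx_ids[:k]) & set(true_ids[:k])) / k


random.seed(0)
# Hypothetical data: 200 train vectors and 5 test queries in 8 dimensions.
train = [[random.random() for _ in range(8)] for _ in range(200)]
test = [[random.random() for _ in range(8)] for _ in range(5)]
k = 10

# Ground truth: exact top-k neighbors for every test query.
ground_truth = [exact_top_k(train, q, k) for q in test]

# Stand-in for an approximate index (assumption): search a random half of the data.
subset = random.sample(range(len(train)), 100)


def approx_top_k(query, k):
    """Top-k neighbors among the sampled subset only, so some true neighbors are missed."""
    order = sorted(subset, key=lambda i: euclidean(train[i], query))
    return order[:k]


recalls = [recall_at_k(approx_top_k(q, k), gt, k) for q, gt in zip(test, ground_truth)]
mean_recall = sum(recalls) / len(recalls)
print(f"mean recall@{k}: {mean_recall:.2f}")
```

In a real comparison, `approx_top_k` would be replaced by a query against one of the libraries listed above, and recall is usually reported alongside query throughput, since the two trade off against each other.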

Related

  • HF Datasets for NLP - Ready-to-use datasets for ML models, with fast, easy-to-use, and efficient data manipulation tools.
  • fastai datasets external - Contains datasets for medical imaging, audio, image localization, NLP, image classification, etc.

References
