Created
June 11, 2017 15:07
datasourceName | about | link | tool | categoryName | vintage | |
---|---|---|---|---|---|---|
1000 Genomes | 1000 Genomes | http://www.1000genomes.org/data | Biology | NA | ||
CCLE | Broad Cancer Cell Line Encyclopedia (CCLE) | http://www.broadinstitute.org/ccle/home | Biology | NA | ||
BBBC | Broad Bioimage Benchmark Collection (BBBC) | https://www.broadinstitute.org/bbbc | Biology | NA | ||
Cell Image | Cell Image Library | http://www.cellimagelibrary.org | Biology | NA | ||
Complete Genomics | Complete Genomics Public Data | http://www.completegenomics.com/public-data/69-genomes/ | Biology | NA | ||
EBI ArrayExpress | EBI ArrayExpress | http://www.ebi.ac.uk/arrayexpress/ | Biology | NA | ||
EBI Protein | EBI Protein Data Bank in Europe | http://www.ebi.ac.uk/pdbe/emdb/index.html/ | Biology | NA | ||
EMPIAR | Electron Microscopy Pilot Image Archive (EMPIAR) | http://www.ebi.ac.uk/pdbe/emdb/empiar/ | Biology | NA | ||
ENCODE project | ENCODE project | https://www.encodeproject.org | Biology | NA | ||
Ensembl Genomes | Ensembl Genomes | http://ensemblgenomes.org/info/genomes | Biology | NA | ||
GEO | Gene Expression Omnibus (GEO) | http://www.ncbi.nlm.nih.gov/geo/ | Biology | NA | ||
GO | Gene Ontology (GO) | http://geneontology.org/page/download-annotations | Biology | NA | ||
HMS | Harvard Medical School (HMS) LINCS Project | http://lincs.hms.harvard.edu | Biology | NA | ||
Human Genome | Human Genome Diversity Project | http://www.hagsc.org/hgdp/files.html | Biology | NA | ||
HMP | Human Microbiome Project (HMP) | http://www.hmpdacc.org/reference_genomes/reference_genomes.php | Biology | NA | ||
ICOS PSP | ICOS PSP Benchmark | http://ico2s.org/datasets/psp_benchmark.html | Biology | NA | ||
International HapMap | International HapMap Project | http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en | Biology | NA | ||
Journal of | Journal of Cell Biology DataViewer | http://jcb-dataviewer.rupress.org | Biology | NA | ||
MIT Cancer | MIT Cancer Genomics Data | http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi | Biology | NA | ||
NCBI Proteins | NCBI Proteins | http://www.ncbi.nlm.nih.gov/guide/proteins/#databases | Biology | NA | ||
NCBI Taxonomy | NCBI Taxonomy | http://www.ncbi.nlm.nih.gov/taxonomy | Biology | NA | ||
OpenSNP genotypes | OpenSNP genotypes data | https://opensnp.org/ | Biology | NA | ||
Pathguide | Pathguide - Protein-Protein Interactions Catalog | http://www.pathguide.org/ | Biology | NA | ||
Protein Data | Protein Data Bank | http://www.rcsb.org/ | Biology | NA | ||
Psychiatric Genomics | Psychiatric Genomics Consortium | https://www.med.unc.edu/pgc/downloads | Biology | NA | ||
PubChem Project | PubChem Project | https://pubchem.ncbi.nlm.nih.gov/ | Biology | NA | ||
now Coremine Medical | PubGene (now Coremine Medical) | http://www.pubgene.org/ | Biology | NA | ||
COSMIC | Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) | http://cancer.sanger.ac.uk/cosmic | Biology | NA | ||
GDSC | Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) | http://www.cancerrxgene.org/ | Biology | NA | ||
SRA | Sequence Read Archive (SRA) | http://www.ncbi.nlm.nih.gov/Traces/sra/ | Biology | NA | ||
Stanford Microarray | Stanford Microarray Data | http://smd.stanford.edu/ | Biology | NA | ||
Stowers Institute | Stowers Institute Original Data Repository | http://www.stowers.org/research/publications/odr | Biology | NA | ||
SSBD | Systems Science of Biological Dynamics (SSBD) Database | http://ssbd.qbic.riken.jp | Biology | NA | ||
TCGA | The Cancer Genome Atlas (TCGA), available via Broad GDAC | https://gdac.broadinstitute.org/ | Biology | NA | ||
The Catalogue | The Catalogue of Life | http://www.catalogueoflife.org/content/annual-checklist-archive | Biology | NA | ||
Personal Genome | Personal Genome Project: Harvard | http://www.personalgenomes.org/ | Biology | NA | ||
UCSC Public | UCSC Public Data | http://hgdownload.soe.ucsc.edu/downloads.html | Biology | NA | ||
UniProt | Universal Protein Resource (UniProt) | http://www.uniprot.org/downloads | Biology | NA | ||
UniGene | UniGene | http://www.ncbi.nlm.nih.gov/unigene | Biology | NA | ||
Australian Weather | Australian Weather | http://www.bom.gov.au/climate/dwo/ | Climate/Weather | NA | ||
Aviation Weather Center | Aviation Weather Center - Consistent, timely and accurate weather information for the world airspace system | https://aviationweather.gov/adds/dataserver | Climate/Weather | NA | ||
In Portuguese | Brazilian Weather - Historical data (In Portuguese) | http://sinda.crn2.inpe.br/PCD/SITE/novo/site/ | Climate/Weather | NA | ||
Canadian Meteorological | Canadian Meteorological Centre | http://weather.gc.ca/grib/index_e.html | Climate/Weather | NA | ||
updated monthly | Climate Data from UEA (updated monthly) | https://crudata.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/ | Climate/Weather | NA | ||
European Climate | European Climate Assessment & Dataset | http://eca.knmi.nl/ | Climate/Weather | NA | ||
NASA Global | NASA Global Imagery Browse Services | https://wiki.earthdata.nasa.gov/display/GIBS | Climate/Weather | NA | ||
NOAA Bering | NOAA Bering Sea Climate | http://www.beringclimate.noaa.gov/ | Climate/Weather | NA | ||
NOAA Climate | NOAA Climate Datasets | http://www.ncdc.noaa.gov/data-access/quick-links | Climate/Weather | NA | ||
NOAA Realtime | NOAA Realtime Weather Models | http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction | Climate/Weather | NA | ||
The World | The World Bank Open Data Resources for Climate Change | http://data.worldbank.org/developers/climate-data-api | Climate/Weather | NA | ||
UEA Climatic | UEA Climatic Research Unit | http://www.cru.uea.ac.uk/data | Climate/Weather | NA | ||
WorldClim | WorldClim - Global Climate Data | http://www.worldclim.org | Climate/Weather | NA | ||
WU Historical | WU Historical Weather Worldwide | https://www.wunderground.com/history/index.html | Climate/Weather | NA | ||
AMiner Citation | AMiner Citation Network Dataset | http://aminer.org/citation | Complex Networks | NA | ||
CrossRef DOI | CrossRef DOI URLs | https://archive.org/details/doi-urls | Complex Networks | NA | ||
DBLP Citation | DBLP Citation dataset | https://kdl.cs.umass.edu/display/public/DBLP | Complex Networks | NA | ||
NBER Patent | NBER Patent Citations | http://nber.org/patents/ | Complex Networks | NA | ||
Network Repository Graph | Network Repository with Interactive Exploratory Analysis Tools | http://networkrepository.com/graph-vis.php | Data Explorer | Complex Networks | NA | |
NIST complex | NIST complex networks data collection | http://math.nist.gov/~RPozo/complex_datasets.html | Complex Networks | NA | ||
Protein-protein interaction | Protein-protein interaction network | http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm | Complex Networks | NA | ||
PyPI and | PyPI and Maven Dependency Network | https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/ | Complex Networks | NA | ||
Scopus Citation | Scopus Citation Database | https://www.elsevier.com/solutions/scopus | Complex Networks | NA | ||
Small Network | Small Network Data | http://www-personal.umich.edu/~mejn/netdata/ | Complex Networks | NA | ||
Steven Skiena | Stanford GraphBase (Steven Skiena) | http://www3.cs.stonybrook.edu/~algorith/implement/graphbase/implement.shtml | Complex Networks | NA | ||
Stanford Large | Stanford Large Network Dataset Collection | http://snap.stanford.edu/data/ | Complex Networks | NA | ||
Stanford Longitudinal | Stanford Longitudinal Network Data Sources | http://stanford.edu/group/sonia/dataSources/index.html | Complex Networks | NA | ||
The Koblenz | The Koblenz Network Collection | http://konect.uni-koblenz.de/ | Complex Networks | NA | ||
UNIMI | The Laboratory for Web Algorithmics (UNIMI) | http://law.di.unimi.it/datasets.php | Complex Networks | NA | ||
The Nexus | The Nexus Network Repository | http://nexus.igraph.org/ | Complex Networks | NA | ||
UCI Network | UCI Network Data Repository | https://networkdata.ics.uci.edu/resources.php | Complex Networks | NA | ||
UFL sparse | UFL sparse matrix collection | http://www.cise.ufl.edu/research/sparse/matrices/ | Complex Networks | NA | ||
WSU Graph | WSU Graph Database | http://www.eecs.wsu.edu/mgd/gdb.html | Complex Networks | NA | ||
DIMACS Road | DIMACS Road Networks Collection | http://www.dis.uniroma1.it/challenge9/download.shtml | Complex Networks | NA | ||
CAIDA Internet | CAIDA Internet Datasets | http://www.caida.org/data/overview/ | Computer Networks | NA | ||
CommonCrawl Web | CommonCrawl Web Data over 7 years | http://commoncrawl.org/the-data/get-started/ | Computer Networks | NA | ||
CRAWDAD Wireless | CRAWDAD Wireless datasets from Dartmouth Univ. | https://crawdad.cs.dartmouth.edu/ | Computer Networks | NA | ||
Open Mobile | Open Mobile Data by MobiPerf | https://console.developers.google.com/storage/openmobiledata_public/ | Computer Networks | NA | ||
Rapid7 Sonar | Rapid7 Sonar Internet Scans | https://sonar.labs.rapid7.com/ | Computer Networks | NA | ||
UCSD Network | UCSD Network Telescope, IPv4 /8 net | http://www.caida.org/projects/network_telescope/ | Computer Networks | NA | ||
Challenges in | Challenges in Machine Learning | http://www.chalearn.org/ | Data Challenges | NA | ||
DrivenData Competitions | DrivenData Competitions for Social Good | http://www.drivendata.org/ | Data Challenges | NA | ||
Kaggle Datasets | Kaggle Competition Datasets | https://www.kaggle.com/datasets | Data Catalog | Data Challenges | NA | |
Netflix Prize | Netflix Prize | http://netflixprize.com/leaderboard.html | Data Challenges | NA | ||
Space Apps | Space Apps Challenge | https://2015.spaceappschallenge.org | Data Challenges | NA | ||
Telecom Italia | Telecom Italia Big Data Challenge | https://dandelion.eu/datamine/open-big-data/ | Data Challenges | NA | ||
AQUASTAT | AQUASTAT - Global water resources and uses | http://www.fao.org/nr/water/aquastat/data/query/index.html?lang=en | Earth Science | NA | ||
BODC | BODC - marine data of ~22K vars | http://www.bodc.ac.uk/data/where_to_find_data/ | Earth Science | NA | ||
Earth Models | Earth Models | http://www.earthmodels.org/ | Earth Science | NA | ||
EOSDIS | EOSDIS - NASA's earth observing system data | http://sedac.ciesin.columbia.edu/data/sets/browse | Earth Science | NA | ||
AODN | The gateway to Australian marine and climate science data | https://portal.aodn.org.au/search | Earth Science | NA | ||
Marinexplore | Marinexplore - Open Oceanographic Data | http://marinexplore.org/ | Earth Science | NA | ||
Smithsonian Institution | Smithsonian Institution Global Volcano and Eruption Database | http://volcano.si.edu/ | Earth Science | NA | ||
USGS Earthquake | USGS Earthquake Archives | http://earthquake.usgs.gov/earthquakes/search/ | Earth Science | NA | ||
AEA | American Economic Association (AEA) | https://www.aeaweb.org/resources/data | Economics | NA | ||
EconData from | EconData from UMD | http://inforumweb.umd.edu/econdata/econdata.html | Economics | NA | ||
Economic Freedom | Economic Freedom of the World Data | http://www.freetheworld.com/datasets_efw.html | Economics | NA | ||
Historical MacroEconomic | Historical MacroEconomic Statistics | http://www.historicalstatistics.org/ | Economics | NA | ||
IEDB | International Economics Database | http://widukind.cepremap.org/views/explorer | Data Catalog | Economics | NA | |
International Trade | International Trade Statistics | http://www.econostatistics.co.za/ | Economics | NA | ||
Joint External | Joint External Debt Data Hub | http://www.jedh.org/ | Economics | NA | ||
Jon Haveman | Jon Haveman International Trade Data Links | http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html | Economics | NA | ||
OpenCorporates Database | OpenCorporates Database of Companies in the World | https://opencorporates.com/ | Database Search | Economics | NA | |
Our World | Our World in Data | http://ourworldindata.org/ | Economics | NA | ||
SciencesPo World | SciencesPo World Trade Gravity Datasets | http://econ.sciences-po.fr/thierry-mayer/data | Economics | NA | ||
The Atlas | The Atlas of Economic Complexity | http://atlas.cid.harvard.edu | Economics | NA | ||
The Center | The Center for International Data | http://cid.econ.ucdavis.edu | Economics | NA | ||
The Observatory | The Observatory of Economic Complexity | http://atlas.media.mit.edu/en/ | Economics | NA | ||
UN Commodity | UN Commodity Trade Statistics | http://comtrade.un.org/db/ | Economics | NA | ||
UN Human | UN Human Development Reports | http://hdr.undp.org/en | Economics | NA | ||
Student Data | Student Data from Free Code Camp | http://academictorrents.com/details/030b10dad0846b5aecc3905692890fb02404adbf | Education | NA | ||
AMPds | AMPds | http://ampds.org/ | Energy | NA | ||
BLUEd | BLUEd | http://nilm.cmubi.org/ | Energy | NA | ||
COMBED | COMBED | http://combed.github.io/ | Energy | NA | ||
Dataport | Dataport | https://dataport.pecanstreet.org/ | Energy | NA | ||
DRED | DRED | http://www.st.ewi.tudelft.nl/~akshay/dred/ | Energy | NA | ||
ECO | ECO | http://www.vs.inf.ethz.ch/res/show.html?what=eco-data | Energy | NA | ||
EIA | EIA | http://www.eia.gov/electricity/data/eia923/ | Energy | NA | ||
HFED | HFED | http://hfed.github.io/ | Energy | NA | ||
iAWE | iAWE | http://iawe.github.io/ | Energy | NA | ||
REDD | REDD | http://redd.csail.mit.edu/ | Energy | NA | ||
Tracebase | Tracebase | https://www.tracebase.org | Energy | NA | ||
WHITED | WHITED | http://nilmworkshop.org/2016/proceedings/Poster_ID18.pdf | Energy | NA | ||
CBOE Futures | CBOE Futures Exchange | http://cfe.cboe.com/Data/ | Finance | NA | ||
Google Finance | Google Finance | https://www.google.com/finance | Finance | NA | ||
Google Trends | Google Trends | http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0 | Finance | NA | ||
NASDAQ Data-on-demand | NASDAQ Data-on-demand | http://www.nasdaqdod.com/ | Data API | Finance | NA | |
OANDA Developer | OANDA Developer | http://developer.oanda.com/ | Data API | Finance | NA | |
OSU Financial | OSU Financial data | http://fisher.osu.edu/fin/fdf/osudata.htm | Finance | NA | ||
Quandl | Quandl | https://www.quandl.com/ | Data Catalog | Finance | NA | |
St Louis | St Louis Federal | https://research.stlouisfed.org/fred2/ | Finance | NA | ||
Cambridge GIS | Cambridge, MA, US, GIS data on GitHub | http://cambridgegis.github.io/gisdata.html | GIS | NA | ||
Factual Global | Factual Global Location Data | https://www.factual.com/ | Data API | GIS | NA | |
Geo Spatial | Geo Spatial Data from ASU | http://geodacenter.asu.edu/datalist/ | GIS | NA | ||
Geo Wiki Project | Geo Wiki Project - Citizen-driven Environmental Monitoring | http://geo-wiki.org/ | GIS | NA | ||
GeoFabrik | GeoFabrik - OSM data extracted to a variety of formats and areas | http://download.geofabrik.de/ | GIS | NA | ||
GeoNames Worldwide | GeoNames Worldwide | http://www.geonames.org/ | GIS | NA | ||
GADM | Global Administrative Areas Database (GADM) | http://www.gadm.org/ | GIS | NA | ||
Homeland Infrastructure | Homeland Infrastructure Foundation-Level Data | https://hifld-dhs-gii.opendata.arcgis.com/ | Data Catalog | GIS | NA | |
National Weather | National Weather Service GIS Data Portal | http://www.nws.noaa.gov/gis/ | GIS | NA | ||
Natural Earth | Natural Earth - vectors and rasters of the world | http://www.naturalearthdata.com/ | GIS | NA | ||
OpenAddresses | OpenAddresses | http://openaddresses.io/ | GIS | NA | ||
OSM | OpenStreetMap (OSM) | http://wiki.openstreetmap.org/wiki/Downloading_data | GIS | NA | ||
Pleiades | Pleiades - Gazetteer and graph of ancient places | http://pleiades.stoa.org/ | GIS | NA | ||
TIGER/Line | TIGER/Line - U.S. boundaries and roads | http://www.census.gov/geo/maps-data/data/tiger-line.html | GIS | NA | ||
TZ Timezones | TZ Timezones shapefiles | http://efele.net/maps/tz/world/ | GIS | NA | ||
UN Environmental | UN Environmental Data | http://geodata.grid.unep.ch/ | GIS | NA | ||
World boundaries | World boundaries from the U.S. Department of State | https://hiu.state.gov/data/data.aspx | GIS | NA | ||
OpenDataSoft's list | OpenDataSoft's list of 1,600 open data portals | https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/ | Government | NA | ||
EHDP Large | EHDP Large Health Data Sets | http://www.ehdp.com/vitalnet/datasets.htm | Healthcare | NA | ||
Gapminder World | Gapminder World demographic databases | http://www.gapminder.org/data/ | Healthcare | NA | ||
MCD | Medicare Coverage Database (MCD), U.S. | https://www.cms.gov/medicare-coverage-database/ | Healthcare | NA | ||
Medicare Data | Medicare Data Engine of medicare.gov Data | https://data.medicare.gov/ | Healthcare | NA | ||
MeSH, the | MeSH, the vocabulary thesaurus used for indexing articles for PubMed | https://www.nlm.nih.gov/mesh/filelist.html | Healthcare | NA | ||
structure of the UK NHS | Open-ODS (structure of the UK NHS) | http://www.openods.co.uk | Healthcare | NA | ||
OpenPaymentsData, Healthcare | OpenPaymentsData, Healthcare financial relationship data | https://openpaymentsdata.cms.gov | Healthcare | NA | ||
GDC | NCI's Genomic Data Commons (GDC) | https://gdc.cancer.gov/ | Healthcare | NA | ||
World Health | World Health Organization Global Health Observatory | http://www.who.int/gho/en/ | Healthcare | NA | ||
Affective Image | Affective Image Classification | http://www.imageemotion.org/ | Image Processing | NA | ||
Animals with | Animals with attributes | http://attributes.kyb.tuebingen.mpg.de/ | Image Processing | NA | ||
Face Recognition | Face Recognition Benchmark | http://www.face-rec.org/databases/ | Image Processing | NA | ||
in WordNet hierarchy | ImageNet (in WordNet hierarchy) | http://www.image-net.org/ | Image Processing | NA | ||
Indoor Scene | Indoor Scene Recognition | http://web.mit.edu/torralba/www/indoor.html | Image Processing | NA | ||
International Affective | International Affective Picture System, UFL | http://csea.phhp.ufl.edu/media/iapsmessage.html | Image Processing | NA | ||
Massive Visual | Massive Visual Memory Stimuli, MIT | http://cvcl.mit.edu/MM/stimuli.html | Image Processing | NA | ||
Several Shape-from-Silhouette | Several Shape-from-Silhouette Datasets | http://kaiwolf.no-ip.org/3d-model-repository.html | Image Processing | NA | ||
Stanford Dogs | Stanford Dogs Dataset | http://vision.stanford.edu/aditya86/ImageNetDogs/ | Image Processing | NA | ||
SUN database, | SUN database, MIT | http://groups.csail.mit.edu/vision/SUN/hierarchy.html | Image Processing | NA | ||
The Oxford-IIIT | The Oxford-IIIT Pet Dataset | http://www.robots.ox.ac.uk/~vgg/data/pets/ | Image Processing | NA | ||
YouTube Faces | YouTube Faces Database | http://www.cs.tau.ac.il/~wolf/ytfaces/ | Image Processing | NA | ||
Adience Unfiltered | Adience Unfiltered faces for gender and age classification | http://www.openu.ac.il/home/hassner/Adience/data.html | Image Processing | NA | ||
ASLAN | The Action Similarity Labeling (ASLAN) Challenge | http://www.openu.ac.il/home/hassner/data/ASLAN/ASLAN.html | Image Processing | NA | ||
Violent-Flows | Violent-Flows - Crowd Violence Non-violence Database and benchmark | http://www.openu.ac.il/home/hassner/data/violentflows/ | Image Processing | NA | ||
Univ. of Toronto | Delve Datasets for classification and regression (Univ. of Toronto) | http://www.cs.toronto.edu/~delve/data/datasets.html | Machine Learning | NA | ||
Discogs Monthly | Discogs Monthly Data | http://data.discogs.com/ | Machine Learning | NA | ||
IMDb Database | IMDb Database | http://www.imdb.com/interfaces | Machine Learning | NA | ||
Keel Repository | Keel Repository for classification, regression and time series | http://sci2s.ugr.es/keel/datasets.php | Machine Learning | NA | ||
LFW | Labeled Faces in the Wild (LFW) | http://vis-www.cs.umass.edu/lfw/ | Machine Learning | NA | ||
Lending Club | Lending Club Loan Data | https://www.lendingclub.com/info/download-data.action | Machine Learning | NA | ||
Machine Learning | Machine Learning Data Set Repository | http://mldata.org/ | Machine Learning | NA | ||
Million Song | Million Song Dataset | http://labrosa.ee.columbia.edu/millionsong/ | Machine Learning | NA | ||
More Song | More Song Datasets | http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets | Machine Learning | NA | ||
MovieLens Data | MovieLens Data Sets | http://grouplens.org/datasets/movielens/ | Machine Learning | NA | ||
RDataMining | RDataMining - R and Data Mining ebook data | http://www.rdatamining.com/data | Machine Learning | NA | ||
Registered Meteorites | Registered Meteorites on Earth | http://healthintelligence.drupalgardens.com/content/registered-meteorites-has-impacted-earth-visualized | Machine Learning | NA | ||
Restaurants Health | Restaurants Health Score Data in San Francisco | http://missionlocal.org/san-francisco-restaurant-health-inspections/ | Machine Learning | NA | ||
UCI Machine | UCI Machine Learning Repository | http://archive.ics.uci.edu/ml/ | Machine Learning | NA | ||
Yahoo! Ratings | Yahoo! Ratings and Classification Data | http://webscope.sandbox.yahoo.com/catalog.php?datatype=r | Machine Learning | NA | ||
Canada Science | Canada Science and Technology Museums Corporation's Open Data | http://techno-science.ca/en/data.php | Museums | NA | ||
London | Natural History Museum (London) Data Portal | http://data.nhm.ac.uk/ | Museums | NA | ||
Rijksmuseum Historical | Rijksmuseum Historical Art Collection | https://www.rijksmuseum.nl/en/api | Museums | NA | ||
The Getty | The Getty vocabularies | http://vocab.getty.edu | Museums | NA | ||
Blogger Corpus | Blogger Corpus | http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm | Natural Language | NA | ||
CLiPS Stylometry | CLiPS Stylometry Investigation Corpus | http://www.clips.uantwerpen.be/datasets/csi-corpus | Natural Language | NA | ||
DBpedia | DBpedia - 4.58M things with 583M facts | http://wiki.dbpedia.org/Datasets | Natural Language | NA | ||
Flickr Personal | Flickr Personal Taxonomies | http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html | Natural Language | NA | ||
Freebase.com of | Freebase.com of people, places, and things | http://www.freebase.com/ | Natural Language | NA | ||
Gutenberg eBooks | Gutenberg eBooks List | http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs | Natural Language | NA | ||
Hansards text | Hansards text chunks of Canadian Parliament | http://www.isi.edu/natural-language/download/hansard/ | Natural Language | NA | ||
MCTest | Machine Comprehension Test (MCTest) of text from Microsoft Research | http://research.microsoft.com/en-us/um/redmond/projects/mctest/index.html | Natural Language | NA | ||
Machine Translation | Machine Translation of European languages | http://statmt.org/wmt11/translation-task.html#download | Natural Language | NA | ||
Personae Corpus | Personae Corpus | http://www.clips.uantwerpen.be/datasets/personae-corpus | Natural Language | NA | ||
SMS Spam | SMS Spam Collection in English | http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/ | Natural Language | NA | ||
Wikidata | Wikidata - Wikipedia databases | https://www.wikidata.org/wiki/Wikidata:Database_download | Natural Language | NA | ||
Wikipedia Links data | Wikipedia Links data - 40 Million Entities in Context | https://code.google.com/p/wiki-links/downloads/list | Natural Language | NA | ||
Universal Dependencies | Universal Dependencies | http://universaldependencies.org | Natural Language | NA | ||
WordNet databases | WordNet databases and tools | http://wordnet.princeton.edu/wordnet/download/ | Natural Language | NA | ||
Open Multilingual | Open Multilingual Wordnet | http://compling.hss.ntu.edu.sg/omw/ | Natural Language | NA | ||
Allen Institute | Allen Institute Datasets | http://www.brain-map.org/ | Neuroscience | NA | ||
Brain Catalogue | Brain Catalogue | http://braincatalogue.org/ | Neuroscience | NA | ||
Brainomics | Brainomics | http://brainomics.cea.fr/localizer | Neuroscience | NA | ||
CodeNeuro Datasets | CodeNeuro Datasets | http://datasets.codeneuro.org/ | Neuroscience | NA | ||
CRCNS | Collaborative Research in Computational Neuroscience (CRCNS) | http://crcns.org/data-sets | Neuroscience | NA | ||
FCP-INDI | FCP-INDI | http://fcon_1000.projects.nitrc.org/index.html | Neuroscience | NA | ||
Human Connectome | Human Connectome Project | http://www.humanconnectome.org/data/ | Neuroscience | NA | ||
NDAR | NDAR | https://ndar.nih.gov/ | Neuroscience | NA | ||
NIMH Data | NIMH Data Archive | http://data-archive.nimh.nih.gov/ | Neuroscience | NA | ||
NeuroData | NeuroData | http://neurodata.io | Neuroscience | NA | ||
OASIS | OASIS | http://www.oasis-brains.org/ | Neuroscience | NA | ||
OpenfMRI | OpenfMRI | https://openfmri.org/ | Neuroscience | NA | ||
Neuroelectro | Neuroelectro | http://neuroelectro.org/ | Neuroscience | NA | ||
Study Forrest | Study Forrest | http://studyforrest.org | Neuroscience | NA | ||
CERN Open | CERN Open Data Portal | http://opendata.cern.ch/ | Physics | NA | ||
Crystallography Open | Crystallography Open Database | http://www.crystallography.net/ | Physics | NA | ||
NASA Exoplanet | NASA Exoplanet Archive | http://exoplanetarchive.ipac.caltech.edu/ | Physics | NA | ||
NASA | NSSDC (NASA) data of 550 spacecraft | http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html | Physics | NA | ||
SDSS | Sloan Digital Sky Survey (SDSS) - Mapping the Universe | http://www.sdss.org/ | Physics | NA | ||
OSU Cognitive | OSU Cognitive Modeling Repository Datasets | http://www.cmr.osu.edu/browse/datasets | Psychology/Cognition | NA | ||
Amazon | Amazon | http://aws.amazon.com/datasets/ | Data Catalog | Public Domains | NA | |
Archive-it from | Archive-it from Internet Archive | https://www.archive-it.org/explore?show=Collections | Public Domains | NA | ||
Archive.org Datasets | Archive.org Datasets | https://archive.org/details/datasets | Public Domains | NA | ||
CMU JASA | CMU JASA data archive | http://lib.stat.cmu.edu/jasadata/ | Public Domains | NA | ||
CMU StatLab | CMU StatLab collections | http://lib.stat.cmu.edu/datasets/ | Public Domains | NA | ||
Datamob.org | Datamob.org | http://datamob.org/datasets | Public Domains | NA | ||
Google Public | Google Public Data | http://www.google.com/publicdata/directory | Public Domains | NA | ||
KDNuggets Data | KDNuggets Data Collections | http://www.kdnuggets.com/datasets/index.html | Public Domains | NA | ||
Microsoft Azure | Microsoft Azure Data Market Free DataSets | http://datamarket.azure.com/browse/data?price=free | Public Domains | NA | ||
Open Library | Open Library Data Dumps | https://openlibrary.org/developers/dumps | Public Domains | NA | ||
Reddit Datasets | Reddit Datasets | https://www.reddit.com/r/datasets | Public Domains | NA | ||
RevolutionAnalytics Collection | RevolutionAnalytics Collection | http://packages.revolutionanalytics.com/datasets/ | Public Domains | NA | ||
Sample R | Sample R data sets | http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html | Public Domains | NA | ||
StatSci.org | StatSci.org | http://www.statsci.org/datasets.html | Public Domains | NA | ||
The Washington | The Washington Post List | http://www.washingtonpost.com/wp-srv/metro/data/datapost.html | Public Domains | NA | ||
UCLA SOCR | UCLA SOCR data collection | http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data | Public Domains | NA | ||
UFO Reports | UFO Reports | http://www.nuforc.org/webreports.html | Public Domains | NA | ||
Yahoo Webscope | Yahoo Webscope | http://webscope.sandbox.yahoo.com/catalog.php | Public Domains | NA | ||
Academic Torrents | Academic Torrents of data sharing from UMB | http://academictorrents.com/browse.php | Data Catalog | Search Engines | NA | |
Qlik | DataMarket (Qlik) | https://datamarket.com/data/list/?q=all | Search Engines | NA | ||
Harvard Dataverse | Harvard Dataverse Network of scientific data | https://dataverse.harvard.edu/ | Search Engines | NA | ||
UMICH | ICPSR (UMICH) | http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp | Search Engines | NA | ||
Institute of | Institute of Education Sciences | http://eric.ed.gov | Search Engines | NA | ||
National Technical | National Technical Reports Library | http://www.ntis.gov/products/ntrl/ | Search Engines | NA | ||
beta | Open Data Certificates (beta) | https://certificates.theodi.org/en/datasets | Search Engines | NA | ||
OpenDataNetwork | OpenDataNetwork - A search engine of all Socrata powered data portals | http://www.opendatanetwork.com/ | Data Explorer | Search Engines | NA | |
Statista.com | Statista.com - Statistics and Studies | http://www.statista.com/ | Data Catalog | Search Engines | NA | |
Zenodo | Zenodo - An open dependable home for the long-tail of science | https://zenodo.org/collection/datasets | Search Engines | NA | ||
Ancestry.com Forum | Ancestry.com Forum Dataset over 10 years | http://www.cs.cmu.edu/~jelsas/data/ancestry.com/ | Social Networks | NA | ||
CMU Enron | CMU Enron Email of 150 users | http://www.cs.cmu.edu/~enron/ | Social Networks | NA | ||
GitHub Collaboration | GitHub Collaboration Archive | https://www.githubarchive.org/ | Social Networks | NA | ||
Google Scholar | Google Scholar citation relations | http://www3.cs.stonybrook.edu/~leman/data/gscholar.db | Social Networks | NA | ||
High-Resolution Contact | High-Resolution Contact Networks from Wearable Sensors | http://www.sociopatterns.org/datasets/ | Social Networks | NA | ||
Mobile Social | Mobile Social Networks from UMASS | https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks | Social Networks | NA | ||
Network Twitter | Network Twitter Data | http://snap.stanford.edu/data/higgs-twitter.html | Social Networks | NA | ||
Reddit Comments | Reddit Comments | https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/ | Social Networks | NA | ||
Social Twitter | Social Twitter Data | http://snap.stanford.edu/data/egonets-Twitter.html | Social Networks | NA | ||
SourceForge.net Research | SourceForge.net Research Data | http://www3.nd.edu/~oss/Data/data.html | Social Networks | NA | ||
Twitter Graph | Twitter Graph of entire Twitter site | http://an.kaist.ac.kr/traces/WWW2010.html | Social Networks | NA | ||
UNIMI/LAW Social | UNIMI/LAW Social Network Datasets | http://law.di.unimi.it/datasets.php | Social Networks | NA | ||
Yahoo! Graph | Yahoo! Graph and Social Data | http://webscope.sandbox.yahoo.com/catalog.php?datatype=g | Social Networks | NA | ||
Armed Conflict Location & Event Data Project | ACLED (Armed Conflict Location & Event Data Project) | http://www.acleddata.com/ | Social Sciences | NA | ||
Canadian Legal | Canadian Legal Information Institute | https://www.canlii.org/en/index.php | Social Sciences | NA | ||
Center for Systemic Peace Datasets | Center for Systemic Peace Datasets - Conflict Trends, Polities, State Fragility, etc | http://www.systemicpeace.org/ | Social Sciences | NA | ||
Correlates of | Correlates of War Project | http://www.correlatesofwar.org/ | Social Sciences | NA | ||
Cryptome Conspiracy | Cryptome Conspiracy Theory Items | http://cryptome.org | Social Sciences | NA | ||
Datacards | Datacards | http://datacards.org | Social Sciences | NA | ||
European Social | European Social Survey | http://www.europeansocialsurvey.org/data/ | Social Sciences | NA | ||
GDELT Global | GDELT Global Events Database | http://gdeltproject.org/data.html | Social Sciences | NA | ||
German Social | German Social Survey | http://www.gesis.org/en/home/ | Social Sciences | NA | ||
Global Religious | Global Religious Futures Project | http://www.globalreligiousfutures.org/ | Social Sciences | NA | ||
Humanitarian Data | Humanitarian Data Exchange | https://data.hdx.rwlabs.org/ | Social Sciences | NA | ||
Institute for Demographic Studies | Institute for Demographic Studies | http://www.ined.fr/en/ | Social Sciences | NA | ||
International Networks | International Networks Archive | http://www.princeton.edu/~ina/ | Social Sciences | NA | ||
International Social | International Social Survey Program ISSP | http://www.issp.org | Social Sciences | NA | ||
International Studies | International Studies Compendium Project | http://www.isacompendium.com/public/ | Social Sciences | NA | ||
James McGuire | James McGuire Cross National Data | http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/ | Social Sciences | NA | ||
MacroData Guide | MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste | http://nsd.uib.no | Social Sciences | NA | ||
Minnesota Population | Minnesota Population Center | https://www.ipums.org/ | Social Sciences | NA | ||
MIT Reality | MIT Reality Mining Dataset | http://realitycommons.media.mit.edu/realitymining.html | Social Sciences | NA | ||
Open Crime | Open Crime and Policing Data in England, Wales and Northern Ireland | https://data.police.uk/data/ | Social Sciences | NA | ||
Paul Hensel | Paul Hensel General International Data Page | http://www.paulhensel.org/dataintl.html | Social Sciences | NA | ||
PewResearch Internet | PewResearch Internet Survey Project | http://www.pewinternet.org/datasets/pages/2/ | Social Sciences | NA | ||
PewResearch Society | PewResearch Society Data Collection | http://www.pewresearch.org/data/download-datasets/ | Social Sciences | NA | ||
Political Polarity | Political Polarity Data | http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip | Social Sciences | NA | ||
StackExchange Data | StackExchange Data Explorer | http://data.stackexchange.com/help | Social Sciences | NA | ||
Terrorism Research | Terrorism Research and Analysis Consortium | http://www.trackingterrorism.org/ | Social Sciences | NA | ||
D-Lab | UCB's Archive of Social Science Data (D-Lab) | http://ucdata.berkeley.edu/ | Social Sciences | NA | ||
Uppsala Conflict | Uppsala Conflict Data Program | http://ucdp.uu.se/ | Social Sciences | NA | ||
UCLA Social | UCLA Social Sciences Data Archive | http://dataarchives.ss.ucla.edu/Home.DataPortals.htm | Social Sciences | NA | ||
UN Civil | UN Civil Society Database | http://esango.un.org/civilsociety/ | Social Sciences | NA | ||
Universities Worldwide | Universities Worldwide | http://univ.cc/ | Social Sciences | NA | ||
Upjohn Institute | Upjohn Institute for Employment Research | http://www.upjohn.org/services/resources/employment-research-data-center | Social Sciences | NA | ||
World Bank | World Bank Data | http://data.worldbank.org/ | Social Sciences | NA | ||
WorldPop project | WorldPop project - Worldwide human population distributions | http://www.worldpop.org.uk/data/get_data/ | Social Sciences | NA | ||
FLOSSmole data | FLOSSmole data about free, libre, and open source software development | http://flossdata.syr.edu/data/ | Software | NA | ||
NBA/NCAA/Euro | Basketball (NBA/NCAA/Euro) Player Database and Statistics | http://www.draftexpress.com/stats.php | Sports | NA | ||
Betfair Historical | Betfair Historical Exchange Data | http://data.betfair.com/ | Sports | NA | ||
Cricsheet | Cricsheet Matches (cricket) | http://cricsheet.org/ | Sports | NA | ||
Football/Soccer | Football/Soccer resources (data and APIs) | http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/ | Sports | NA | ||
Lahman's Baseball | Lahman's Baseball Database | http://www.seanlahman.com/baseball-archive/statistics/ | Sports | NA | ||
Retrosheet Baseball | Retrosheet Baseball Statistics | http://www.retrosheet.org/game.htm | Sports | NA | ||
Databanks International | Databanks International Cross National Time Series Data Archive | http://www.cntsdata.com | Time Series | NA | ||
Hard Drive | Hard Drive Failure Rates | https://www.backblaze.com/hard-drive-test-data.html | Time Series | NA | ||
Heart Rate | Heart Rate Time Series from MIT | http://ecg.mit.edu/time-series/ | Time Series | NA | ||
TSDL | Time Series Data Library (TSDL) from MU | https://datamarket.com/data/list/?q=provider:tsdl | Time Series | NA | ||
UC Riverside | UC Riverside Time Series Dataset | http://www.cs.ucr.edu/~eamonn/time_series_data/ | Time Series | NA | ||
Bay Area | Bay Area Bike Share Data | http://www.bayareabikeshare.com/open-data | Transportation | NA | ||
GeoLife GPS | GeoLife GPS Trajectory from Microsoft Research | http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/ | Transportation | NA | ||
German train | German train system by Deutsche Bahn | http://data.deutschebahn.com/datasets/ | Transportation | NA | ||
Hubway Million | Hubway Million Rides in MA | http://hubwaydatachallenge.org/trip-history-data/ | Transportation | NA | ||
Marine Traffic | Marine Traffic - ship tracks, port calls and more | http://www.marinetraffic.com/de/ais-api-services | Transportation | NA | ||
Montreal BIXI | Montreal BIXI Bike Share | https://montreal.bixi.com/donn%C3%A9es-libre-service | Transportation | NA | ||
OpenFlights | OpenFlights - airport, airline and route data | http://openflights.org/data.html | Transportation | NA | ||
Philadelphia Bike Share | Philadelphia Bike Share Stations (JSON) | https://www.rideindego.com/stations/json/ | Transportation | NA | ||
RITA Airline | RITA Airline On-Time Performance data | http://www.transtats.bts.gov/Tables.asp?DB_ID=120 | Transportation | NA | ||
TranStat | RITA/BTS transport data collection (TranStat) | http://www.transtats.bts.gov/DataIndex.asp | Transportation | NA | ||
Toronto Bike Share | Toronto Bike Share Stations (XML file) | http://www.bikesharetoronto.com/data/stations/bikeStations.xml | Transportation | NA | ||
TFL | Transport for London (TFL) | https://tfl.gov.uk/info-for/open-data-users/data-feeds | Transportation | NA | ||
TTS | Travel Tracker Survey (TTS) for Chicago | http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey | Transportation | NA | ||
BTS | U.S. Bureau of Transportation Statistics (BTS) | http://www.rita.dot.gov/bts/ | Transportation | NA | ||
Scientific Code Contributions | Database of Scientific Code Contributions | https://mozillascience.org/collaborate | Complementary Collections | NA | ||
Open Data Monitor | Open Data Monitor | http://opendatamonitor.eu | Data Catalog | Complementary Collections | NA | |
Quora Answer Datasets | Quora answers to large public datasets question | http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public | Complementary Collections | NA | ||
100 Statistics Datasets | 100+ Interesting Data Sets for Statistics | http://rs.io/100-interesting-data-sets-for-statistics/ | Complementary Collections | NA | ||