This package is a community effort to provide a common interface for accessing widely used Machine Learning (ML) datasets. In contrast to other data-related Julia packages, MLDatasets.jl focuses specifically on downloading, unpacking, and accessing benchmark datasets. Functionality for data processing or visualization is provided only where it is specific to a particular dataset.

Features

  • Datasets are grouped into different categories
  • Each dataset in MLDatasets.jl is its own type
  • Datasets with an underlying graph structure: Cora, PubMed, CiteSeer
  • Datasets that do not fall into any of the other categories: Iris, BostonHousing
  • Datasets for language models
  • Vision-related datasets such as MNIST, CIFAR10, CIFAR100
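
As an illustration of the "each dataset is its own type" design, here is a minimal sketch of loading two of the datasets named above. It assumes a recent MLDatasets.jl release (0.7 or later) in which datasets are constructed as `MNIST(:train)` and `Iris()` and expose `features` and `targets` fields. Older releases used a different interface, so treat the exact calls as assumptions to check against the installed version. The `DATADEPS_ALWAYS_ACCEPT` setting is also an assumption about the environment: it tells DataDeps.jl to skip the interactive download prompt.

```julia
using MLDatasets: MNIST, Iris

# Assumption: accept dataset downloads without an interactive prompt.
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

# Constructing the type downloads (on first use) and loads the data.
trainset = MNIST(:train)

# A single observation is returned as a NamedTuple with
# `features` and `targets` fields.
first_obs = trainset[1]
println(size(first_obs.features))
println(first_obs.targets)

# Indexing with `:` returns all observations at once.
X_train, y_train = trainset[:]
println(size(X_train))
println(length(y_train))

# A dataset from the miscellaneous category follows the same pattern.
iris = Iris()
println(size(iris.features))
```

The same construct-then-index pattern is intended to carry over to the other categories, although graph datasets such as Cora store their data in graph-specific fields rather than plain feature arrays.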
