
Vision utilities.


datasets.MNIST([root, train, transform])

MNIST handwritten digits dataset from

datasets.FashionMNIST([root, train, transform])

A dataset of Zalando’s article images consisting of fashion products, a drop-in replacement for the original MNIST dataset from

datasets.CIFAR10([root, train, transform])

CIFAR10 image classification dataset from

datasets.CIFAR100([root, fine_label, train, …])

CIFAR100 image classification dataset from

datasets.ImageRecordDataset(filename[, …])

A dataset wrapping over a RecordIO file containing images.

datasets.ImageFolderDataset(root[, flag, …])

A dataset for loading image files stored in a folder structure.
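
As a rough illustration of the kind of folder structure such a dataset works with, the sketch below assumes one sub-directory per class under the root, with class indices assigned by sorted folder name. Both the layout and the hypothetical helper `list_labeled_images` are assumptions made for illustration; the helper uses only the standard library and is not part of the library documented here.

```python
import os
import tempfile


def list_labeled_images(root, exts=(".jpg", ".jpeg", ".png")):
    """Return (classes, items) for a root/<class>/<image> layout.

    Hypothetical helper: class indices follow sorted folder names,
    which is an assumed convention, not the library's own code.
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    items = []
    for label, name in enumerate(classes):
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            if fname.lower().endswith(exts):
                items.append((os.path.join(folder, fname), label))
    return classes, items


# Build a tiny throwaway tree: root/cat/*.jpg and root/dog/*.jpg
with tempfile.TemporaryDirectory() as root:
    for cls, files in {"dog": ["d1.jpg"], "cat": ["c1.jpg", "c2.jpg"]}.items():
        os.makedirs(os.path.join(root, cls))
        for f in files:
            open(os.path.join(root, cls, f), "wb").close()
    classes, items = list_labeled_images(root)
    labels = [label for _, label in items]
```

Each item pairs a file path with an integer label, which is the shape of sample an image classification pipeline usually consumes.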

Data transformations


transforms.Compose(transforms)

Sequentially composes multiple transforms.
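
Sequential composition means the output of each transform becomes the input of the next. The sketch below shows the idea in plain Python; `compose`, `scale`, and `center` are hypothetical names used only for illustration and are not the library's implementation.

```python
def compose(*fns):
    """Return a callable that applies the given functions left to right."""
    def run(x):
        for fn in fns:
            x = fn(x)
        return x
    return run


def scale(values):
    # Map 0..255 integers to 0.0..1.0 floats.
    return [v / 255.0 for v in values]


def center(values):
    # Shift values so that 0.5 becomes 0.0.
    return [v - 0.5 for v in values]


pipeline = compose(scale, center)
result = pipeline([0, 255])
```

Order matters: swapping `scale` and `center` here would subtract 0.5 from the raw 0..255 values before dividing.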


transforms.Cast([dtype])

Cast input to a specific data type.


transforms.ToTensor()

Converts an image NDArray to a tensor NDArray.
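
The image-to-tensor conversion can be pictured as a layout change plus a rescale. The sketch below assumes an (H x W x C) input with integer values in 0..255 and a (C x H x W) floating-point output divided by 255; these details are assumptions not stated in the line above, and the hypothetical `to_tensor` function works on nested lists rather than NDArrays.

```python
def to_tensor(image_hwc):
    """Convert nested lists [H][W][C] of 0..255 ints to [C][H][W] floats."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [
            [image_hwc[y][x][c] / 255.0 for x in range(width)]
            for y in range(height)
        ]
        for c in range(channels)
    ]


# A 1 x 2 image with 3 channels per pixel.
image = [[[0, 128, 255], [255, 0, 51]]]
tensor = to_tensor(image)
```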

transforms.Normalize(mean, std)

Normalize a tensor of shape (C x H x W) with mean and standard deviation.
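
Per-channel normalization is usually computed as `(value - mean[c]) / std[c]` for each channel `c`. The following sketch assumes that formula and uses nested lists in (C x H x W) order; `normalize` is a hypothetical stand-in, not the library's code.

```python
def normalize(tensor_chw, mean, std):
    """Apply (v - mean[c]) / std[c] to every value of channel c."""
    return [
        [[(v - m) / s for v in row] for row in channel]
        for channel, m, s in zip(tensor_chw, mean, std)
    ]


# Two channels, each 1 x 2.
tensor = [[[0.5, 1.0]], [[0.0, 0.25]]]
normalized = normalize(tensor, mean=(0.5, 0.25), std=(0.5, 0.25))
```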

transforms.RandomResizedCrop(size[, scale, …])

Crop the input image with random scale and aspect ratio.

transforms.CenterCrop(size[, interpolation])

Crops the image src to the given size by trimming on all four sides and preserving the center of the image.
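
Trimming equally on all four sides amounts to picking the top-left corner so that the crop window is centered. A minimal sketch, assuming the requested crop is no larger than the image (the hypothetical helpers below do not handle the opposite case):

```python
def center_crop_box(width, height, crop_w, crop_h):
    """Return (x0, y0, crop_w, crop_h) for a centered crop window."""
    x0 = (width - crop_w) // 2
    y0 = (height - crop_h) // 2
    return x0, y0, crop_w, crop_h


def center_crop(image, crop_w, crop_h):
    """Crop a nested-list image [H][W] around its center."""
    x0, y0, w, h = center_crop_box(len(image[0]), len(image), crop_w, crop_h)
    return [row[x0:x0 + w] for row in image[y0:y0 + h]]


box = center_crop_box(500, 375, 224, 224)
small = center_crop([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]], 2, 1)
```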

transforms.Resize(size[, keep_ratio, …])

Resize an image to the given size.


transforms.RandomFlipLeftRight()

Randomly flip the input image left to right with a probability of 0.5.


transforms.RandomFlipTopBottom()

Randomly flip the input image top to bottom with a probability of 0.5.
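
The two random flips described above can be sketched on a nested-list image as follows. The function names and the `p` and `rng` parameters are hypothetical and exist only to make the example testable; the descriptions above fix the probability at 0.5.

```python
import random


def random_flip_left_right(image, p=0.5, rng=random):
    """Reverse every row (mirror horizontally) with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


def random_flip_top_bottom(image, p=0.5, rng=random):
    """Reverse the row order (mirror vertically) with probability p."""
    if rng.random() < p:
        return image[::-1]
    return image


image = [[1, 2], [3, 4]]
mirrored = random_flip_left_right(image, p=1.0)   # always flips
upside_down = random_flip_top_bottom(image, p=1.0)
unchanged = random_flip_left_right(image, p=0.0)  # never flips
```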


transforms.RandomBrightness(brightness)

Randomly jitters image brightness with a factor chosen from [max(0, 1 - brightness), 1 + brightness].


transforms.RandomContrast(contrast)

Randomly jitters image contrast with a factor chosen from [max(0, 1 - contrast), 1 + contrast].


transforms.RandomSaturation(saturation)

Randomly jitters image saturation with a factor chosen from [max(0, 1 - saturation), 1 + saturation].


transforms.RandomHue(hue)

Randomly jitters image hue with a factor chosen from [max(0, 1 - hue), 1 + hue].
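
The four jitter descriptions above share the same factor range, [max(0, 1 - x), 1 + x]. The sketch below samples such a factor uniformly and, as one possible use, scales pixel values by it to change brightness. Uniform sampling, the multiplicative brightness model, and the names `sample_jitter_factor` and `jitter_brightness` are assumptions for illustration; clamping to a valid pixel range is omitted.

```python
import random


def sample_jitter_factor(strength, rng=random):
    """Draw a factor from [max(0, 1 - strength), 1 + strength]."""
    low = max(0.0, 1.0 - strength)
    high = 1.0 + strength
    return rng.uniform(low, high)


def jitter_brightness(pixels, strength, rng=random):
    """Scale every pixel by one shared, randomly drawn factor."""
    alpha = sample_jitter_factor(strength, rng)
    return [p * alpha for p in pixels]


rng = random.Random(0)  # seeded so the run is repeatable
factors = [sample_jitter_factor(0.4, rng) for _ in range(1000)]
wide = [sample_jitter_factor(1.5, rng) for _ in range(1000)]
brighter_or_darker = jitter_brightness([10.0, 20.0], 0.4, rng)
```

With a strength of 0.4 the factor stays between 0.6 and 1.4; with a strength above 1 the lower bound is clipped to 0 rather than going negative.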

transforms.RandomColorJitter([brightness, …])

Randomly jitters the brightness, contrast, saturation, and hue of an image.


transforms.RandomLighting(alpha)

Add AlexNet-style PCA-based noise to an image.