Table Of Contents

DataLoader

class mxnet.gluon.data.DataLoader(dataset, batch_size=None, shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False)[source]

Loads data from a dataset and returns mini-batches of data.

Parameters:
  • dataset (Dataset) – Source dataset. Note that numpy and mxnet arrays can be directly used as a Dataset.
  • batch_size (int) – Size of mini-batch.
  • shuffle (bool) – Whether to shuffle the samples.
  • sampler (Sampler) – The sampler to use. Specify either sampler or shuffle, not both.
  • last_batch ({'keep', 'discard', 'rollover'}) –

    How to handle the last batch if batch_size does not evenly divide len(dataset).

    keep - A batch with fewer samples than previous batches is returned.
    discard - The last batch is discarded if it is incomplete.
    rollover - The remaining samples are rolled over to the next epoch.

  • batch_sampler (Sampler) – A sampler that returns mini-batches. Do not specify batch_size, shuffle, sampler, or last_batch if batch_sampler is specified.
  • batchify_fn (callable) –

    Callback function to allow users to specify how to merge samples into a batch. Defaults to default_batchify_fn:

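    import numpy as np
    from mxnet import nd

    def default_batchify_fn(data):
        if isinstance(data[0], nd.NDArray):
            # Samples are already NDArrays: stack them along a new batch axis.
            return nd.stack(*data)
        elif isinstance(data[0], tuple):
            # Samples are tuples (e.g. (data, label)): batch each field separately.
            data = zip(*data)
            return [default_batchify_fn(i) for i in data]
        else:
            # Anything else (numbers, numpy arrays): convert via numpy.
            data = np.asarray(data)
            return nd.array(data, dtype=data.dtype)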
    
  • num_workers (int, default 0) – The number of multiprocessing workers to use for data preprocessing.
  • pin_memory (boolean, default False) – If True, the dataloader will copy NDArrays into pinned memory before returning them. Copying from CPU pinned memory to GPU is faster than from normal CPU memory.
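
The three last_batch modes described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the class name BatchIndexSampler and its attributes are hypothetical, and it only models which sample indices would end up in each batch, assuming sequential (unshuffled) sampling.

```python
class BatchIndexSampler:
    """Hypothetical stand-in that groups indices 0..length-1 into batches."""

    def __init__(self, length, batch_size, last_batch="keep"):
        if last_batch not in ("keep", "discard", "rollover"):
            raise ValueError("last_batch must be 'keep', 'discard' or 'rollover'")
        self.length = length
        self.batch_size = batch_size
        self.last_batch = last_batch
        self._carry = []  # leftover indices, used only by 'rollover'

    def __iter__(self):
        # Start from whatever the previous epoch rolled over (empty otherwise).
        batch, self._carry = self._carry, []
        for index in range(self.length):
            batch.append(index)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        if batch:
            if self.last_batch == "keep":
                yield batch          # smaller final batch is returned
            elif self.last_batch == "rollover":
                self._carry = batch  # saved for the next epoch
            # 'discard': the incomplete batch is simply dropped


keep = list(BatchIndexSampler(5, 2, "keep"))
discard = list(BatchIndexSampler(5, 2, "discard"))

rollover = BatchIndexSampler(5, 2, "rollover")
epoch_1 = list(rollover)
epoch_2 = list(rollover)  # begins with the index left over from epoch 1
```

With 5 samples and a batch size of 2, keep yields a final batch of one sample, discard drops it, and rollover moves it to the front of the second epoch.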
__init__(dataset, batch_size=None, shuffle=False, sampler=None, last_batch=None, batch_sampler=None, batchify_fn=None, num_workers=0, pin_memory=False)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(dataset[, batch_size, shuffle, …]) Initialize self.
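
As an illustration of the batchify_fn parameter described above, the following is a minimal sketch of a custom merge function for variable-length samples, which the stacking default cannot combine because stacking requires equal shapes. The name pad_batchify and its pad_value argument are hypothetical, and the sketch uses plain Python lists so that it runs without MXNet; in real use the padded result would presumably be converted to an NDArray before being returned.

```python
def pad_batchify(samples, pad_value=0):
    """Merge (sequence, label) samples into one padded batch.

    Hypothetical helper: pads every sequence on the right with pad_value
    up to the length of the longest sequence in the batch.
    """
    sequences, labels = zip(*samples)
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, list(labels)


# Two samples of different lengths, each paired with an integer label.
batch_data, batch_labels = pad_batchify([([1, 2, 3], 0), ([4], 1)])
```

A function of this shape takes the list of samples for one mini-batch and returns the merged batch, which is the contract the batchify_fn callback is documented to fulfil.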