Table Of Contents

CSVIter

mxnet.io.CSVIter(*args, **kwargs)

Returns the CSV file iterator.

In this function, the data_shape parameter is used to set the shape of each line of the input data.
If a row in an input file is 1,2,3,4,5,6 and data_shape is (3,2), that row
will be reshaped, yielding the array [[1,2],[3,4],[5,6]] of shape (3,2).

By default, the CSVIter has round_batch parameter set to True. So, if batch_size
is 3 and there are 4 total rows in CSV file, 2 more examples
are consumed at the first round. If reset function is called after first round,
the call is ignored and remaining examples are returned in the second round.

If one wants all the instances in the second round after calling reset, make sure
to set round_batch to False.

If data_csv = 'data/' is set, then all the files in this directory will be read.

``reset()`` is expected to be called only after a complete pass of data.

By default, the CSVIter parses all entries in the data file as float32 data type,
if dtype argument is set to be 'int32' or 'int64' then CSVIter will parse all entries in the file
as int32 or int64 data type accordingly.

Examples::

    // Contents of CSV file data/data.csv.
    1,2,3
    2,3,4
    3,4,5
    4,5,6

    // Creates a CSVIter with batch_size=2 and default round_batch=True.
    CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,),
                            batch_size = 2)

    // Two batches read from the above iterator are as follows:
    [[ 1.  2.  3.]
     [ 2.  3.  4.]]
    [[ 3.  4.  5.]
     [ 4.  5.  6.]]

    // Creates a CSVIter with default round_batch set to True.
    CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,),
                            batch_size = 3)

    // Two batches read from the above iterator in the first pass are as follows:
    [[1.  2.  3.]
     [2.  3.  4.]
     [3.  4.  5.]]

    [[4.  5.  6.]
     [1.  2.  3.]
     [2.  3.  4.]]

    // Now, reset method is called.
    CSVIter.reset()

    // Batch read from the above iterator in the second pass is as follows:
    [[ 3.  4.  5.]
     [ 4.  5.  6.]
     [ 1.  2.  3.]]

    // Creates a CSVIter with round_batch=False.
    CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,),
                            batch_size = 3, round_batch=False)

    // Contents of two batches read from the above iterator in both passes, after calling
    // reset method before second pass, is as follows:
    [[1.  2.  3.]
     [2.  3.  4.]
     [3.  4.  5.]]

    [[4.  5.  6.]
     [2.  3.  4.]
     [3.  4.  5.]]

    // Creates a CSVIter with dtype='int32'
    CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,),
                            batch_size = 3, round_batch=False, dtype='int32')

    // Contents of two batches read from the above iterator in both passes, after calling
    // reset method before second pass, is as follows:
    [[1  2  3]
     [2  3  4]
     [3  4  5]]

    [[4  5  6]
     [2  3  4]
     [3  4  5]]

Defined in src/io/iter_csv.cc:L308
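The reshaping by data_shape and the first-pass wrap-around described above can be modelled in plain Python without MXNet. This is only an illustrative sketch, not MXNet's implementation: the helper names reshape_row and first_pass_batches are hypothetical, and the model covers the first pass only (it does not reproduce what happens after reset()).

```python
def reshape_row(row, data_shape):
    """Reshape one flat CSV row into nested lists of shape data_shape.

    Hypothetical helper; handles only 1-D and 2-D shapes.
    """
    if len(data_shape) == 1:
        if len(row) != data_shape[0]:
            raise ValueError("row length does not match data_shape")
        return list(row)
    rows, cols = data_shape
    if len(row) != rows * cols:
        raise ValueError("row length does not match data_shape")
    # Fill row-major: the first `cols` values form the first inner list.
    return [list(row[r * cols:(r + 1) * cols]) for r in range(rows)]


def first_pass_batches(examples, batch_size):
    """Yield first-pass batches, wrapping to the start to fill the last one.

    Models the round_batch=True behaviour for the first pass only.
    """
    n = len(examples)
    for start in range(0, n, batch_size):
        # Indices past the end wrap around to the beginning of the data.
        yield [examples[(start + i) % n] for i in range(batch_size)]


rows = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
for batch in first_pass_batches(rows, 3):
    print(batch)
```

With 4 rows and a batch size of 3, the second batch borrows the first 2 rows, which corresponds to the "2 more examples" mentioned above.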

Parameters:
  • data_csv (string, required) – The input CSV file or a directory path.
  • data_shape (Shape(tuple), required) – The shape of one example.
  • label_csv (string, optional, default='NULL') – The input CSV file or a directory path. If NULL, all labels will be returned as 0.
  • label_shape (Shape(tuple), optional, default=[1]) – The shape of one label.
  • batch_size (int (non-negative), required) – Batch size.
  • round_batch (boolean, optional, default=1) – Whether to use round robin to handle overflow batch or not.
  • prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
  • dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
Returns:

The result iterator.

Return type:

MXDataIter
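As a rough usage sketch combining data_csv, label_csv and label_shape, the snippet below writes two small headerless CSV files and iterates over them. It assumes an MXNet 1.x installation; the file names, the label values and the helpers write_csv, make_iter and demo are hypothetical and chosen only for illustration.

```python
import csv
import os
import tempfile


def write_csv(path, rows):
    """Write rows (lists of numbers) to path as a headerless CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)


def make_iter(data_path, label_path, batch_size=2):
    """Build a CSVIter over a data file and a matching label file."""
    import mxnet as mx  # imported lazily so write_csv works without MXNet

    return mx.io.CSVIter(
        data_csv=data_path,
        data_shape=(3,),      # each data row holds 3 values
        label_csv=label_path,
        label_shape=(1,),     # each label row holds 1 value
        batch_size=batch_size,
        round_batch=False,
    )


def demo():
    """Write sample files to a temporary directory and print every batch."""
    tmp = tempfile.mkdtemp()
    data_path = os.path.join(tmp, "data.csv")
    label_path = os.path.join(tmp, "label.csv")
    write_csv(data_path, [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])
    write_csv(label_path, [[0], [1], [0], [1]])  # made-up labels

    for batch in make_iter(data_path, label_path):
        # batch.data and batch.label are lists of NDArrays.
        print(batch.data[0].asnumpy(), batch.label[0].asnumpy(), batch.pad)
```

Calling demo() runs the full loop. If label_csv is left out, the parameter list above says all labels are returned as 0.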