
LibSVMIter

mxnet.io.LibSVMIter(*args, **kwargs)

Returns the LibSVM iterator, which returns data with csr storage type. This iterator is experimental and should be used with care.

The input data is stored in a format similar to the LibSVM file format, except that the indices are expected to be zero-based instead of one-based, and the column indices for each row are expected to be sorted in ascending order. Details of the LibSVM format are available `here <https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/>`_.

The data_shape parameter is used to set the shape of each line of the data. Both data_shape and label_shape are expected to be one-dimensional.

The data_libsvm parameter is used to set the path to the input LibSVM file. When it is set to a directory, all the files in the directory will be read.

When label_libsvm is set to NULL, both data and label are read from the file specified by data_libsvm. In this case, the data is stored in csr storage type, while the label is a 1D dense array.

LibSVMIter only supports the round_batch parameter set to True. Therefore, if batch_size is 3 and there are 4 total rows in the libsvm file, the last batch of the first round wraps around and consumes 2 more examples from the beginning of the file, as the second batch in the example below shows.

When num_parts and part_index are provided, the data is split into num_parts partitions, and the iterator only reads the part_index-th partition. However, the partitions are not guaranteed to be even.

``reset()`` is expected to be called only after a complete pass of data.

Example::

    # Contents of libsvm file data.t.
    1.0 0:0.5 2:1.2
    -2.0
    -3.0 0:0.6 1:2.4 2:1.2
    4 2:-1.2

    # Creates a LibSVMIter with batch_size=3.
    >>> data_iter = mx.io.LibSVMIter(data_libsvm = 'data.t', data_shape = (3,), batch_size = 3)
    # The data of the first batch is stored in csr storage type
    >>> batch = data_iter.next()
    >>> csr = batch.data[0]
    >>> csr
    <CSRNDArray 3x3 @cpu(0)>
    >>> csr.asnumpy()
    [[ 0.5  0.   1.2]
     [ 0.   0.   0. ]
     [ 0.6  2.4  1.2]]
    # The label of the first batch
    >>> label = batch.label[0]
    >>> label
    [ 1. -2. -3.]
    <NDArray 3 @cpu(0)>

    >>> second_batch = data_iter.next()
    # The data of the second batch
    >>> second_batch.data[0].asnumpy()
    [[ 0.   0.  -1.2]
     [ 0.5  0.   1.2]
     [ 0.   0.   0. ]]
    # The label of the second batch
    >>> second_batch.label[0].asnumpy()
    [ 4.  1. -2.]

    # To restart the iterator for the second pass of the data
    >>> data_iter.reset()

When label_libsvm is set to the path to another LibSVM file, data is read from data_libsvm and label from label_libsvm. In this case, both data and label are stored in the csr format. The label column in the data_libsvm file is ignored.

Example::

    # Contents of libsvm file label.t
    1.0
    -2.0 0:0.125
    -3.0 2:1.2
    4 1:1.0 2:-1.2

    # Creates a LibSVMIter with specified label file
    >>> data_iter = mx.io.LibSVMIter(data_libsvm = 'data.t', data_shape = (3,),
                     label_libsvm = 'label.t', label_shape = (3,), batch_size = 3)

    # Both data and label are in csr storage type
    >>> batch = data_iter.next()
    >>> csr_data = batch.data[0]
    >>> csr_data
    <CSRNDArray 3x3 @cpu(0)>
    >>> csr_data.asnumpy()
    [[ 0.5  0.   1.2]
     [ 0.   0.   0. ]
     [ 0.6  2.4  1.2]]
    >>> csr_label = batch.label[0]
    >>> csr_label
    <CSRNDArray 3x3 @cpu(0)>
    >>> csr_label.asnumpy()
    [[ 0.     0.     0.   ]
     [ 0.125  0.     0.   ]
     [ 0.     0.     1.2  ]]

Defined in src/io/iter_libsvm.cc:L298
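To make the expected file layout concrete, the following is a minimal pure-Python sketch of how the zero-based lines of data.t map onto CSR arrays (values, column indices, row pointers). It does not use MXNet and is not the iterator's actual parser; the helper names `parse_libsvm_lines` and `csr_to_dense` are hypothetical, and rejecting repeated column indices is an assumption on top of the documented "ascending order" requirement.

```python
def parse_libsvm_lines(lines, num_cols):
    """Parse zero-based LibSVM lines into labels plus CSR arrays.

    Illustrative helper only (hypothetical name, not part of MXNet).
    """
    labels, data, indices, indptr = [], [], [], [0]
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        labels.append(float(tokens[0]))
        previous = -1
        for token in tokens[1:]:
            index_text, value_text = token.split(":")
            index = int(index_text)
            # Documented requirement: indices are zero-based.
            if not 0 <= index < num_cols:
                raise ValueError(f"column index {index} out of range")
            # Documented requirement: ascending order within a row.
            # Rejecting duplicates is an assumption of this sketch.
            if index <= previous:
                raise ValueError("column indices must be in ascending order")
            previous = index
            indices.append(index)
            data.append(float(value_text))
        # Each row ends where the next one begins in the flat arrays.
        indptr.append(len(indices))
    return labels, data, indices, indptr


def csr_to_dense(data, indices, indptr, num_cols):
    """Expand CSR arrays into a list of dense rows."""
    rows = []
    for start, end in zip(indptr[:-1], indptr[1:]):
        row = [0.0] * num_cols
        for position in range(start, end):
            row[indices[position]] = data[position]
        rows.append(row)
    return rows


# The four lines of data.t from the example above.
DATA_T = ["1.0 0:0.5 2:1.2", "-2.0", "-3.0 0:0.6 1:2.4 2:1.2", "4 2:-1.2"]
labels, data, indices, indptr = parse_libsvm_lines(DATA_T, num_cols=3)
dense = csr_to_dense(data, indices, indptr, num_cols=3)
```

The first three rows of `dense` correspond to the first batch shown in the example, and the empty second row appears in `indptr` as two equal consecutive entries.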

Parameters:
  • data_libsvm (string, required) – The input LibSVM data file with zero-based indices, or a directory path.
  • data_shape (Shape(tuple), required) – The shape of one example.
  • label_libsvm (string, optional, default='NULL') – The input LibSVM label file or a directory path. If NULL, all labels will be read from data_libsvm.
  • label_shape (Shape(tuple), optional, default=[1]) – The shape of one label.
  • num_parts (int, optional, default='1') – The number of partitions to split the data into.
  • part_index (int, optional, default='0') – The index of the partition to read.
  • batch_size (int (non-negative), required) – Batch size.
  • round_batch (boolean, optional, default=1) – Whether to use round robin to fill the last (overflow) batch with examples from the start of the data.
  • prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
  • dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'},optional, default='None') – Output data type. None means no change.
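
The interaction between batch_size and round_batch can be illustrated with a small pure-Python simulation. This is only a sketch of the wrap-around behaviour described in the docstring, not the iterator's implementation; the function `round_batches` and its `wrapped` counter are hypothetical names introduced for illustration.

```python
def round_batches(rows, batch_size):
    """Yield (batch, wrapped) pairs for one pass over rows.

    The last batch is filled by wrapping around to the start of rows;
    wrapped counts how many of its entries were taken that way.
    Hypothetical helper, not part of MXNet.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = len(rows)
    for start in range(0, total, batch_size):
        # Modulo indexing wraps past the end back to the first rows.
        batch = [rows[(start + offset) % total] for offset in range(batch_size)]
        wrapped = max(0, start + batch_size - total)
        yield batch, wrapped


# Labels of the four rows in data.t, batched three at a time.
labels = [1.0, -2.0, -3.0, 4.0]
batches = list(round_batches(labels, batch_size=3))
```

With 4 rows and a batch size of 3, the second batch holds the last row followed by the first two rows, which matches the labels of the second batch in the documented example.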
Returns:

The result iterator.

Return type:

MXDataIter