Data
This module includes classes for loading and prefetching data batches.
Example usage:
    import numpy as np
    from PIL import Image
    from singa import image_tool
    from singa.data import ImageBatchIter

    tool = image_tool.ImageTool()

    def image_transform(img_path):
        # load the image, resize it, take a random 96x96 crop, flip it,
        # and return the resulting list of augmented images
        return tool.load(img_path).resize_by_range(
            (112, 128)).random_crop(
            (96, 96)).flip().get()

    data = ImageBatchIter('train.txt', 3,
                          image_transform, shuffle=True, delimiter=',',
                          image_folder='images/',
                          capacity=10)
    data.start()
    # imgs is a numpy array for a batch of images,
    # shape: batch_size, 3 (RGB), height, width
    imgs, labels = data.next()

    # convert numpy array back into images
    for idx in range(imgs.shape[0]):
        img = Image.fromarray(imgs[idx].astype(np.uint8).transpose(1, 2, 0),
                              'RGB')
        img.save('img%d.png' % idx)
    data.end()
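The example above expects train.txt to hold one image_path_suffix,meta_info pair per line. As a minimal sketch, the following writes such a file and splits it back along the lines documented for img_list_file: integer labels when every line carries a label index, the raw strings otherwise, and None where a line has no meta info. The file entries are made up, and parse_list_file is a hypothetical helper, not part of singa.data; it returns plain Python lists rather than a numpy array.

```python
import os
import tempfile


def parse_list_file(path, delimiter=','):
    """Hypothetical helper: split each line into (image path, meta info)."""
    paths, metas = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # split at the first delimiter only
            parts = line.split(delimiter, 1)
            paths.append(parts[0].strip())
            metas.append(parts[1].strip() if len(parts) == 2 else None)
    # treat meta info as label indices only if every line has a digit string
    if all(m is not None and m.isdigit() for m in metas):
        metas = [int(m) for m in metas]
    return paths, metas


# made-up entries; the paths are relative to image_folder
rows = [('cat/001.jpg', 0), ('dog/002.jpg', 1), ('cat/003.jpg', 0)]
list_path = os.path.join(tempfile.mkdtemp(), 'train.txt')
with open(list_path, 'w') as f:
    for suffix, label in rows:
        f.write('%s,%d\n' % (suffix, label))

paths, labels = parse_list_file(list_path)
print(paths)
print(labels)
```

Splitting at the first delimiter only keeps the path intact; this is also why meta_info itself must not contain the delimiter.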
class singa.data.ImageBatchIter(img_list_file, batch_size, image_transform, shuffle=True, delimiter=' ', image_folder=None, capacity=10)
Utility for iterating over an image dataset to get mini-batches.
- Parameters
img_list_file (str) – name of the file containing image metadata; each line has the form image_path_suffix delimiter meta_info, where meta_info can be a label index, a label string, etc., and must not contain the delimiter. If the meta_info of every image is a label index, the label indices are parsed into a numpy array of length batch_size (for compatibility); otherwise, a list of meta_info is returned; if no meta_info is available, a list of None is returned.
batch_size (int) – number of samples in one mini-batch
image_transform – a function for image augmentation; it accepts the full image path and outputs a list of augmented images.
shuffle (boolean) – True for shuffling images in the list
delimiter (char) – delimiter between image_path_suffix and meta_info, e.g., space or comma
image_folder (str) – prefix of the image path
capacity (int) – the maximum number of mini-batches held in the internal queue.
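The capacity parameter bounds how many prepared mini-batches may wait in the internal queue. The sketch below illustrates that idea with a background thread and Python's standard queue.Queue. It is an illustration under assumptions, not singa.data's actual implementation; PrefetchIter and make_batch are hypothetical names.

```python
import queue
import threading


class PrefetchIter:
    """Hypothetical prefetcher: a worker thread fills a bounded queue."""

    def __init__(self, make_batch, num_batches, capacity=10):
        self.make_batch = make_batch
        self.num_batches = num_batches
        # the queue never holds more than `capacity` batches
        self.queue = queue.Queue(maxsize=capacity)
        self.stop = threading.Event()
        self.worker = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        for i in range(self.num_batches):
            batch = self.make_batch(i)
            while not self.stop.is_set():
                try:
                    # blocks (with a timeout) while the queue is full
                    self.queue.put(batch, timeout=0.1)
                    break
                except queue.Full:
                    continue
            if self.stop.is_set():
                return

    def start(self):
        self.worker.start()

    def next(self):
        # blocks until the worker has produced a batch
        return self.queue.get()

    def end(self):
        self.stop.set()
        self.worker.join()


it = PrefetchIter(lambda i: [i, i + 1], num_batches=5, capacity=2)
it.start()
batches = [it.next() for _ in range(5)]
it.end()
print(batches)
```

A small capacity keeps memory use low, while a larger one lets the producer run further ahead of the consumer when batch preparation time varies.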