Transforms¶
This module provides common image transforms. They can be chained together using Compose.
class torchvision.transforms.Compose(transforms)[source]¶
Composes several transforms together.
Parameters: transforms (list of Transform objects) – list of transforms to compose.
Example
>>> transforms.Compose([
>>>     transforms.CenterCrop(10),
>>>     transforms.ToTensor(),
>>> ])
Transforms on PIL Image¶
class torchvision.transforms.Resize(size, interpolation=2)[source]¶
Resize the input PIL Image to the given size.
Parameters:
- size (sequence or int) – Desired output size. If size is a sequence like (h, w), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number, keeping the aspect ratio; i.e., if height > width, the image will be rescaled to (size * height / width, size).
- interpolation (int, optional) – Desired interpolation. Default is PIL.Image.BILINEAR.
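For example, to match the smaller edge of an image to 256 pixels while keeping the aspect ratio, or to force an exact (h, w) output (a sketch; img is assumed to be a PIL Image loaded elsewhere):
>>> resized = transforms.Resize(256)(img)        # smaller edge becomes 256
>>> exact = transforms.Resize((224, 224))(img)   # output is exactly 224 x 224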
class torchvision.transforms.Scale(*args, **kwargs)[source]¶
Note: This transform is deprecated in favor of Resize.
class torchvision.transforms.CenterCrop(size)[source]¶
Crops the given PIL Image at the center.
Parameters: size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
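A common evaluation-time pattern chains Resize with CenterCrop (a sketch; the 256/224 sizes are conventional choices, not required by the API):
>>> eval_transform = transforms.Compose([
>>>     transforms.Resize(256),
>>>     transforms.CenterCrop(224),
>>> ])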
class torchvision.transforms.RandomCrop(size, padding=0)[source]¶
Crop the given PIL Image at a random location.
Parameters:
- size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
- padding (int or sequence, optional) – Optional padding on each border of the image. Default is 0, i.e. no padding. If a sequence of length 4 is provided, it is used to pad the left, top, right and bottom borders respectively.
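For instance, the pad-then-crop augmentation often used on 32 x 32 images (a sketch with illustrative sizes; img is assumed to be a PIL Image):
>>> augment = transforms.RandomCrop(32, padding=4)
>>> cropped = augment(img)  # pads 4 pixels on every border, then crops a random 32 x 32 patch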
class torchvision.transforms.RandomHorizontalFlip[source]¶
Horizontally flip the given PIL Image randomly with a probability of 0.5.
class torchvision.transforms.RandomVerticalFlip[source]¶
Vertically flip the given PIL Image randomly with a probability of 0.5.
class torchvision.transforms.RandomResizedCrop(size, interpolation=2)[source]¶
Crop the given PIL Image to a random size and aspect ratio.
A crop of random size (0.08 to 1.0 of the original size) and of random aspect ratio (3/4 to 4/3 of the original aspect ratio) is made. This crop is finally resized to the given size. This transform is popularly used to train the Inception networks.
Parameters:
- size – expected output size of each edge
- interpolation – Desired interpolation. Default: PIL.Image.BILINEAR
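A typical training pipeline built around this transform might look as follows (a sketch; the 224 size and the horizontal flip are conventional choices, and ToTensor is described under Conversion Transforms below):
>>> train_transform = transforms.Compose([
>>>     transforms.RandomResizedCrop(224),
>>>     transforms.RandomHorizontalFlip(),
>>>     transforms.ToTensor(),
>>> ])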
class torchvision.transforms.RandomSizedCrop(*args, **kwargs)[source]¶
Note: This transform is deprecated in favor of RandomResizedCrop.
class torchvision.transforms.FiveCrop(size)[source]¶
Crop the given PIL Image into the four corners and the central crop.
Note: this transform returns a tuple of images, so there may be a mismatch in the number of inputs and targets your Dataset returns; see the sketch below for one way to handle this.
Parameters: size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
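One way to handle the tuple is to stack the crops into a single 4D tensor with a Lambda transform and average the model's predictions over the crops at test time (a sketch, assuming transforms.Lambda is available and that model and the 224 crop size are stand-ins for your own):
>>> transform = transforms.Compose([
>>>     transforms.FiveCrop(224),  # returns a tuple of five PIL Images
>>>     transforms.Lambda(lambda crops: torch.stack([transforms.ToTensor()(crop) for crop in crops])),
>>> ])
>>> # In the test loop, after batching, input has shape (bs, ncrops, c, h, w):
>>> bs, ncrops, c, h, w = input.size()
>>> result = model(input.view(-1, c, h, w))           # fuse batch size and ncrops
>>> result_avg = result.view(bs, ncrops, -1).mean(1)  # average over the crops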
class torchvision.transforms.TenCrop(size, vertical_flip=False)[source]¶
Crop the given PIL Image into the four corners and the central crop, plus the flipped version of each of these (horizontal flipping is used by default).
Note: this transform returns a tuple of images, so there may be a mismatch in the number of inputs and targets your Dataset returns.
Parameters:
- size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
- vertical_flip (bool) – Use vertical flipping instead of horizontal flipping.
class torchvision.transforms.Pad(padding, fill=0)[source]¶
Pad the given PIL Image on all sides with the given “pad” value.
Parameters:
- padding (int or tuple) – Padding on each border. If a single int is provided, this is used to pad all borders. If a tuple of length 2 is provided, this is the padding on left/right and top/bottom respectively. If a tuple of length 4 is provided, this is the padding for the left, top, right and bottom borders respectively.
- fill – Pixel fill value. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively.
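For example, to pad left/right by 10 pixels and top/bottom by 20, filling with mid-gray (a sketch with illustrative values; img is assumed to be a PIL Image):
>>> padded = transforms.Pad((10, 20), fill=128)(img)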
class torchvision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)[source]¶
Randomly change the brightness, contrast and saturation of an image.
Parameters:
- brightness (float) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].
- contrast (float) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].
- saturation (float) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].
- hue (float) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue]. hue should be >= 0 and <= 0.5.
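For instance, with illustrative jitter strengths (img is assumed to be a PIL Image):
>>> jitter = transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1)
>>> jittered = jitter(img)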
Transforms on torch.*Tensor¶
class torchvision.transforms.Normalize(mean, std)[source]¶
Normalize a tensor image with mean and standard deviation.
Given mean: (R, G, B) and std: (R, G, B), this transform will normalize each channel of the input torch.*Tensor, i.e. channel = (channel - mean) / std.
Parameters:
- mean (sequence) – Sequence of means for the R, G, B channels respectively.
- std (sequence) – Sequence of standard deviations for the R, G, B channels respectively.
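For example, with the per-channel statistics commonly used for models trained on ImageNet; since Normalize operates on tensors, it is applied after ToTensor:
>>> normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
>>>                                  std=[0.229, 0.224, 0.225])
>>> transform = transforms.Compose([transforms.ToTensor(), normalize])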
Conversion Transforms¶
class torchvision.transforms.ToTensor[source]¶
Convert a PIL Image or numpy.ndarray to tensor.
Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
__call__(pic)[source]¶
Parameters: pic (PIL Image or numpy.ndarray) – Image to be converted to tensor.
Returns: Converted image.
Return type: Tensor
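A quick illustration (a sketch; img is assumed to be an RGB PIL Image):
>>> t = transforms.ToTensor()(img)
>>> t.size()         # torch.Size([3, H, W]) for an RGB input
>>> t.min(), t.max() # both lie within [0.0, 1.0]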
class torchvision.transforms.ToPILImage[source]¶
Convert a tensor or an ndarray to a PIL Image.
Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape H x W x C to a PIL Image while preserving the value range.
__call__(pic)[source]¶
Parameters: pic (Tensor or numpy.ndarray) – Image to be converted to PIL Image.
Returns: Image converted to PIL Image.
Return type: PIL Image
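For example, to convert a tensor back to a PIL Image for visualization (a sketch, assuming tensor is a C x H x W FloatTensor in [0.0, 1.0]; the output path is hypothetical):
>>> pil_img = transforms.ToPILImage()(tensor)
>>> pil_img.save('preview.png')  # hypothetical path, e.g. for a quick visual check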