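In deep learning, and in computer vision in particular, the training set is often too small, or some classes have very few samples. Data augmentation was developed to address this: it increases the amount of training data, helps prevent overfitting, and makes the model more robust.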

Two common color spaces

RGB color space
HSV color space

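1. RGB color space

This is the way of producing color we meet most often in everyday life. It works by additive mixing of three colors: R stands for red, G for green, and B for blue.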

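2. HSV color space

HSV is particularly useful in image processing for segmenting regions of a specified color.

H stands for hue, that is, which color it is.

S stands for saturation, that is, how deep or pale the color is.

V stands for value, that is, how bright the color is.

HSV is sometimes also called HSB (B for brightness).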
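
As a quick illustration of the three channels, the sketch below converts a few RGB colors to HSV with Python's standard-library `colorsys` module. The helper name `describe` is hypothetical, and `colorsys` works on floats in [0, 1], which differs from the 8-bit ranges OpenCV uses, so the numbers are for intuition only.

```python
import colorsys


def describe(r, g, b):
    """Convert an 8-bit RGB triple to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(h * 360), round(s, 2), round(v, 2)


pure_red = describe(255, 0, 0)      # fully saturated, fully bright
pale_red = describe(255, 128, 128)  # same hue, lower saturation
dark_red = describe(128, 0, 0)      # same hue, lower value
```

All three colors share the same hue; only S or V changes, which is why brightness and saturation can be adjusted independently in HSV.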

Common augmentation methods

Images are commonly represented in one of two formats: RGB (red, green, blue) and HSV (hue, saturation, value).

Image flip
Image scale
Image blur
Image brightness
Image hue
Image saturation
Image shift
Image crop

[^boxes]: torch.FloatTensor of shape n x 4, one row per box as (xmin, ymin, xmax, ymax)
[^labels]: torch.LongTensor holding n labels, one per box

1. Image flip

Augment the data by flipping the image horizontally.


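import random

import cv2
import numpy as np
import torch

def random_flip(im, boxes):
    # Flip the image horizontally with probability 0.5
    if random.random() < 0.5:
        # np.fliplr returns a view; copy it to get a contiguous array
        im_flipped = np.fliplr(im).copy()
        h, w, _ = im.shape
        # Mirror the x-coordinates: the old xmax becomes the new xmin
        xmin = w - boxes[:, 2]
        xmax = w - boxes[:, 0]
        boxes[:, 0] = xmin
        boxes[:, 2] = xmax
        im = im_flipped
    return im, boxes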
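
The box update in `random_flip` can be checked on a tiny example. The sketch below uses NumPy arrays in place of torch tensors (an assumption made only to keep it lightweight) and a hypothetical helper named `flip_boxes`; it assumes each row is (xmin, ymin, xmax, ymax).

```python
import numpy as np


def flip_boxes(boxes, width):
    """Mirror (xmin, ymin, xmax, ymax) boxes across the vertical axis."""
    flipped = boxes.copy()
    flipped[:, 0] = width - boxes[:, 2]  # new xmin comes from old xmax
    flipped[:, 2] = width - boxes[:, 0]  # new xmax comes from old xmin
    return flipped


im = np.arange(2 * 10 * 3).reshape(2, 10, 3)  # toy image, 2 rows x 10 columns
boxes = np.array([[1.0, 0.0, 4.0, 2.0]])
im_flipped = np.fliplr(im)
boxes_flipped = flip_boxes(boxes, width=im.shape[1])
```

Flipping twice should give back the original boxes, which is a handy sanity check for this kind of transform.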

2. Image scale


def random_scale(im, boxes):
    # With probability 0.5, stretch the image horizontally; the height is kept
    if random.random() < 0.5:
        scale = random.uniform(0.8, 2.3)
        height, width, c = im.shape
        im = cv2.resize(im, (int(width * scale), height))
        # Only the x-coordinates (xmin, xmax) of each box are scaled
        scale_tensor = torch.FloatTensor([scale, 1, scale, 1]).expand_as(boxes)
        boxes = boxes * scale_tensor
    return im, boxes
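
Because `random_scale` resizes only the width, only the x-coordinates of the boxes change. A minimal sketch of that arithmetic, using NumPy broadcasting instead of torch and a hypothetical helper named `scale_boxes_x`:

```python
import numpy as np


def scale_boxes_x(boxes, scale):
    """Scale only the x-coordinates of (xmin, ymin, xmax, ymax) boxes."""
    return boxes * np.array([scale, 1.0, scale, 1.0])


boxes = np.array([[10.0, 20.0, 50.0, 60.0]])
scaled = scale_boxes_x(boxes, 1.5)
new_width = int(100 * 1.5)  # a 100-pixel-wide image becomes 150 pixels wide
```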

3. Image blur

def random_blur(im):
    # With probability 0.5, smooth the image with a 5x5 mean (box) filter
    if random.random() < 0.5:
        im = cv2.blur(im, (5, 5))
    return im
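
`cv2.blur` replaces each pixel with the mean of its neighborhood. The sketch below reproduces that idea in plain NumPy for a single-channel image. It is a simplified illustration, not OpenCV's implementation: the hypothetical `box_blur_valid` only returns windows that lie fully inside the image, whereas `cv2.blur` also fills in the borders by extrapolating pixels.

```python
import numpy as np


def box_blur_valid(gray, k=5):
    """Mean of every k x k window that lies fully inside the image."""
    h, w = gray.shape
    out = np.zeros((h - k + 1, w - k + 1), dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide by the window area
    for dy in range(k):
        for dx in range(k):
            out += gray[dy:dy + h - k + 1, dx:dx + w - k + 1]
    return out / (k * k)


gray = np.zeros((7, 7))
gray[3, 3] = 25.0               # a single bright pixel
blurred = box_blur_valid(gray)  # its energy is spread over the neighborhood
```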

4. Image brightness


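def random_bright(im):
    # With probability 0.5, darken or brighten the image through the V channel
    if random.random() < 0.5:
        hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        adjust = random.choice([0.5, 1.5])
        v = v * adjust
        # Clip before casting back so bright pixels saturate at 255
        v = np.clip(v, 0, 255).astype(hsv.dtype)
        hsv = cv2.merge((h, s, v))
        im = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return im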
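
The core of `random_bright` is "scale, clip, cast back". The sketch below isolates that step on a small uint8 array so the effect of the clip is visible. The helper name `scale_channel` is hypothetical; the same pattern applies to the saturation channel.

```python
import numpy as np


def scale_channel(channel, factor, max_value=255):
    """Multiply a uint8 channel by factor, clip to [0, max_value], cast back."""
    scaled = channel.astype(np.float64) * factor
    return np.clip(scaled, 0, max_value).astype(channel.dtype)


v = np.array([[40, 100, 200]], dtype=np.uint8)
brighter = scale_channel(v, 1.5)  # 200 * 1.5 = 300 is clipped to 255
darker = scale_channel(v, 0.5)
```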

5. Image hue


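def random_hue(im):
    # With probability 0.5, change the colors by scaling the H channel
    if random.random() < 0.5:
        hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        adjust = random.choice([0.5, 1.5])
        h = h * adjust
        # For 8-bit images OpenCV stores hue in [0, 179], so clip to that range
        h = np.clip(h, 0, 179).astype(hsv.dtype)
        hsv = cv2.merge((h, s, v))
        im = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return im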
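
Hue is an angle on the color wheel, so one alternative to scaling and clipping is to rotate the hue and wrap around. The sketch below shows that variant; it is a different technique from `random_hue`, not the author's method, and `hue_range=180` assumes OpenCV's 8-bit HSV convention. The helper name `shift_hue` is hypothetical.

```python
import numpy as np


def shift_hue(h, delta, hue_range=180):
    """Rotate an 8-bit hue channel by delta, wrapping around the color wheel."""
    # Work in a wider integer type so the addition cannot overflow uint8
    shifted = (h.astype(np.int32) + delta) % hue_range
    return shifted.astype(h.dtype)


h = np.array([[0, 90, 170]], dtype=np.uint8)
rotated = shift_hue(h, 20)    # 170 + 20 wraps around to 10
reversed_ = shift_hue(h, -20) # 0 - 20 wraps around to 160
```

Wrapping keeps every hue valid and avoids piling many pixels up at the clip boundary.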

6. Image saturation


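def random_saturation(im):
    # With probability 0.5, make colors paler or more vivid through the S channel
    if random.random() < 0.5:
        hsv = cv2.cvtColor(im, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        adjust = random.choice([0.5, 1.5])
        s = s * adjust
        # Clip before casting back so saturated pixels stay at 255
        s = np.clip(s, 0, 255).astype(hsv.dtype)
        hsv = cv2.merge((h, s, v))
        im = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return im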

7. Image shift


def random_shift(im, boxes, labels):
    # Box centers decide which boxes survive the shift
    center = (boxes[:, :2] + boxes[:, 2:]) / 2
    if random.random() < 0.5:
        height, width, c = im.shape
        shifted_image = np.zeros((height, width, c), dtype=im.dtype)
        # Shift by up to 20% of the image size in each direction
        dx = int(random.uniform(-width * 0.2, width * 0.2))
        dy = int(random.uniform(-height * 0.2, height * 0.2))
        # Copy the part of the image that stays in frame; the rest remains zero
        if dx >= 0 and dy >= 0:
            shifted_image[dy:, dx:, :] = im[:height - dy, :width - dx, :]
        elif dx >= 0 and dy < 0:
            shifted_image[:height + dy, dx:, :] = im[-dy:, :width - dx, :]
        elif dx < 0 and dy >= 0:
            shifted_image[dy:, :width + dx, :] = im[:height - dy, -dx:, :]
        else:
            shifted_image[:height + dy, :width + dx, :] = im[-dy:, -dx:, :]

        # Keep only boxes whose shifted center is still inside the image
        shift_xy = torch.FloatTensor([[dx, dy]]).expand_as(center)
        center = center + shift_xy
        mask1 = (center[:, 0] > 0) & (center[:, 0] < width)
        mask2 = (center[:, 1] > 0) & (center[:, 1] < height)
        mask = (mask1 & mask2).view(-1, 1)
        boxes_in = boxes[mask.expand_as(boxes)].view(-1, 4)
        if len(boxes_in) == 0:
            # No box survives: fall back to the unshifted sample
            return im, boxes, labels
        box_shift = torch.FloatTensor([[dx, dy, dx, dy]]).expand_as(boxes_in)
        boxes_in = boxes_in + box_shift
        labels_in = labels[mask.view(-1)]
        return shifted_image, boxes_in, labels_in
    return im, boxes, labels
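
The four sign cases in `random_shift` can also be written with a single pair of slices per axis. The sketch below is one possible way to do that in NumPy; the helper name `shift_image` is hypothetical, and it covers only the image part, not the box filtering.

```python
import numpy as np


def shift_image(im, dx, dy):
    """Translate im by (dx, dy) pixels, filling uncovered pixels with zeros."""
    h, w = im.shape[:2]
    out = np.zeros_like(im)
    # Source window in the input and destination window in the output
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    out[dst_y, dst_x] = im[src_y, src_x]
    return out


im = np.arange(1, 10).reshape(3, 3)
right_down = shift_image(im, 1, 1)
left_up = shift_image(im, -1, -1)
```

Positive offsets move the content right and down, negative offsets move it left and up, matching the sign convention of the function above.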

8. Image crop


def random_crop(im, boxes, labels):
    # With probability 0.5, crop a random window covering 60-100% of each side
    if random.random() < 0.5:
        center = (boxes[:, 2:] + boxes[:, :2]) / 2
        height, width, c = im.shape
        h = random.uniform(0.6 * height, height)
        w = random.uniform(0.6 * width, width)
        x = random.uniform(0, width - w)
        y = random.uniform(0, height - h)
        x, y, h, w = int(x), int(y), int(h), int(w)

        # Keep only boxes whose center falls inside the crop window
        center = center - torch.FloatTensor([[x, y]]).expand_as(center)
        mask1 = (center[:, 0] > 0) & (center[:, 0] < w)
        mask2 = (center[:, 1] > 0) & (center[:, 1] < h)
        mask = (mask1 & mask2).view(-1, 1)

        boxes_in = boxes[mask.expand_as(boxes)].view(-1, 4)
        if len(boxes_in) == 0:
            # No box survives: fall back to the uncropped sample
            return im, boxes, labels
        box_shift = torch.FloatTensor([[x, y, x, y]]).expand_as(boxes_in)

        # Move boxes into the crop's coordinate frame and clip them to its bounds
        boxes_in = boxes_in - box_shift
        boxes_in[:, 0] = boxes_in[:, 0].clamp_(min=0, max=w)
        boxes_in[:, 2] = boxes_in[:, 2].clamp_(min=0, max=w)
        boxes_in[:, 1] = boxes_in[:, 1].clamp_(min=0, max=h)
        boxes_in[:, 3] = boxes_in[:, 3].clamp_(min=0, max=h)

        labels_in = labels[mask.view(-1)]
        img_cropped = im[y:y + h, x:x + w, :]
        return img_cropped, boxes_in, labels_in
    return im, boxes, labels
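
The box handling in `random_crop` has three steps: keep boxes whose center is inside the window, shift them into the window's coordinates, and clip them to its bounds. The sketch below walks through those steps on two boxes with a fixed window, using NumPy instead of torch. The helper name `crop_boxes` and the example numbers are illustrative assumptions.

```python
import numpy as np


def crop_boxes(boxes, labels, x, y, w, h):
    """Keep boxes whose center lies inside the crop; shift and clip them."""
    centers = (boxes[:, :2] + boxes[:, 2:]) / 2 - np.array([x, y])
    keep = ((centers[:, 0] > 0) & (centers[:, 0] < w)
            & (centers[:, 1] > 0) & (centers[:, 1] < h))
    kept = boxes[keep] - np.array([x, y, x, y])
    kept[:, 0::2] = np.clip(kept[:, 0::2], 0, w)  # xmin and xmax
    kept[:, 1::2] = np.clip(kept[:, 1::2], 0, h)  # ymin and ymax
    return kept, labels[keep]


boxes = np.array([[10.0, 10.0, 50.0, 50.0],     # center (30, 30): inside
                  [80.0, 80.0, 100.0, 100.0]])  # center (90, 90): outside
labels = np.array([3, 7])
kept, kept_labels = crop_boxes(boxes, labels, x=20, y=20, w=40, h=40)
```

The first box is kept but loses the part that falls outside the window; the second box and its label are dropped together, so boxes and labels stay aligned.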