
Face Detection Learning

2019-08-12



My deep learning environment is installed on Windows, so everything below was done on a Windows system.

Note: this follows Tang Yudi's video tutorial and is recorded only as my own study notes.

To train a model with Caffe, the first step is preparing the data.

Positive samples: for a face detection project, positive samples are face images. To produce them, crop the faces out of the source images (the dataset already annotates each face's coordinates). After cropping, double-check that the crops were produced correctly.

Negative samples: crop regions at random and use IoU against the annotated faces to decide whether a crop counts as positive or negative, e.g. treat IoU < 0.3 as negative; images containing no faces at all are the best source. A minimal sketch of the IoU computation follows.
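For reference, here is a small IoU sketch (my own illustration, not code from the course; boxes are (x1, y1, x2, y2) tuples):

def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# A random crop is kept as a negative sample only if it barely overlaps
# every annotated face, e.g. iou(crop, face) < 0.3 for all faces.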

1. Preparing the Caffe data source:

Caffe supports LMDB data; before training, the training and validation sets must be converted to LMDB.

First prepare two txt files, train.txt and val.txt, in the following format:

/path/to/folder/image_x.jpg 0 (the image path followed by its label; for binary classification the label is 0 or 1 — here 0 means face and 1 means non-face.)

These files can be generated with a script. A simple script (producing train.txt and val.txt) follows:

(The txt files should contain paths relative to the image root, i.e. each line of train.txt looks like: xxxx.jpg label.) — updated 2019-07-28

import os

full_train_path = r"C:\Users\Administrator\Desktop\FaceDetection\train.txt"
full_val_path = r"C:\Users\Administrator\Desktop\FaceDetection\val.txt"

train_root = r"C:\Users\Administrator\Desktop\FaceDetection\train\train"
val_root = r"C:\Users\Administrator\Desktop\FaceDetection\train\val"

# Get train.txt: training images live in one sub-folder per class, and the
# folder name is used as the label ("0" = face, "1" = non-face). Each line
# is a path relative to train_root followed by the label.
with open(full_train_path, 'w') as train_txt:
    for label_dir in os.listdir(train_root):
        for figure in os.listdir(os.path.join(train_root, label_dir)):
            train_txt.write(label_dir + "/" + figure + " " + label_dir + "\n")

# Get val.txt: validation images sit directly in val_root; files whose names
# contain "faceimage" are faces (label 0), everything else is non-face (label 1).
with open(full_val_path, 'w') as val_txt:
    for val_file in os.listdir(val_root):
        if "faceimage" in val_file:
            val_txt.write(val_file + " 0\n")
        else:
            val_txt.write(val_file + " 1\n")

2. Building the LMDB data source:

Classification problems use LMDB data; regression problems use HDF5 data. (A small HDF5 sketch follows for comparison.)
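As an aside, a minimal sketch of building an HDF5 source with h5py (my assumption for illustration; the dataset names 'data' and 'label' must match the HDF5Data layer in the prototxt, which reads its inputs from a list file):

import h5py
import numpy as np

# Hypothetical regression data: 10 images (3 x 227 x 227) and 4-value targets.
data = np.random.rand(10, 3, 227, 227).astype(np.float32)
label = np.random.rand(10, 4).astype(np.float32)

with h5py.File("train.h5", "w") as f:
    f.create_dataset("data", data=data)    # names must match the prototxt
    f.create_dataset("label", data=label)

# Caffe's HDF5Data layer takes a text file that lists the .h5 files:
with open("train_h5_list.txt", "w") as f:
    f.write("train.h5\n")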

Use Caffe's bundled convert_imageset tool to build the LMDB source.

convert_imageset usage:
convert_imageset --options (e.g. resize, shuffle) <image root> <list txt> <output lmdb path>
cd C:\Program Files\caffe-windows\scripts\build\tools\Release
convert_imageset.exe --resize_height=227 --resize_width=227 --shuffle C:\Users\Administrator\Desktop\FaceDetection\train\train\ C:\Users\Administrator\Desktop\FaceDetection\train\train\train.txt C:\Users\Administrator\Desktop\FaceDetection\train_lmdb
convert_imageset.exe --resize_height=227 --resize_width=227 --shuffle C:\Users\Administrator\Desktop\FaceDetection\train\val\ C:\Users\Administrator\Desktop\FaceDetection\train\val\val.txt C:\Users\Administrator\Desktop\FaceDetection\val_lmdb
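To sanity-check the result, a small sketch (assuming the lmdb Python package is installed) that counts the entries in the generated database; the count should match the number of lines in train.txt:

import lmdb

env = lmdb.open(r"C:\Users\Administrator\Desktop\FaceDetection\train_lmdb", readonly=True)
with env.begin() as txn:
    print("entries:", txn.stat()["entries"])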

3. Training the AlexNet network:

3.1 Configuring the Caffe files:

1. train.prototxt

Defines the AlexNet network architecture in Caffe's prototxt format.

2. solver.prototxt

① net: the path to the network definition file.

② test_iter: the number of batches run per test pass. Ideally test_iter * batch_size equals the total number of validation samples (e.g. 5,000 samples with batch_size 50 gives test_iter = 100).

③ base_lr: the base learning rate. Each layer's effective rate is base_lr * lr_mult (lr_mult is set per layer in train.prototxt). The learning rate must not be too large; if it is, training can diverge and the loss will oscillate or blow up.

Note: in the Windows build, use "/" in configuration file paths, e.g.: source: "C:/Users/Administrator/Desktop/FaceDetection/train_lmdb"
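For illustration, a sketch of what solver.prototxt might look like; the values are placeholders, not the exact ones used in this project:

net: "C:/Users/Administrator/Desktop/FaceDetection/train.prototxt"
test_iter: 100           # with batch_size 50, covers 5000 validation samples
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 10000
momentum: 0.9
weight_decay: 0.0005
display: 100
max_iter: 36000
snapshot: 4000
snapshot_prefix: "C:/Users/Administrator/Desktop/FaceDetection/model2/"
solver_mode: GPU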

3.2 Run the training commands to train the model and obtain the weights (_iter_36000.caffemodel):

cd C:\Program Files\caffe-windows\scripts\build\tools\Release
caffe.exe train --solver=C:\Users\Administrator\Desktop\FaceDetection\solver.prototxt

Training then runs, printing loss and accuracy to the console.

4. The face detection algorithm framework:

4.1 Sliding window:

Slide 227*227 windows across the input image. (So far only fixed-size inputs are supported: a CNN ending in fully connected layers has a fixed parameter count tied to the input size. The fully convolutional network described below accepts images of any size.)

To detect faces of different sizes, the image is rescaled across multiple scales.

FCN (fully convolutional network): a forward pass (forward_all()) produces a heatmap in which each point corresponds to a region of the original image, and its value is the probability that the region is a face.

Set a probability threshold α and keep a window when its probability exceeds α, e.g. α = 0.9. This can leave many overlapping boxes; NMS (non-maximum suppression) reduces them to a single final box.

4.2 Converting the fully connected AlexNet used in training into a fully convolutional (FCN) model:

You can follow the official Caffe net surgery example: https://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb

First, replace the fully connected layers (InnerProduct) in the original deploy.prototxt with convolutional layers (Convolution) and work out the required kernel sizes, as sketched below.
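For AlexNet, following the net surgery example, fc6 becomes a 6 x 6 convolution (pool5 outputs 6 x 6 x 256 for a 227 x 227 input) and fc7/fc8 become 1 x 1 convolutions. A sketch of the fc6 replacement in deploy_full_conv.prototxt (layer names match the code below):

layer {
  name: "fc6-conv"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6-conv"
  convolution_param {
    num_output: 4096
    kernel_size: 6    # pool5 is 6 x 6 x 256 for a 227 x 227 input
  }
}
# fc7-conv and fc8-conv are defined the same way with kernel_size: 1.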

Then convert the trained weights into the FCN model (full_conv.caffemodel) with the following code:

import caffe

# Load the original fully connected net with the trained weights.
net = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy.prototxt",
                r"C:\Users\Administrator\Desktop\FaceDetection\model2\_iter_36000.caffemodel",
                caffe.TEST)
params = ['fc6', 'fc7', 'fc8_flickr']

# fc_params[name] = (weights, biases) of each fully connected layer.
fc_params = {pr: (net.params[pr][0].data, net.params[pr][1].data) for pr in params}

for fc in params:
    print("{} weights are {} dimensional and biases are {} dimensional".format(
        fc, fc_params[fc][0].shape, fc_params[fc][1].shape))

# Load the fully convolutional definition with the same weights file; the
# renamed layers (fc6-conv, ...) start out uninitialized.
net_fully_conv = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy_full_conv.prototxt",
                           r"C:\Users\Administrator\Desktop\FaceDetection\model2\_iter_36000.caffemodel",
                           caffe.TEST)
params_fully_conv = ['fc6-conv', 'fc7-conv', 'fc8-conv']

conv_params = {pr: (net_fully_conv.params[pr][0].data, net_fully_conv.params[pr][1].data)
               for pr in params_fully_conv}
for conv in params_fully_conv:
    print("{} weights are {} dimensional and biases are {} dimensional".format(
        conv, conv_params[conv][0].shape, conv_params[conv][1].shape))

# Copy the fc weights into the conv layers: the parameters are identical and
# only their shape changes (flat copy); biases transfer unchanged.
for pr, pr_conv in zip(params, params_fully_conv):
    conv_params[pr_conv][0].flat = fc_params[pr][0].flat
    conv_params[pr_conv][1][...] = fc_params[pr][1]

net_fully_conv.save(r"C:\Users\Administrator\Desktop\FaceDetection\full_conv.caffemodel")

4.3 Using the trained model to implement face detection:

import os
import sys
import numpy as np
import math
import cv2
import random

caffe_root = r"C:\Program Files\caffe-windows"
sys.path.insert(0, os.path.join(caffe_root, 'python'))
os.environ['GLOG_minloglevel'] = '2'
import caffe

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Rect(object):
    def __init__(self, p1, p2):
        """Store the top, bottom, left, right values for points
        p1, p2 are the left-top and right-bottom points of the rectangle"""
        self.left = min(p1.x, p2.x)
        self.right = max(p1.x, p2.x)
        self.bottom = min(p1.y, p2.y)
        self.top = max(p1.y, p2.y)

    def __str__(self):
        return "Rect[%d, %d, %d, %d]" %(self.left, self.top, self.right, self.bottom)

def calcDistance(x1, y1, x2, y2):
    dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
    return dist

def range_overlap(a_min, a_max, b_min, b_max):
    """Judge whether there is intersection on one dimension"""
    return (a_min <= b_max) and (a_max >= b_min)

def rect_overlaps(r1, r2):
    """Judge whether the two rectangles have intersection"""
    return range_overlap(r1.left, r1.right, r2.left, r2.right) and range_overlap(r1.bottom, r1.top, r2.bottom, r2.top)

def rect_merge(r1, r2, mergeThresh):
    """Calculate the merge area of two rectangles"""
    if rect_overlaps(r1, r2):
        SI = abs(min(r1.right, r2.right) - max(r1.left, r2.left)) * abs(min(r1.top, r2.top) - max(r1.bottom, r2.bottom))
        SA = abs(r1.right - r1.left) * abs(r1.top - r1.bottom)
        SB = abs(r2.right - r2.left) * abs(r2.top - r2.bottom)
        S = SA + SB - SI

        ratio = float(SI) / float(S)

        if ratio > mergeThresh:
            return 1
    return 0

def generateBoundingBox(featureMap, scale):
    boundingBox = []
    """We can calculate the stride from the architecture of the alexnet"""
    stride = 32
    """We need to get the boundingbox whose size is 227 * 227. When we trained the alexnet,
    we also resize the size of the input image to 227 * 227 in caffe"""
    cellSize = 227

    for (x, y), prob in np.ndenumerate(featureMap):
        if(prob >= 0.50):
            """Get the bounding box: we record the left-bottom and right-top coordinates"""
            boundingBox.append([float(stride * y) / scale, float(stride * x) / scale, float(stride * y + cellSize - 1) / scale,
                               float(stride * x + cellSize - 1) / scale, prob])
    return boundingBox

def nms_average(boxes, groupThresh = 2, overlapThresh=0.2):
    rects = []

    for i in range(len(boxes)):
        if boxes[i][4] > 0.2:
            """The box in here, we record the left-bottom coordinates(y, x) and the height and width"""
            rects.append([boxes[i, 0], boxes[i, 1], boxes[i, 2] - boxes[i, 0], boxes[i, 3] - boxes[i, 1]])

    rects, weights = cv2.groupRectangles(rects, groupThresh, overlapThresh)

    rectangles = []
    for i in range(len(rects)):
        testRect = Rect(Point(rects[i, 0], rects[i, 1]), Point(rects[i, 0] + rects[i, 2], rects[i, 1] + rects[i, 3]))
        rectangles.append(testRect)
    clusters = []
    for rect in rectangles:
        matched = 0
        for cluster in clusters:
            if (rect_merge(rect, cluster, 0.2)):
                matched = 1
                cluster.left = (cluster.left + rect.left) / 2
                cluster.right = (cluster.right + rect.right) / 2
                cluster.bottom = (cluster.bottom + rect.bottom) / 2
                cluster.top = (cluster.top + rect.top) / 2
        if (not matched):
            clusters.append(rect)

    result_boxes = []
    for i in range(len(clusters)):
        result_boxes.append([clusters[i].left, clusters[i].bottom, clusters[i].right, clusters[i].top, 1])

    return result_boxes

def face_detection(imgFile):
    net_fully_conv = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy_full_conv.prototxt",
                               r"C:\Users\Administrator\Desktop\FaceDetection\full_conv.caffemodel",
                               caffe.TEST)

    scales = []
    factor = 0.793700526  # 2 ** (-1 / 3): three scale steps halve the image

    img = cv2.imread(imgFile)
    print(img.shape)

    largest = min(2, 4000.0 / max(img.shape[0:2]))  # cap up-scaling at 2x and the longest side near 4000 px
    scale = largest
    minD = largest * min(img.shape[0:2])
    while minD >= 227:
        scales.append(scale)
        scale *= factor
        minD *= factor
    total_boxes = []

    for scale in scales:
        scale_img = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))  # cv2.resize takes (width, height)
        cv2.imwrite(r"C:\Users\Administrator\Desktop\FaceDetection\scale_img.jpg", scale_img)
        im = caffe.io.load_image(r"C:\Users\Administrator\Desktop\FaceDetection\scale_img.jpg")

        """Change the test input data size of the scaled image size """
        net_fully_conv.blobs['data'].reshape(1, 3, scale_img.shape[1], scale_img.shape[0])
        transformer = caffe.io.Transformer({'data': net_fully_conv.blobs['data'].data.shape})
        transformer.set_transpose('data', (2, 0, 1))
        transformer.set_channel_swap('data', (2, 1, 0))
        transformer.set_raw_scale('data', 255.0)

        out = net_fully_conv.forward_all(data=np.asarray([transformer.preprocess('data', im)]))
        print(out['prob'][0, 1].shape)

        boxes = generateBoundingBox(out['prob'][0, 1], scale)

        if (boxes):
            total_boxes.extend(boxes)
    print(total_boxes)
    boxes_nms = np.array(total_boxes)
    true_boxes = nms_average(boxes_nms, 1, 0.2)

    if true_boxes:
        (x1, y1, x2, y2) = true_boxes[0][:-1]
        cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0))
        win = cv2.namedWindow('face detection', flags=0)
        cv2.imshow('face detection', img)
        cv2.waitKey(0)

if __name__ == "__main__":
    img = r"C:\Users\Administrator\Desktop\FaceDetection\tmp9055.jpg"
    face_detection(img)

Because my computer is so low-end, training took a very long time; the machine ran for several days and still completed relatively few iterations, so the model is not trained very well. In this project, after some tuning, setting the prob threshold to >= 0.5 when generating bounding boxes gave the better results.

I have also written training code with TensorFlow, but on this hardware training was too slow and the accuracy too poor; something to revisit as I continue learning.

 

Note: I am currently studying AI. This project follows the video course plus my own hands-on work to implement face detection, and is recorded only for my own study.

 

Source: https://www.cnblogs.com/xjlearningAI/p/11067200.html
