[ Dlib ] Face Capture with Dlib

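To get higher-quality face data, I looked into face alignment. Compared with approaches such as Haar cascades, Dlib seems to detect faces more reliably, and face alignment can then straighten the detected faces to give cleaner data.
Installing Dlib is a little involved; Adrian Rosebrock's "How to install dlib" is a good guide.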

from imutils import face_utils
import numpy as np
import imutils
import dlib
import cv2
 
 
face_cascade = cv2.CascadeClassifier('./cascades/haarcascade_frontalface_default.xml')
 
def detect_haar(filename):
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray,
                                          scaleFactor=1.1,
                                          minNeighbors=3,)
    for (x,y,w,h) in faces:
        img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),3)
    cv2.imwrite('./haar.png', img)
 
 
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
 
def detect_face_landmarks(filename):
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)
 
    for face in faces:
        shape = predictor(gray, face)
        shape = face_utils.shape_to_np(shape)
 
        (x, y, w, h) = face_utils.rect_to_bb(face)
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 3)
 
        for (x, y) in shape:
            # Arguments: image, center, radius, color, thickness
            # (positive = line thickness, negative = filled circle)
            cv2.circle(img, (int(x), int(y)), 5, (0, 0, 255), -1)
 
    cv2.imwrite('./face_landmarks.png', img)
 
 
filename = "twice.jpg"
detect_haar(filename)
detect_face_landmarks(filename)

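The shape_predictor_68_face_landmarks.dat file used in the code can be downloaded here.
detector = dlib.get_frontal_face_detector() runs on the grayscale image and returns one rectangle per detected face, which tells us how many faces the image contains. The second argument, 1, upsamples the image once so that smaller faces can be found.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") then predicts the facial landmarks inside each detected face.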
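To make the 68 landmarks more concrete, here is a minimal sketch of how the points are usually grouped by facial region. The index ranges follow the common 68-point layout (as far as I know, the same ranges imutils ships in face_utils). The helper name group_landmarks is hypothetical, and the input is a dummy list standing in for the array returned by face_utils.shape_to_np.

```python
from collections import OrderedDict

# Index ranges (start inclusive, end exclusive) in the 68-point layout.
LANDMARK_REGIONS = OrderedDict([
    ("jaw", (0, 17)),
    ("right_eyebrow", (17, 22)),
    ("left_eyebrow", (22, 27)),
    ("nose", (27, 36)),
    ("right_eye", (36, 42)),
    ("left_eye", (42, 48)),
    ("mouth", (48, 68)),
])


def group_landmarks(points):
    """Split a sequence of 68 (x, y) points into named facial regions."""
    if len(points) != 68:
        raise ValueError("expected 68 landmarks, got {0}".format(len(points)))
    return {name: list(points[start:end])
            for name, (start, end) in LANDMARK_REGIONS.items()}


# Dummy points standing in for face_utils.shape_to_np(shape).
dummy_points = [(i, i) for i in range(68)]
regions = group_landmarks(dummy_points)
print({name: len(pts) for name, pts in regions.items()})
```

The two eye regions are the ones that matter for alignment, since the eye centers are computed from them.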

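With Haar, even under very loose settings, only one face was detected, and a logo in the lower-left corner was mistaken for a face.
In comparison, Dlib correctly found both faces in the image, and the facial landmarks were also located well.
Once the landmarks are reliable, FaceAligner can be used to align the faces.
FaceAligner uses the eye landmarks to rotate the image so that the two eyes lie on a horizontal line.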
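The idea of leveling the eyes can be sketched in a few lines. This is a simplified illustration, not FaceAligner's actual code: it averages the landmark points of each eye to get a center, then measures how far the line between the two centers is tilted from the horizontal. The function names and the sample coordinates are made up for the example.

```python
import math


def eye_center(points):
    """Mean (x, y) of the landmark points belonging to one eye."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def roll_angle_degrees(eye_a, eye_b):
    """Angle of the line from eye_a to eye_b relative to the horizontal.

    eye_a is the eye that appears on the left side of the image. Image y
    grows downward, so a positive angle means eye_b sits lower than eye_a.
    """
    dx = eye_b[0] - eye_a[0]
    dy = eye_b[1] - eye_a[1]
    return math.degrees(math.atan2(dy, dx))


# Hypothetical eye landmarks (six points per eye in the 68-point layout).
image_left_eye = [(100, 100), (110, 95), (120, 95),
                  (130, 100), (120, 105), (110, 105)]
image_right_eye = [(200, 120), (210, 115), (220, 115),
                   (230, 120), (220, 125), (210, 125)]

a = eye_center(image_left_eye)
b = eye_center(image_right_eye)
angle = roll_angle_degrees(a, b)
print("eye line is tilted by {0:.1f} degrees".format(angle))
```

Rotating the image by this angle around the midpoint between the eyes brings them level; with OpenCV that would typically go through cv2.getRotationMatrix2D and cv2.warpAffine.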

from imutils.face_utils import FaceAligner
from imutils.face_utils import rect_to_bb
import numpy as np
import imutils
import dlib
import cv2
 
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
fa = FaceAligner(predictor, desiredFaceWidth=200)
 
face_filename = 1
def detect_face_landmarks(filename):
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)
 
    for face in faces:
        (x, y, w, h) = rect_to_bb(face)
        # Clamp to the image: dlib boxes can extend past the border
        x0, y0 = max(x, 0), max(y, 0)
        faceOrig = imutils.resize(img[y0 : y+h, x0 : x+w], width=200)
        faceAligned = fa.align(img, gray, face)
        global face_filename
        cv2.imwrite('./faceOrig_{0}.png'.format(face_filename), faceOrig)
        cv2.imwrite('./faceAligned_{0}.png'.format(face_filename), faceAligned)
        face_filename += 1
 
 
filename = "twice.jpg"
detect_face_landmarks(filename)

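FaceAligner takes the predictor so that it can locate the facial landmarks, which is what lets it align each face.
The aligned result can be written out directly with OpenCV.
The arguments to fa.align are, in order: the image to crop the face from, the grayscale image used for landmark detection, and the rectangle of the face to align.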
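Besides rotating, the aligner also has to resize the face so that it fills the requested output width. The sketch below shows one way to derive that scale from the eye distance. It follows my reading of how FaceAligner works and is not copied from it: the eye position of 0.35 (as a fraction of the output width) is an assumed default, and alignment_scale is a hypothetical name.

```python
import math


def alignment_scale(eye_a, eye_b, desired_face_width=200,
                    desired_left_eye_x=0.35):
    """Scale factor that maps the measured eye distance to the target one.

    desired_left_eye_x is the horizontal eye position as a fraction of the
    output width (0.35 is an assumed default). The other eye is placed
    symmetrically at 1 - desired_left_eye_x.
    """
    measured = math.hypot(eye_b[0] - eye_a[0], eye_b[1] - eye_a[1])
    if measured == 0:
        raise ValueError("eye centers coincide")
    desired = (1.0 - 2.0 * desired_left_eye_x) * desired_face_width
    return desired / measured


# Same eye centers, two different output sizes.
scale_200 = alignment_scale((115, 100), (215, 120), desired_face_width=200)
scale_64 = alignment_scale((115, 100), (215, 120), desired_face_width=64)
print(scale_200, scale_64)
```

A smaller desiredFaceWidth, such as 64, gives a proportionally smaller scale, so the same face is shrunk further.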

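Before alignment, the faces are captured but may sit at various angles and are not always shown in full, which can make training harder.

After alignment, the faces are straightened and the whole facial outline is visible, which may help training find features more effectively.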

With a few small changes, the same code can capture face images from a video.

from imutils.face_utils import FaceAligner
from imutils.video import count_frames
import os
import dlib
import cv2

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
fa = FaceAligner(predictor, desiredFaceWidth=64)

output_dir = './face'
# cv2.imwrite returns False without raising if the folder is missing
os.makedirs(output_dir, exist_ok=True)

face_filename = 1
def detect(img, idx, total):
    global face_filename
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)

    for face in faces:
        # Passing gray as the source image saves the aligned face in grayscale
        faceAligned = fa.align(gray, gray, face)
        cv2.imwrite('{0}/{1}.png'.format(output_dir, face_filename), faceAligned)
        face_filename += 1
    print('Processed frame {0}. Completed {1:.2f} %'.format(idx, idx/float(total)*100))


detect_video = 'tzuyu.mp4'
videoCapture = cv2.VideoCapture(detect_video)
success, frame = videoCapture.read()
frame_counter = 1
frame_total = count_frames(detect_video)
while success:
    detect(frame, frame_counter, frame_total)
    success, frame = videoCapture.read()
    frame_counter += 1
videoCapture.release()

print("Done!!")

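Although Dlib makes face capture much more accurate, quite a few bad results still have to be removed by hand: false detections, misaligned faces, and frames caught in transition effects.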

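I did not mention this earlier, but capturing from video tends to produce many near-identical images, and training on them may skew the weights.
See "Filtering Overly Similar Images with PhotoHash", which uses the Hamming distance to filter out images that are too similar.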
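As a rough illustration of the Hamming-distance idea, the sketch below uses a plain average hash written from scratch. It is a stand-in and not PhotoHash's implementation: each image is assumed to be already reduced to a tiny grayscale grid, and all function names and the max_distance threshold are hypothetical.

```python
def average_hash(gray_pixels):
    """Bit list: 1 where a pixel is brighter than the image mean, else 0.

    gray_pixels is a small 2D list of grayscale values (for example an
    8x8 thumbnail); shrinking a real image to that size is left out here.
    """
    flat = [v for row in gray_pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def hamming_distance(hash_a, hash_b):
    """Number of positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def drop_near_duplicates(hashes, max_distance=2):
    """Indices kept after skipping hashes too close to an earlier kept one."""
    kept = []
    for i, h in enumerate(hashes):
        if all(hamming_distance(h, hashes[j]) > max_distance for j in kept):
            kept.append(i)
    return kept


# Tiny 2x2 "thumbnails": frame_b is almost identical to frame_a.
frame_a = [[10, 200], [220, 30]]
frame_b = [[12, 198], [215, 35]]
frame_c = [[200, 10], [30, 220]]

hashes = [average_hash(f) for f in (frame_a, frame_b, frame_c)]
kept = drop_near_duplicates(hashes, max_distance=1)
print(kept)
```

Consecutive video frames usually hash to nearly the same bits, so only the first of each run survives. The threshold needs tuning on real data.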

This post draws on Adrian Rosebrock's "Facial landmarks with dlib, OpenCV, and Python" and "Face Alignment with OpenCV and Python". Both go into more detail, so if you are interested, visit Adrian Rosebrock's website.
