[ Dlib ] Training a Shape Predictor with Dlib

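These are just my notes. I have not used this in a real project yet; I am writing it down so I don't forget, and hopefully it helps someone else too.
There are probably mistakes in here, so corrections are welcome.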

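The Dlib source distribution includes a python_examples folder; if yours does not have it, you can download it from Dlib's GitHub repository.
Before starting, make sure Dlib and scikit-image are installed. scikit-image can be installed with pip. imglab, the annotation tool used later in this post, takes a bit more work; see the imglab section of the official documentation.
python_examples contains a script called train_shape_predictor.py. Running it is enough, but you have to pass it the folder holding the training data. Here I use the practice data that ships with Dlib, located in the faces folder under the examples folder one level up. The command looks like this: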

python3 train_shape_predictor.py ../examples/faces/
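Before running the command above, it can save a failed run to confirm that both Python packages are importable. This is only a minimal sketch: the helper name missing_modules is my own, and the one real detail to remember is that scikit-image is imported as skimage.

```python
import importlib.util

# Import names, not pip package names: scikit-image is imported as "skimage".
REQUIRED_MODULES = ("dlib", "skimage")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("dlib and scikit-image are importable.")
```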

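Besides the photos, the faces folder contains training_with_face_landmarks.xml and testing_with_face_landmarks.xml. These two files hold the annotations that training depends on.
When running train_shape_predictor.py you only need to point it at this folder. Once training finishes, it shows a few test photos with the landmarks found by the newly trained model. You also end up with predictor.dat and image_metadata_stylesheet.xsl. predictor.dat is the model used to locate the landmarks. I was not sure what the other one is for :P It looks like the XSL stylesheet that the dataset XML files reference so a browser can render them, and it is not part of the model.
Before the images appear, the script prints "Training accuracy" and "Testing accuracy". Despite the labels, the script's own comments describe these values as the average distance between the predicted and the true landmark positions, so lower is better.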
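To make the meaning of those two numbers concrete, here is a rough sketch of an "average landmark distance" in plain Python. It is not Dlib's implementation: the function name mean_landmark_error and the scale parameter are my own assumptions, and Dlib may normalise the distances differently.

```python
import math


def mean_landmark_error(predicted, truth, scale=1.0):
    """Average Euclidean distance between matching landmarks, divided by `scale`.

    `predicted` and `truth` are equal-length lists of (x, y) tuples.
    `scale` is a hypothetical normalisation factor (for example a face size);
    1.0 leaves the result in pixels.
    """
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty landmark lists of equal length")
    total = sum(math.hypot(p[0] - t[0], p[1] - t[1]) for p, t in zip(predicted, truth))
    return total / len(truth) / scale


truth = [(10, 10), (20, 10), (15, 20)]
perfect = list(truth)
shifted = [(x + 3, y + 4) for x, y in truth]  # every point is off by 5 pixels

print(mean_landmark_error(perfect, truth))  # 0.0
print(mean_landmark_error(shifted, truth))  # 5.0
```

A perfect prediction scores 0, which is why a smaller "accuracy" value from the script means a better model.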

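The image on the left is one of the training photos and the one on the right is a test photo. The landmark fit is still rough, but it is recognizably in the right place.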

Next comes the most tedious part: preparing your own training data.
Dlib provides a tool called imglab to help create it. Building it is much like building Dlib itself.

cd dlib/tools/imglab
mkdir build
cd build
cmake ..
cmake --build . --config Release

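After a successful build, the imglab executable is in dlib/tools/imglab/build.
If you would rather not work inside that folder, copy imglab to whichever folder you want to work in.
Suppose all the images to train on are in a folder called tzuyu. The following command creates mydataset.xml, scans the tzuyu folder for images, and stores their paths in the XML file.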

./imglab -c mydataset.xml ./tzuyu

Then open the editor with the following command.

./imglab mydataset.xml

The left side of the window lists the images in the dataset. (I only added one, so there is only one entry.)

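Hold Shift and drag with the left mouse button to draw a box around the region you want to annotate. If you make a mistake, double-click the red box with the left button; once it turns cyan you can press Delete to remove it.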

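Click Save in the File menu to write the annotations to mydataset.xml.
Boxes alone are probably not enough for training, though; the landmarks inside each box need to be annotated too. Continuing with the same mydataset.xml, suppose each image gets only five landmarks.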

./imglab mydataset.xml --parts "1 2 3 4 5"

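This command brings back the same window. Double-click a red box with the left mouse button; once it turns cyan, you can annotate inside it. Right-click the spot you want to mark and choose the label for it.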

After saving, mydataset.xml contains the landmarks you just annotated and can be used for training.
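Since a forgotten landmark is easy to miss in the editor, it may help to check the saved file before training. The sketch below assumes the usual imglab layout (a dataset root, an images element, image elements with a file attribute, and box elements containing part elements with a name attribute). The function name summarize_annotations and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed imglab layout; not real annotation data.
SAMPLE_XML = """
<dataset>
  <name>sample</name>
  <images>
    <image file='tzuyu/001.jpg'>
      <box top='40' left='60' width='120' height='120'>
        <part name='1' x='90' y='80'/>
        <part name='2' x='140' y='80'/>
        <part name='3' x='115' y='105'/>
        <part name='4' x='95' y='135'/>
        <part name='5' x='135' y='135'/>
      </box>
    </image>
    <image file='tzuyu/002.jpg'>
      <box top='30' left='50' width='100' height='100'>
        <part name='1' x='70' y='60'/>
        <part name='2' x='120' y='60'/>
      </box>
    </image>
  </images>
</dataset>
"""


def summarize_annotations(root, expected_parts):
    """Return (image_count, box_count, problems) for a parsed dataset root."""
    expected = set(expected_parts)
    images = root.findall("./images/image")
    box_count = 0
    problems = []
    for image in images:
        boxes = image.findall("box")
        if not boxes:
            problems.append("{}: no box".format(image.get("file")))
        for box in boxes:
            box_count += 1
            names = {part.get("name") for part in box.findall("part")}
            missing = sorted(expected - names)
            if missing:
                problems.append("{}: missing parts {}".format(image.get("file"), missing))
    return len(images), box_count, problems


# For a real file: root = ET.parse("mydataset.xml").getroot()
result = summarize_annotations(ET.fromstring(SAMPLE_XML), ["1", "2", "3", "4", "5"])
print(result)
```

In the sample, the second image is reported because parts 3, 4 and 5 were never placed.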

Finally, note that training expects both training data and testing data, so you need to prepare two XML files.
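Instead of annotating two files separately, one option is to annotate everything in a single file and split it afterwards. The sketch below does that with the standard library, under the same assumed imglab layout (an images element holding image elements). The function name split_dataset and the 20% test fraction are my own choices. ElementTree drops the stylesheet line at the top of the file when writing; that should only affect how the XML renders in a browser, but I have not verified it against Dlib's loader.

```python
import copy
import random
import xml.etree.ElementTree as ET


def split_dataset(root, test_fraction=0.2, seed=0):
    """Return (train_root, test_root): copies of `root` holding disjoint images."""
    if root.find("images") is None:
        raise ValueError("expected an <images> element under the root")
    total = len(root.findall("./images/image"))
    order = list(range(total))
    random.Random(seed).shuffle(order)  # fixed seed keeps the split repeatable
    test_count = max(1, round(total * test_fraction)) if total > 1 else 0
    test_ids = set(order[:test_count])

    def subset(keep_test):
        new_root = copy.deepcopy(root)
        container = new_root.find("images")
        for index, image in enumerate(container.findall("image")):
            if (index in test_ids) != keep_test:
                container.remove(image)
        return new_root

    return subset(False), subset(True)


# Tiny in-memory dataset standing in for mydataset.xml.
root = ET.Element("dataset")
images = ET.SubElement(root, "images")
for number in range(1, 6):
    ET.SubElement(images, "image", file="tzuyu/{:03d}.jpg".format(number))

train_root, test_root = split_dataset(root, test_fraction=0.2)
print(len(train_root.findall("./images/image")), "training images")
print(len(test_root.findall("./images/image")), "testing images")

# To write the two files:
# ET.ElementTree(train_root).write("training_with_face_landmarks.xml")
# ET.ElementTree(test_root).write("testing_with_face_landmarks.xml")
```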
Also, train_shape_predictor.py reads files named training_with_face_landmarks.xml and testing_with_face_landmarks.xml (lines 84 and 97 in my copy), so either name your files accordingly or edit those lines. The relevant excerpt:

# dlib.train_shape_predictor() does the actual training.  It will save the
# final predictor to predictor.dat.  The input is an XML file that lists the
# images in the training dataset and also contains the positions of the face
# parts.
training_xml_path = os.path.join(faces_folder, "training_with_face_landmarks.xml")
dlib.train_shape_predictor(training_xml_path, "predictor.dat", options)

# Now that we have a model we can test it.  dlib.test_shape_predictor()
# measures the average distance between a face landmark output by the
# shape_predictor and where it should be according to the truth data.
print("\nTraining accuracy: {}".format(
    dlib.test_shape_predictor(training_xml_path, "predictor.dat")))
# The real test is to see how well it does on data it wasn't trained on.  We
# trained it on a very small dataset so the accuracy is not extremely high, but
# it's still doing quite good.  Moreover, if you train it on one of the large
# face landmarking datasets you will obtain state-of-the-art results, as shown
# in the Kazemi paper.
testing_xml_path = os.path.join(faces_folder, "testing_with_face_landmarks.xml")
print("Testing accuracy: {}".format(
    dlib.test_shape_predictor(testing_xml_path, "predictor.dat")))

# Now let's use it as you would in a normal application.  First we will load it
# from disk. We also need to load a face detector to provide the initial
# estimate of the facial location.
predictor = dlib.shape_predictor("predictor.dat")
detector = dlib.get_frontal_face_detector()
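If you would rather keep your own file names than rename them, the same calls can be wrapped in a small script of your own. This is a sketch, not the official example: the file names mydataset_train.xml and mydataset_test.xml and the helper names are hypothetical, and the only training option set here is be_verbose, with everything else left at Dlib's defaults. It also assumes a Dlib version that provides dlib.load_rgb_image.

```python
import os


def dataset_paths(folder, train_name="mydataset_train.xml", test_name="mydataset_test.xml"):
    """Build the training and testing XML paths inside `folder`."""
    return os.path.join(folder, train_name), os.path.join(folder, test_name)


def train_and_evaluate(folder, model_path="predictor.dat"):
    """Train on the training XML, then return (training_error, testing_error)."""
    import dlib  # imported here so the path helper works without dlib installed

    training_xml, testing_xml = dataset_paths(folder)
    options = dlib.shape_predictor_training_options()
    options.be_verbose = True  # print progress while training
    dlib.train_shape_predictor(training_xml, model_path, options)
    return (
        dlib.test_shape_predictor(training_xml, model_path),
        dlib.test_shape_predictor(testing_xml, model_path),
    )


def landmarks(image_path, model_path="predictor.dat"):
    """Return one list of (x, y) landmark tuples per detected face."""
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(model_path)
    image = dlib.load_rgb_image(image_path)
    faces = detector(image, 1)  # 1 = upsample the image once before detecting
    return [[(p.x, p.y) for p in predictor(image, face).parts()] for face in faces]


print(dataset_paths("tzuyu"))
```

Calling train_and_evaluate("tzuyu") would then train on the two files in that folder, and landmarks("some_photo.jpg") would apply the resulting predictor.dat to a new image.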

This post draws on the Dlib documentation, TadaoYamaokaの日記, and dlibによる顔器官検出, which Takuya Minagawa published on LinkedIn.
