Object detection is a Computer Vision task that consists of automatically locating objects of interest in an image and telling them apart from the background. One of the models that can be used for this task is the Haar Cascade Classifier (HCC).
An HCC needs a dataset composed of positive images (which contain at least one instance of the object to be detected) and negative images (which contain none). In addition, positive samples must be extracted from the positive images by giving the coordinates of the rectangle that contains each sample.
The training tools need to know exactly where the positive and negative images are. To build the list of paths to the negative images and save it to a text file («neg.txt»), a simple Python script is enough:
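import os


def generate_negative_description_file(myObj):
    # Folder with the negative images: data/<object>/Negative
    negative_dir = os.path.join("data", myObj, "Negative")
    # Write one image path per line, relative to the folder that holds neg.txt
    with open(os.path.join("data", myObj, "neg.txt"), "w") as f:
        for filename in os.listdir(negative_dir):
            # Same capitalisation as the folder, so it also works on case-sensitive systems
            f.write("Negative/" + filename + "\n")


generate_negative_description_file("Man")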
For the positive images, on the other hand, the paths alone are not enough: each line must also give the number of positive samples in that image and the coordinates of each one within it. We can easily build this file («pos.txt») using an OpenCV tool called «opencv_annotation.exe». This application must be run from the command line with the following command:
opencv_annotation.exe --annotations=pos.txt --images=Positive/
This will open a window for each image (one at a time) and let us select the areas where the samples are located. Once this text file is ready, we need to create the vector file with the samples («pos.vec»). We will do so with another OpenCV command line tool called «opencv_createsamples.exe», which we run with this command:
opencv_createsamples.exe -info pos.txt -w 24 -h 24 -num 1000 -vec pos.vec
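One way to choose a sensible value for -num is to count how many rectangles «pos.txt» actually describes. The sketch below is only an illustration: it assumes each line has the layout "path count x y w h x y w h ...", that paths contain no spaces, and the helper names (parse_annotation_line, count_samples) and example paths are hypothetical.

```python
from typing import Iterable, List, Tuple

Box = Tuple[int, ...]  # (x, y, width, height)


def parse_annotation_line(line: str) -> Tuple[str, List[Box]]:
    """Parse one line with the assumed layout: <path> <count> <x y w h> ..."""
    parts = line.split()  # assumption: the image path contains no spaces
    path, count = parts[0], int(parts[1])
    values = [int(v) for v in parts[2:]]
    if len(values) != 4 * count:
        raise ValueError(
            f"expected {4 * count} coordinates, got {len(values)}: {line!r}"
        )
    # Group the flat list of numbers into (x, y, w, h) rectangles
    boxes = [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]
    return path, boxes


def count_samples(lines: Iterable[str]) -> int:
    """Total number of annotated rectangles over all non-empty lines."""
    return sum(len(parse_annotation_line(l)[1]) for l in lines if l.strip())


# Hypothetical annotation lines, used only to illustrate the format
example = [
    "Positive/img1.jpg 1 140 100 45 45",
    "Positive/img2.jpg 2 100 200 50 50 50 30 25 25",
]
print(count_samples(example))  # 3
```

To check a real file, the same function can be fed the lines of the file, for example count_samples(open("pos.txt")).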
Now we have all the elements needed to train the HCC model. To sum up, this is the workflow needed to have all the elements ready for the training:
As you can see in the image above, it is now time to use another OpenCV command line tool: «opencv_traincascade.exe». For this example I used the following command, but the parameters can be changed depending on the size of our dataset:
opencv_traincascade.exe -data Cascade/ -vec pos.vec -bg neg.txt -w 24 -h 24 -precalcValBufSize 6000 -precalcIdxBufSize 6000 -numPos 150 -numNeg 1000 -numStages 12 -maxFalseAlarmRate 0.3 -minHitRate 0.999
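To get a feeling for what -numStages, -minHitRate and -maxFalseAlarmRate imply together, the per-stage targets can be compounded over all stages. This is a rough estimate that treats the stages as independent, not a guarantee about the trained model; the function name overall_rates is just an illustrative helper.

```python
def overall_rates(num_stages: int, min_hit_rate: float, max_false_alarm_rate: float):
    """Estimate whole-cascade rates by compounding the per-stage targets."""
    # A detection must pass every stage, so the rates multiply stage by stage
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit, overall_false_alarm


# Values taken from the training command above
hit, false_alarm = overall_rates(12, 0.999, 0.3)
print(f"estimated overall hit rate: {hit:.4f}")                   # 0.9881
print(f"estimated overall false alarm rate: {false_alarm:.2e}")   # 5.31e-07
```

With these settings, adding stages lowers the false alarm estimate quickly while the hit rate estimate drops only slightly, which is why -minHitRate is usually kept very close to 1.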
Training produces an XML file, «cascade.xml», written to the folder passed with -data. This file is the trained model itself, and it can be loaded from Python with OpenCV:
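import cv2

# Loading model
cascade = cv2.CascadeClassifier('cascade.xml')

# Reading the image to analyse ('input.jpg' is a placeholder; a video frame works too)
frame = cv2.imread('input.jpg')

# Getting rectangles from HCC
rectangles = cascade.detectMultiScale(frame, minSize=(50, 50))

# Drawing rectangles and labels (font, colour and thickness are arbitrary choices)
for (x, y, w, h) in rectangles:
    cv2.putText(img=frame, text="PEPE", org=(x, y - 7),
                fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5,
                color=(0, 255, 0), thickness=2)
    cv2.rectangle(img=frame, pt1=(x, y), pt2=(x + w, y + h),
                  color=(0, 255, 0), thickness=2)

# Showing output image until a key is pressed
cv2.imshow("Capturing", frame)
cv2.waitKey(0)
cv2.destroyAllWindows()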