2D Face detection on Raspberry Pi 5 with Hailo

TL;DR: I wrote a helper class that parses the output of the SCRFD model, so you can detect faces without having to understand how the model works.

One of the most common examples of computer vision presented online is the task of detecting faces in images. There is something inherently interesting about knowing that a face appears in a photo, and the technology also has many practical applications. Today, most smartphones can perform this task effortlessly every time you take a photo, even in real time.

So when the Raspberry Pi AI HAT was released, face detection was the first task I wanted to try. While Hailo offers plenty of online examples, they are integrated into its own pipeline to make the task easier for end users. That is not necessarily a bad thing, but if you work with cameras more extensively, you quickly begin to notice the limitations of the pipeline. To name just a few that were important to me:

  • It does not work with several cameras at the same time,
  • You cannot load multiple models at the same time,
  • You don’t have control over model loading and output decoding (useful when you run multiple models or your own models).

While they offer TAPPAS and the possibility of creating a more complex pipeline, this also makes the setup more complicated and quickly introduces a number of dependencies into your project.

In my recent work, I needed face detection as a preprocessing step in an image processing pipeline. I was looking for minimal, ideally well-explained code that detects faces without adding software dependencies. The only dependency I have is the HailoRT library, which loads and runs the face detection model. Since I am not interested in training my own face detector, I looked at the Hailo Model Zoo and found SCRFD, a popular and fast model introduced by InsightFace that works impressively well. However, the model's output is far from directly usable. If you expect a list of face coordinates, you will be disappointed: you need some understanding of the model itself to decode it.
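To give a rough idea of what that decoding involves, here is a minimal NumPy sketch for a single output scale. It follows the scheme used in the reference InsightFace code: each grid cell carries two anchors, boxes are predicted as distances from the anchor center to the four box edges, and keypoints as offsets from that center, all in units of the stride (8, 16 or 32). The function names are hypothetical, and the inputs are assumed to be already dequantized and flattened to one row per anchor. This is not necessarily how the FaceDetector class does it.

```python
import numpy as np

def make_anchor_centers(feat_h, feat_w, stride, num_anchors=2):
    """Pixel centers (x, y) for every anchor of one scale, shape (H*W*num_anchors, 2)."""
    ys, xs = np.mgrid[:feat_h, :feat_w]
    centers = np.stack([xs, ys], axis=-1).astype(np.float32) * stride
    centers = centers.reshape(-1, 2)
    # Anchors of the same grid cell share one center and are stored consecutively.
    return np.repeat(centers, num_anchors, axis=0)

def decode_level(scores, bbox_dist, kps_off, stride, feat_h, feat_w, score_thr=0.5):
    """Decode one scale.

    Assumed inputs (hypothetical layout), with N = feat_h * feat_w * 2:
      scores:    (N,)    face probabilities
      bbox_dist: (N, 4)  left, top, right, bottom distances in stride units
      kps_off:   (N, 10) five (dx, dy) keypoint offsets in stride units
    """
    centers = make_anchor_centers(feat_h, feat_w, stride)
    keep = scores >= score_thr
    c = centers[keep]
    d = bbox_dist[keep] * stride
    boxes = np.stack(
        [c[:, 0] - d[:, 0], c[:, 1] - d[:, 1], c[:, 0] + d[:, 2], c[:, 1] + d[:, 3]],
        axis=1,
    )
    keypoints = kps_off[keep].reshape(-1, 5, 2) * stride + c[:, None, :]
    return boxes, scores[keep], keypoints
```

In a full decoder, the results of all three scales would be concatenated and passed through non-maximum suppression to remove overlapping detections of the same face.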

To save you that effort, I created a FaceDetector class that will help you get the results.

Usage is simple. Initialize the class before you start your per-frame processing:

face_detector = FaceDetector()

and then call the find_faces() method in your image acquisition loop:

faces = face_detector.find_faces(outputs)
                    
for x1, y1, x2, y2, score, keypoints in faces:
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    for x, y in keypoints:
        cv2.circle(frame, (int(x), int(y)), 3, (0, 0, 255), -1)

The only caveat is that the initialization must happen in the same thread as the image loop. You can find a sample main method that runs the model here: https://github.com/lomelina/pi_resources/tree/main/hailo/face_detection
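The threading caveat can be illustrated with a small sketch. StubFaceDetector is a hypothetical stand-in so that the example runs without Hailo hardware, and the list of frames replaces a real camera loop. The point is only that the detector is constructed inside the worker function, not in the main thread.

```python
import queue
import threading

class StubFaceDetector:
    """Hypothetical stand-in for FaceDetector; it only records which thread created it."""

    def __init__(self):
        self.owner_thread = threading.get_ident()

    def find_faces(self, outputs):
        # A real detector would decode the model outputs here.
        if threading.get_ident() != self.owner_thread:
            raise RuntimeError("detector used from a different thread")
        return []

def capture_loop(model_outputs, results):
    # Create the detector inside the thread that also runs the loop.
    detector = StubFaceDetector()
    for outputs in model_outputs:
        results.put(detector.find_faces(outputs))

results = queue.Queue()
worker = threading.Thread(target=capture_loop, args=([None, None, None], results))
worker.start()
worker.join()
```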