# Face Detection and Embedding
A high-performance Rust implementation for face detection and face embedding generation using neural networks.
## Overview
This project provides a complete face detection and recognition pipeline with the following capabilities:
- Face Detection: Detect faces in images using RetinaFace model
- Face Embedding: Generate face embeddings using FaceNet model
- Multiple Backends: Support for both MNN and ONNX runtime execution
- Hardware Acceleration: Metal, CoreML, and OpenCL support on compatible platforms
- Modular Design: Workspace architecture with reusable components
## Features
- 🔍 Accurate Face Detection - Uses RetinaFace model for robust face detection
- 🧠 Face Embeddings - Generate 512-dimensional face embeddings with FaceNet
- ⚡ High Performance - Optimized with hardware acceleration (Metal, CoreML)
- 🔧 Flexible Configuration - Adjustable detection thresholds and NMS parameters
- 📦 Modular Architecture - Reusable components for image processing and bounding boxes
- 🖼️ Visual Output - Draw bounding boxes on detected faces
## Architecture
The project is organized as a Rust workspace with the following components:
- `detector` - Main face detection and embedding application
- `bounding-box` - Geometric operations and drawing utilities for bounding boxes
- `ndarray-image` - Conversion utilities between ndarray and image formats
- `ndarray-resize` - Fast image resizing operations on ndarray data
## Models
The project includes pre-trained neural network models:
- RetinaFace - Face detection model (`.mnn` and `.onnx` formats)
- FaceNet - Face embedding model (`.mnn` and `.onnx` formats)
## Usage

### Basic Face Detection
```sh
# Detect faces using MNN backend (default)
cargo run --release detect path/to/image.jpg

# Detect faces using ONNX Runtime backend
cargo run --release detect --executor onnx path/to/image.jpg

# Save output with bounding boxes drawn
cargo run --release detect --output detected.jpg path/to/image.jpg

# Adjust detection sensitivity
cargo run --release detect --threshold 0.9 --nms-threshold 0.4 path/to/image.jpg
```
### Face Comparison
Compare faces between two images by computing and comparing their embeddings:
```sh
# Compare faces in two images
cargo run --release compare image1.jpg image2.jpg

# Compare with custom thresholds
cargo run --release compare --threshold 0.9 --nms-threshold 0.4 image1.jpg image2.jpg

# Use ONNX Runtime backend for comparison
cargo run --release compare --executor onnx image1.jpg image2.jpg

# Use MNN with Metal acceleration
cargo run --release compare --forward-type metal image1.jpg image2.jpg
```
The `compare` command will:
- Detect all faces in both images
- Generate embeddings for each detected face
- Compute cosine similarity between all face pairs
- Display similarity scores and the best match
- Provide an interpretation of the similarity scores:
  - `> 0.8`: Very likely the same person
  - `0.6-0.8`: Possibly the same person
  - `0.4-0.6`: Unlikely to be the same person
  - `< 0.4`: Very unlikely to be the same person
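The similarity score above is a cosine similarity between two embedding vectors. A minimal sketch of how such a score can be computed in Rust follows; the function name and the plain-slice representation of embeddings are illustrative, not the project's exact API:

```rust
// Cosine similarity between two embedding vectors (illustrative sketch).
// Embeddings are assumed to be equal-length f32 slices (e.g. 512-D FaceNet output).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let e1 = [0.6f32, 0.8, 0.0];
    let e2 = [0.6f32, 0.8, 0.0];
    let e3 = [0.0f32, 0.0, 1.0];
    // Identical embeddings score 1.0; orthogonal embeddings score 0.0.
    println!("same:       {:.3}", cosine_similarity(&e1, &e2)); // 1.000
    println!("orthogonal: {:.3}", cosine_similarity(&e1, &e3)); // 0.000
}
```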
## Backend Selection
The project supports two inference backends:
- MNN Backend (default): High-performance inference framework with Metal/CoreML support
- ONNX Runtime Backend: Cross-platform ML inference with broad hardware support
```sh
# Use MNN backend with Metal acceleration (macOS)
cargo run --release detect --executor mnn --forward-type metal path/to/image.jpg

# Use ONNX Runtime backend
cargo run --release detect --executor onnx path/to/image.jpg
```
## Command Line Options
```sh
# Face detection with custom parameters
cargo run --release detect [OPTIONS] <IMAGE>
```

```
Options:
  -m, --model <MODEL>                    Custom model path
  -M, --model-type <MODEL_TYPE>          Model type [default: retina-face]
  -o, --output <OUTPUT>                  Output image path
  -e, --executor <EXECUTOR>              Inference backend [mnn, onnx]
  -f, --forward-type <FORWARD_TYPE>      MNN execution backend [default: cpu]
  -t, --threshold <THRESHOLD>            Detection threshold [default: 0.8]
  -n, --nms-threshold <NMS_THRESHOLD>    NMS threshold [default: 0.3]
```
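The `--nms-threshold` option controls non-maximum suppression: overlapping detections whose IoU with a higher-scoring kept box exceeds the threshold are discarded. A minimal greedy-NMS sketch in Rust; the `Det` type and function names are assumptions for illustration, not the project's actual code:

```rust
// Greedy non-maximum suppression sketch (illustrative only).
// Boxes are (x1, y1, x2, y2) corners with a confidence score.
#[derive(Clone, Copy, Debug)]
struct Det { x1: f32, y1: f32, x2: f32, y2: f32, score: f32 }

// Intersection-over-union of two boxes.
fn iou(a: Det, b: Det) -> f32 {
    let ix = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let iy = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = ix * iy;
    let area = |d: Det| (d.x2 - d.x1) * (d.y2 - d.y1);
    inter / (area(a) + area(b) - inter)
}

// Keep the highest-scoring boxes; drop any box whose IoU with an
// already-kept box exceeds `nms_threshold`.
fn nms(mut dets: Vec<Det>, nms_threshold: f32) -> Vec<Det> {
    dets.sort_by(|a, b| b.score.total_cmp(&a.score));
    let mut kept: Vec<Det> = Vec::new();
    for d in dets {
        if kept.iter().all(|k| iou(*k, d) <= nms_threshold) {
            kept.push(d);
        }
    }
    kept
}

fn main() {
    let dets = vec![
        Det { x1: 0.0, y1: 0.0, x2: 10.0, y2: 10.0, score: 0.95 },
        Det { x1: 1.0, y1: 1.0, x2: 11.0, y2: 11.0, score: 0.90 }, // overlaps the first
        Det { x1: 50.0, y1: 50.0, x2: 60.0, y2: 60.0, score: 0.85 },
    ];
    // With the default 0.3 threshold, the overlapping box is suppressed.
    println!("kept {} boxes", nms(dets, 0.3).len());
}
```

Raising `--nms-threshold` keeps more overlapping boxes; lowering it suppresses more aggressively.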
## Quick Start
```sh
# Build the project
cargo build --release

# Run face detection on a sample image
just run
# or
cargo run --release detect ./1000066593.jpg
```
## Hardware Acceleration

### MNN Backend
The MNN backend supports various execution backends:
- CPU - Default, works on all platforms
- Metal - macOS GPU acceleration
- CoreML - macOS/iOS neural engine acceleration
- OpenCL - Cross-platform GPU acceleration
```sh
# Use Metal acceleration on macOS
cargo run --release detect --executor mnn --forward-type metal path/to/image.jpg

# Use CoreML on macOS/iOS
cargo run --release detect --executor mnn --forward-type coreml path/to/image.jpg
```
### ONNX Runtime Backend
The ONNX Runtime backend automatically selects the best available execution provider based on your system configuration.
## Development

### Prerequisites
- Rust 2024 edition
- MNN runtime (automatically linked)
- ONNX runtime (for ONNX backend)
### Building
```sh
# Standard build
cargo build

# Release build with optimizations
cargo build --release

# Run tests
cargo test
```
## Project Structure
```
├── src/
│   ├── facedet/           # Face detection modules
│   │   ├── mnn/           # MNN backend implementations
│   │   ├── ort/           # ONNX Runtime backend implementations
│   │   └── postprocess.rs # Shared postprocessing logic
│   ├── faceembed/         # Face embedding modules
│   │   ├── mnn/           # MNN backend implementations
│   │   └── ort/           # ONNX Runtime backend implementations
│   ├── cli.rs             # Command line interface
│   └── main.rs            # Application entry point
├── models/                # Neural network models (.mnn and .onnx)
├── bounding-box/          # Bounding box utilities
├── ndarray-image/         # Image conversion utilities
└── ndarray-resize/        # Image resizing utilities
```
## Backend Architecture
The codebase is organized to support multiple inference backends:
- Common interfaces: `FaceDetector` and `FaceEmbedder` traits provide unified APIs
- Shared postprocessing: Common logic for anchor generation, NMS, and coordinate decoding
- Backend-specific implementations: Separate modules for MNN and ONNX Runtime
- Modular design: Easy to add new backends by implementing the common traits
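As a rough illustration of this design, a new backend only needs to implement the shared traits. The trait names `FaceDetector` and `FaceEmbedder` come from this README, but the method signatures, the `Bbox` type, and the dummy implementation below are assumptions for the sketch:

```rust
// Hypothetical shapes for the shared backend traits (signatures assumed).
#[allow(dead_code)]
struct Bbox { x1: f32, y1: f32, x2: f32, y2: f32, score: f32 }

trait FaceDetector {
    /// Detect faces in a raw RGB buffer, returning one box per face.
    fn detect(&self, image: &[u8], width: u32, height: u32) -> Vec<Bbox>;
}

trait FaceEmbedder {
    /// Produce an embedding (e.g. 512-D for FaceNet) from a cropped face.
    fn embed(&self, face_crop: &[u8]) -> Vec<f32>;
}

// Adding a backend means providing trait implementations; callers stay unchanged.
struct DummyDetector;
impl FaceDetector for DummyDetector {
    fn detect(&self, _image: &[u8], _w: u32, _h: u32) -> Vec<Bbox> {
        vec![Bbox { x1: 0.0, y1: 0.0, x2: 1.0, y2: 1.0, score: 0.9 }]
    }
}

fn main() {
    let faces = DummyDetector.detect(&[], 0, 0);
    println!("detected {} face(s)", faces.len());
}
```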
## License
MIT License
## Dependencies
Key dependencies include:
- MNN - High-performance neural network inference framework (MNN backend)
- ONNX Runtime - Cross-platform ML inference (ORT backend)
- ndarray - N-dimensional array processing
- image - Image processing and I/O
- clap - Command line argument parsing
- bounding-box - Geometric operations for face detection
- error-stack - Structured error handling
## Backend Status
- ✅ MNN Backend: Fully implemented with hardware acceleration support
- 🚧 ONNX Runtime Backend: Framework implemented, inference logic to be completed
Note: The ORT backend currently provides the framework but requires completion of the inference implementation.
Built with Rust for maximum performance and safety in computer vision applications.