
Face Detection and Embedding

A high-performance Rust implementation for face detection and face embedding generation using neural networks.

Overview

This project provides a complete face detection and recognition pipeline with the following capabilities:

  • Face Detection: Detect faces in images using the RetinaFace model
  • Face Embedding: Generate face embeddings using the FaceNet model
  • Multiple Backends: Support for both MNN and ONNX Runtime execution
  • Hardware Acceleration: Metal, CoreML, and OpenCL support on compatible platforms
  • Modular Design: Workspace architecture with reusable components

Features

  • 🔍 Accurate Face Detection - Uses the RetinaFace model for robust face detection
  • 🧠 Face Embeddings - Generate 512-dimensional face embeddings with FaceNet
  • ⚡ High Performance - Optimized with hardware acceleration (Metal, CoreML)
  • 🔧 Flexible Configuration - Adjustable detection thresholds and NMS parameters
  • 📦 Modular Architecture - Reusable components for image processing and bounding boxes
  • 🖼️ Visual Output - Draw bounding boxes on detected faces

Architecture

The project is organized as a Rust workspace with the following components:

  • detector - Main face detection and embedding application
  • bounding-box - Geometric operations and drawing utilities for bounding boxes
  • ndarray-image - Conversion utilities between ndarray and image formats
  • ndarray-resize - Fast image resizing operations on ndarray data

Models

The project includes pre-trained neural network models:

  • RetinaFace - Face detection model (.mnn and .onnx formats)
  • FaceNet - Face embedding model (.mnn and .onnx formats)

Usage

Basic Face Detection

# Detect faces using MNN backend (default)
cargo run --release detect path/to/image.jpg

# Detect faces using ONNX Runtime backend
cargo run --release detect --executor onnx path/to/image.jpg

# Save output with bounding boxes drawn
cargo run --release detect --output detected.jpg path/to/image.jpg

# Adjust detection sensitivity
cargo run --release detect --threshold 0.9 --nms-threshold 0.4 path/to/image.jpg

Face Comparison

Compare faces between two images by computing and comparing their embeddings:

# Compare faces in two images
cargo run --release compare image1.jpg image2.jpg

# Compare with custom thresholds
cargo run --release compare --threshold 0.9 --nms-threshold 0.4 image1.jpg image2.jpg

# Use ONNX Runtime backend for comparison
cargo run --release compare -p cpu image1.jpg image2.jpg

# Use MNN with Metal acceleration
cargo run --release compare -f metal image1.jpg image2.jpg

The compare command will:

  1. Detect all faces in both images
  2. Generate embeddings for each detected face
  3. Compute cosine similarity between all face pairs
  4. Display similarity scores and the best match
  5. Provide an interpretation of the similarity scores:
    • > 0.8: Very likely the same person
    • 0.6-0.8: Possibly the same person
    • 0.4-0.6: Unlikely to be the same person
    • < 0.4: Very unlikely to be the same person
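
As an illustration of steps 3 to 5, here is a minimal sketch in plain Rust. It is not taken from the project's source: the function names `cosine_similarity`, `interpret`, and `best_match` are hypothetical, embeddings are assumed to be plain `f32` slices, and the handling of scores that fall exactly on a boundary (0.8, 0.6, 0.4) is an assumption, since the ranges above do not specify it.

```rust
/// Cosine similarity between two embeddings of equal length.
/// Returns `None` if the lengths differ, the input is empty,
/// or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Map a similarity score to the interpretation bands listed above.
/// Assumption: boundary values are assigned to the higher band,
/// except 0.8 itself, which the README places outside "> 0.8".
fn interpret(score: f32) -> &'static str {
    if score > 0.8 {
        "Very likely the same person"
    } else if score >= 0.6 {
        "Possibly the same person"
    } else if score >= 0.4 {
        "Unlikely to be the same person"
    } else {
        "Very unlikely to be the same person"
    }
}

/// Compare every face in `first` with every face in `second` and return
/// the indices and score of the most similar pair, if any pair is valid.
fn best_match(first: &[Vec<f32>], second: &[Vec<f32>]) -> Option<(usize, usize, f32)> {
    let mut best: Option<(usize, usize, f32)> = None;
    for (i, a) in first.iter().enumerate() {
        for (j, b) in second.iter().enumerate() {
            if let Some(score) = cosine_similarity(a, b) {
                if best.map_or(true, |(_, _, top)| score > top) {
                    best = Some((i, j, score));
                }
            }
        }
    }
    best
}
```

With one face in the first image and two in the second, `best_match` returns the index pair with the highest score, which can then be passed to `interpret` for display.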

Backend Selection

The project supports two inference backends:

  • MNN Backend (default): High-performance inference framework with Metal/CoreML support
  • ONNX Runtime Backend: Cross-platform ML inference with broad hardware support

# Use MNN backend with Metal acceleration (macOS)
cargo run --release detect --executor mnn --forward-type metal path/to/image.jpg

# Use ONNX Runtime backend
cargo run --release detect --executor onnx path/to/image.jpg

Command Line Options

# Face detection with custom parameters
cargo run --release detect [OPTIONS] <IMAGE>

Options:
  -m, --model <MODEL>                  Custom model path
  -M, --model-type <MODEL_TYPE>        Model type [default: retina-face]
  -o, --output <OUTPUT>                Output image path
  -e, --executor <EXECUTOR>            Inference backend [mnn, onnx]
  -f, --forward-type <FORWARD_TYPE>    MNN execution backend [default: cpu]
  -t, --threshold <THRESHOLD>          Detection threshold [default: 0.8]
  -n, --nms-threshold <NMS_THRESHOLD>  NMS threshold [default: 0.3]
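
To show how the two thresholds interact, the following is a sketch of greedy non-maximum suppression (NMS), a common way of filtering overlapping detections. It is an assumption that the project's postprocessing works along these lines; the `Detection` struct and the `iou` and `nms` functions are hypothetical names, not the project's API. Detections scoring below `threshold` are dropped first, and then any box whose intersection-over-union (IoU) with a higher-scoring kept box exceeds `nms_threshold` is suppressed.

```rust
/// Hypothetical detection: corner coordinates plus a confidence score.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Detection {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
    score: f32,
}

/// Intersection-over-union of two axis-aligned boxes.
fn iou(a: &Detection, b: &Detection) -> f32 {
    let w = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let h = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    let inter = w * h;
    let area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    let area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    let union = area_a + area_b - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Greedy NMS: keep the highest-scoring boxes and drop any box that
/// overlaps an already kept box by more than `nms_threshold`.
fn nms(mut dets: Vec<Detection>, threshold: f32, nms_threshold: f32) -> Vec<Detection> {
    // Step 1: discard low-confidence detections.
    dets.retain(|d| d.score >= threshold);
    // Step 2: sort by descending score.
    dets.sort_by(|a, b| b.score.total_cmp(&a.score));
    // Step 3: keep a box only if it does not overlap a kept box too much.
    let mut kept: Vec<Detection> = Vec::new();
    for d in dets {
        if kept.iter().all(|k| iou(k, &d) <= nms_threshold) {
            kept.push(d);
        }
    }
    kept
}
```

Under this scheme, raising `--threshold` yields fewer but more confident detections, while lowering `--nms-threshold` suppresses overlapping boxes more aggressively.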

Quick Start

# Build the project
cargo build --release

# Run face detection on sample image
just run
# or
cargo run --release detect ./1000066593.jpg

Hardware Acceleration

MNN Backend

The MNN backend supports various execution backends:

  • CPU - Default, works on all platforms
  • Metal - macOS GPU acceleration
  • CoreML - macOS/iOS neural engine acceleration
  • OpenCL - Cross-platform GPU acceleration

# Use Metal acceleration on macOS
cargo run --release detect --executor mnn --forward-type metal path/to/image.jpg

# Use CoreML on macOS/iOS
cargo run --release detect --executor mnn --forward-type coreml path/to/image.jpg

ONNX Runtime Backend

The ONNX Runtime backend automatically selects the best available execution provider based on your system configuration.

Development

Prerequisites

  • Rust toolchain with 2024 edition support (Rust 1.85 or later)
  • MNN runtime (automatically linked)
  • ONNX Runtime (for the ONNX backend)

Building

# Standard build
cargo build

# Release build with optimizations
cargo build --release

# Run tests
cargo test

Project Structure

├── src/
│   ├── facedet/             # Face detection modules
│   │   ├── mnn/            # MNN backend implementations
│   │   ├── ort/            # ONNX Runtime backend implementations
│   │   └── postprocess.rs  # Shared postprocessing logic
│   ├── faceembed/          # Face embedding modules
│   │   ├── mnn/            # MNN backend implementations
│   │   └── ort/            # ONNX Runtime backend implementations
│   ├── cli.rs              # Command line interface
│   └── main.rs             # Application entry point
├── models/                 # Neural network models (.mnn and .onnx)
├── bounding-box/           # Bounding box utilities
├── ndarray-image/          # Image conversion utilities
└── ndarray-resize/         # Image resizing utilities

Backend Architecture

The codebase is organized to support multiple inference backends:

  • Common interfaces: FaceDetector and FaceEmbedder traits provide unified APIs
  • Shared postprocessing: Common logic for anchor generation, NMS, and coordinate decoding
  • Backend-specific implementations: Separate modules for MNN and ONNX Runtime
  • Modular design: Easy to add new backends by implementing the common traits
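
The trait-based layout can be sketched as follows. Only the trait names FaceDetector and FaceEmbedder come from this README; every method signature, the `FaceBox` and `InferenceError` types, the `StubBackend` struct, and the `embed_all_faces` helper are hypothetical and are shown only to illustrate how a new backend could plug in by implementing the common traits.

```rust
/// Hypothetical detection result (not the project's bounding-box type).
#[derive(Debug, Clone, PartialEq)]
struct FaceBox {
    x: f32,
    y: f32,
    width: f32,
    height: f32,
    confidence: f32,
}

/// Hypothetical error type for this sketch.
#[derive(Debug)]
struct InferenceError(String);

/// Assumed shape of the detection trait: packed RGB bytes in, boxes out.
trait FaceDetector {
    fn detect(&self, rgb: &[u8], width: u32, height: u32)
        -> Result<Vec<FaceBox>, InferenceError>;
}

/// Assumed shape of the embedding trait: packed RGB bytes in, vector out.
trait FaceEmbedder {
    fn embed(&self, rgb: &[u8], width: u32, height: u32)
        -> Result<Vec<f32>, InferenceError>;
}

/// A stand-in backend that runs no model; it only shows the plumbing.
struct StubBackend;

impl FaceDetector for StubBackend {
    fn detect(&self, rgb: &[u8], width: u32, height: u32)
        -> Result<Vec<FaceBox>, InferenceError> {
        if rgb.len() != (width as usize) * (height as usize) * 3 {
            return Err(InferenceError("buffer size does not match dimensions".into()));
        }
        // Pretend the whole image is a single face.
        Ok(vec![FaceBox {
            x: 0.0,
            y: 0.0,
            width: width as f32,
            height: height as f32,
            confidence: 1.0,
        }])
    }
}

impl FaceEmbedder for StubBackend {
    fn embed(&self, _rgb: &[u8], _width: u32, _height: u32)
        -> Result<Vec<f32>, InferenceError> {
        // A zero vector with the 512 dimensions mentioned under Features.
        Ok(vec![0.0; 512])
    }
}

/// Backend-agnostic pipeline: any detector/embedder pair can be used.
/// Cropping each face out of the image is omitted to keep the sketch short.
fn embed_all_faces<D: FaceDetector, E: FaceEmbedder>(
    detector: &D,
    embedder: &E,
    rgb: &[u8],
    width: u32,
    height: u32,
) -> Result<Vec<Vec<f32>>, InferenceError> {
    let faces = detector.detect(rgb, width, height)?;
    faces.iter().map(|_face| embedder.embed(rgb, width, height)).collect()
}
```

Because `embed_all_faces` is generic over both traits, swapping MNN for ONNX Runtime in a design like this would only change which concrete types are passed in.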

License

MIT License

Dependencies

Key dependencies include:

  • MNN - High-performance neural network inference framework (MNN backend)
  • ONNX Runtime - Cross-platform ML inference (ORT backend)
  • ndarray - N-dimensional array processing
  • image - Image processing and I/O
  • clap - Command line argument parsing
  • bounding-box - Geometric operations for face detection
  • error-stack - Structured error handling

Backend Status

  • ✅ MNN Backend: Fully implemented with hardware acceleration support
  • 🚧 ONNX Runtime Backend: Framework implemented, inference logic to be completed

Note: The ONNX Runtime (ORT) backend currently provides only the scaffolding; its inference implementation is not yet complete.


Built with Rust for maximum performance and safety in computer vision applications.
