# MelMOT

A research project implementing innovative approaches to multi-user tracking in retail environments, focusing on robust single-camera tracking and effective cross-camera re-identification (Re-ID).

Repository: https://github.com/Melodiz/MelMOT
## Abstract

This research details the development of a system for continuous multi-user tracking within retail environments, focusing on robust single-camera tracking and effective cross-camera re-identification (Re-ID). For single-camera tracking, the key contribution is a set of tailored post-processing heuristics, applied on top of a state-of-the-art pipeline (YOLOv12, BoT-SORT, OSNet), that address phantom tracks and mannequin misclassifications. The primary innovation lies in the cross-camera Re-ID methodology, which overcomes the limitations of appearance-only approaches in challenging retail settings through a robust spatiotemporal matching strategy: homography establishes a common ground plane for coordinate-based comparisons. A novel exponential loss function for similarity scoring significantly improves matching accuracy, particularly under partial occlusions or non-ideal detections. The architecture uses a multi-stage matching process in which spatiotemporal constraints primarily guide Re-ID, with aggregated appearance features from multiple confirmed views serving as a fallback, yielding accurate and stable visitor trajectories across the whole venue.
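The phantom-track and mannequin heuristics can be illustrated with a toy filter over finished tracklets. Everything below (function name, tracklet layout, thresholds) is an illustrative sketch, not the repository's actual API:

```python
import math

def filter_tracklets(tracklets, min_length=3, static_threshold=2.0):
    """Drop phantom tracks (confirmed for too few frames) and likely
    mannequins (tracks whose centroid barely moves over their lifetime).

    tracklets: dict mapping track id -> list of (frame, cx, cy) centroids.
    """
    kept = {}
    for tid, points in tracklets.items():
        # Phantom heuristic: a real visitor persists for several frames.
        if len(points) < min_length:
            continue
        # Mannequin heuristic: a static object accumulates no displacement.
        (_, x0, y0) = points[0]
        (_, x1, y1) = points[-1]
        if math.hypot(x1 - x0, y1 - y0) < static_threshold:
            continue
        kept[tid] = points
    return kept
```

In practice the pipeline works on bounding boxes and confidence scores rather than bare centroids, but the shape of the decision (minimum lifetime, minimum motion) is the same.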
## Overview

The system constructs continuous trajectories for every visitor using a network of static, ceiling-mounted surveillance cameras. The resulting data can support analysis of customer movement within shopping malls, including:

- Conversion tracking
- Behavioral pattern analysis
- High-interest area identification
- Business intelligence statistics
## Architecture

The system consists of two main components:

**Single-camera tracking**
- Detection: YOLOv12 (state-of-the-art object detection)
- Tracking: BoT-SORT (robust online tracking)
- Re-ID: OSNet (appearance feature extraction)
- Post-processing: heuristics for phantom-track removal and mannequin filtering

**Cross-camera Re-ID**
- Spatiotemporal matching: homography-based coordinate alignment
- Appearance matching: aggregated feature comparison
- Multi-stage process: spatiotemporal constraints with appearance fallback
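The homography step can be made concrete: each camera gets a calibrated 3x3 matrix that maps image points (e.g. the bottom-center of a bounding box) onto a shared ground plane. A minimal NumPy sketch, where the matrices are toy examples rather than real calibrations:

```python
import numpy as np

def to_ground_plane(H, image_point):
    """Project an image point into ground-plane coordinates using a
    3x3 homography H (homogeneous transform + perspective divide)."""
    x, y = image_point
    p = H @ np.array([x, y, 1.0])
    return p[:2] / p[2]

# With a toy scaling homography, pixel (10, 20) maps to (20, 40) on the floor.
H = np.diag([2.0, 2.0, 1.0])
ground_xy = to_ground_plane(H, (10.0, 20.0))
```

Once detections from two cameras live in the same ground-plane coordinates, cross-camera comparison reduces to distances in meters rather than appearance similarity alone.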
## Features

- Real-time processing at 30 FPS
- Robust tracking in crowded retail environments
- Cross-camera linking using geometric and appearance cues
- Post-processing heuristics for improved accuracy
- Comprehensive trajectory generation across entire mall visits
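The "geometric and appearance cues" can be paraphrased as a two-stage decision: accept a cross-camera match on spatiotemporal grounds first, and consult appearance similarity only as a fallback. A toy sketch, where the names, thresholds, and track layout are illustrative assumptions:

```python
import numpy as np

def link_pair(track_a, track_b,
              spatial_tolerance=0.5, temporal_window=5.0,
              appearance_threshold=0.8):
    """Decide whether two tracklets from different cameras belong to the
    same person. Each track carries a ground-plane 'position' (meters),
    a 'time' (seconds), and an L2-normalized Re-ID 'feature' vector."""
    dt = abs(track_a["time"] - track_b["time"])
    dist = np.linalg.norm(np.asarray(track_a["position"]) -
                          np.asarray(track_b["position"]))
    # Stage 1: spatiotemporal constraint on the shared ground plane.
    if dt <= temporal_window and dist <= spatial_tolerance:
        return True
    # Stage 2: appearance fallback (cosine similarity of normalized features).
    sim = float(np.dot(track_a["feature"], track_b["feature"]))
    return sim >= appearance_threshold
```

The real system links whole sets of tracklets rather than pairs, but the ordering of the cues (geometry first, appearance as a rescue) is the point.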
## Requirements

- Python 3.8+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
- Time-synchronized surveillance cameras
## Installation

1. Clone the repository:

```bash
git clone https://github.com/Melodiz/MelMOT.git
cd MelMOT
```

2. Create a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

4. Download models (optional; they are downloaded automatically on first run):

```bash
python -c "from ultralytics import YOLO; YOLO('yolov12n.pt')"
```
## Usage

### Single-camera tracking

```python
from melmot.tracking import SingleCameraTracker

# Initialize tracker
tracker = SingleCameraTracker(
    model_path="yolov12n.pt",
    max_age=100,
    min_hits=3
)

# Process video
results = tracker.track_video("path/to/video.mp4")
```

### Cross-camera Re-ID

```python
from melmot.reid import CrossCameraReID

# Initialize Re-ID system
reid_system = CrossCameraReID(
    homography_matrices=homography_data,
    appearance_threshold=0.8
)

# Link tracklets across cameras
global_trajectories = reid_system.link_tracklets(camera_tracklets)
```

### Command-line interface

```bash
# Single-camera tracking
python -m melmot.cli track --video videos/input.mp4 --output results/output.mp4

# Cross-camera Re-ID
python -m melmot.cli reid --tracklets results/tracklets.json --output results/global_trajectories.json

# Full pipeline
python -m melmot.cli pipeline --config config/retail_mall.yaml
```

## Configuration

Create a configuration file `config/retail_mall.yaml`:
```yaml
# Camera configuration
cameras:
  - id: "cam_001"
    position: [0, 0, 3.5]    # x, y, height in meters
    homography: "homography/cam_001.json"
  - id: "cam_002"
    position: [10, 0, 3.5]
    homography: "homography/cam_002.json"

# Tracking parameters
tracking:
  max_age: 100
  min_hits: 3
  iou_threshold: 0.3
  static_threshold: 2.0

# Re-ID parameters
reid:
  appearance_threshold: 0.8
  spatial_tolerance: 0.5   # meters
  temporal_window: 5.0     # seconds
```

## Output Format

Per-camera tracklets (with `[x, y, w, h]` standing in for actual bounding-box values):

```json
{
  "camera_001": {
    "track_001": [
      {
        "frame": 100,
        "bbox": [x, y, w, h],
        "confidence": 0.95,
        "features": [0.1, 0.2, ...]
      }
    ]
  }
}
```

Global trajectories after cross-camera linking:

```json
{
  "person_001": {
    "global_id": "person_001",
    "trajectories": {
      "camera_001": {"start_frame": 100, "end_frame": 200},
      "camera_002": {"start_frame": 250, "end_frame": 350}
    },
    "total_duration": 15.0
  }
}
```

## Novel Contributions

This project introduces several novel approaches:
- Tailored post-processing heuristics for removing phantom tracks and filtering mannequin misclassifications
- Robust spatiotemporal matching using homography for coordinate-based comparisons
- Novel exponential loss function for similarity scoring in challenging scenarios
- Multi-stage matching process with spatiotemporal constraints and appearance fallback
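The exponential similarity score can be sketched as follows; the exact formulation lives in the paper, so treat this shape (a decaying exponential of ground-plane distance, with an assumed scale parameter) as an illustrative guess rather than the published loss:

```python
import math

def exp_similarity(distance, scale=0.5):
    """Map a ground-plane distance in meters to a similarity in (0, 1].
    Small localization errors are forgiven almost entirely, while the
    score decays smoothly (never abruptly) as the distance grows."""
    return math.exp(-distance / scale)
```

The appeal over a hard distance cutoff is graceful degradation: a detection displaced by partial occlusion loses a little score instead of being rejected outright.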
## Performance

- MOTA: competitive with state-of-the-art methods
- ID switches: significantly reduced through post-processing
- Processing speed: real-time capable (30 FPS)
- Accuracy: robust in crowded retail environments
## Testing

```bash
# Run all tests
pytest tests/

# Run specific test categories
pytest tests/test_tracking.py
pytest tests/test_reid.py
pytest tests/test_utils.py

# Run with coverage
pytest --cov=melmot tests/
```

## Code Style

- Follow PEP 8 guidelines
- Use type hints
- Write comprehensive docstrings
- Maintain 90%+ test coverage
## Project Structure

```
melmot/
├── core/        # Core tracking algorithms
├── detection/   # Object detection models
├── reid/        # Re-identification modules
├── utils/       # Utility functions
├── config/      # Configuration files
├── tests/       # Test suite
└── examples/    # Usage examples
```
## Acknowledgments

- YOLOv12: state-of-the-art object detection
- BoT-SORT: robust online tracking
- OSNet: omni-scale network for Re-ID
- Homography-based matching: geometric constraint utilization
## Research Paper

The complete research paper is available in this repository:

- Source: `main.tex` (LaTeX source)
- PDF: Download Research Paper
- Topic: Innovative Approaches to Multi-User Tracking in Retail Spaces
- Focus: single-camera MOT and cross-camera Re-ID methodologies
## License

This project is licensed under the MIT License; see the LICENSE file for details.
Note: This is a research project for coursework. The implementation focuses on demonstrating novel approaches rather than production deployment.