Keypoint detection and tracking are fundamental tasks in computer vision, critical for applications like visual odometry, SLAM, and object recognition. While traditional frame-based methods have made significant progress, event-based cameras offer unique advantages, such as high temporal resolution and low latency, making them ideal for dynamic scenes.
However, the challenge lies in accurately detecting and tracking keypoints from the sparse and asynchronous event data, where feature appearances are highly dependent on camera motion.

This internship focuses on using machine learning to detect, track, and compute descriptors for keypoints from event-based data, leveraging well-studied losses from frame-based approaches. The goal is to develop a robust system that generalizes well to arbitrary scenes, making it a versatile tool for a variety of vision-based applications.

Main missions

  • Model Development: Develop and train a machine learning model to detect and track keypoints from event-based data. The model should also compute descriptors around these keypoints, which are essential for tasks like visual odometry.
  • Loss Function Implementation: Implement and adapt repeatability and reliability loss functions from frame-based papers (e.g., ALIKE, SILK) to the event-based domain. These losses will help ensure that the detected keypoints are reliable and consistent across frames (see the sketch after this list).
  • Flow Map Generation: Extend the model to output optical flow maps that predict the movement of keypoints between frames. This flow information will be critical for robust keypoint tracking.
  • Data Utilization: Use the EVIMOv2 dataset, which provides event data along with ground truth camera poses and depth, to train and validate the model. Ensure the model leverages this ground truth to learn accurate keypoint locations, descriptors, and flow.
  • Model Optimization: Continuously improve the model architecture to enhance accuracy and performance. This may involve experimenting with different network designs and training strategies.
  • Postprocessing Pipeline: Build an inference and postprocessing pipeline that utilizes the keypoints, descriptors, and flow information to achieve robust keypoint tracking in real-time applications.
  • Self-Supervised Finetuning: Finetune the model on various event datasets using self-supervised losses, ensuring it generalizes well to different scenarios and environments.
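
As an illustration of the repeatability losses referenced above, the following is a minimal, hypothetical sketch (assumed code, not the project's implementation): given dense detector score maps from two event frames and a ground-truth pixel flow between them (e.g., derived from EVIMOv2 poses and depth), one map is warped onto the other and the two are encouraged to agree. All names (warp_with_flow, repeatability_loss, score_a, score_b, flow_ab) are placeholders.

    # Hypothetical sketch of an ALIKE-style repeatability loss adapted to
    # event-frame score maps; tensor names and shapes are assumptions.
    import torch
    import torch.nn.functional as F

    def warp_with_flow(score_b: torch.Tensor, flow_ab: torch.Tensor) -> torch.Tensor:
        """Warp score map B (B, 1, H, W) into frame A using a dense pixel flow (B, 2, H, W)."""
        _, _, h, w = score_b.shape
        ys, xs = torch.meshgrid(
            torch.arange(h, device=score_b.device, dtype=score_b.dtype),
            torch.arange(w, device=score_b.device, dtype=score_b.dtype),
            indexing="ij",
        )
        # Base pixel grid plus flow, normalized to [-1, 1] for grid_sample.
        grid_x = (xs + flow_ab[:, 0]) / (w - 1) * 2 - 1
        grid_y = (ys + flow_ab[:, 1]) / (h - 1) * 2 - 1
        grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)
        return F.grid_sample(score_b, grid, align_corners=True)

    def repeatability_loss(score_a: torch.Tensor, score_b: torch.Tensor, flow_ab: torch.Tensor) -> torch.Tensor:
        """Encourage the detector to fire at corresponding locations in both event frames."""
        warped_b = warp_with_flow(score_b, flow_ab)
        # 1 - cosine similarity between the flattened score maps, averaged over the batch.
        sim = F.cosine_similarity(score_a.flatten(1), warped_b.flatten(1), dim=1)
        return (1 - sim).mean()

In practice, a reliability term over the descriptors and a peakiness or regularization term would be added on top, as in the frame-based papers cited above, but the warp-and-compare structure stays the same.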

Requirements

Technical Skills:

    • Master 2 Degree
    • Programming skills in Python, with experience in machine learning and computer vision libraries (e.g., PyTorch, OpenCV)
    • Understanding of keypoint detection, feature descriptors, and optical flow in the context of computer vision
    • Familiarity with event-based camera technology is a plus but not required

Soft Skills:

    • Problem-Solving Ability: Analytical skills to tackle complex challenges and improve model performance iteratively
    • Communication Skills: Written and verbal communication skills for documentation and team collaboration
    • Passion for Innovation: A keen interest in exploring and applying cutting-edge techniques in the rapidly evolving field of computer vision

Our Benefits

Salary package: 80% of SMIC (9,32 €/h)
Meal vouchers: 9 € / working day (Swile card)
Working hours: 35h / week
Office location: Paris Bastille (74 rue du Faubourg Saint-Antoine, 75012 Paris)
Other benefits: CSE, Inclusion & Diversity

We believe in fostering an inclusive environment where everyone is valued and respected. We welcome candidates from diverse backgrounds and are committed to ensuring equal opportunities for all, regardless of race, gender, disability, or any other characteristic.

Location

Paris, FR
