Progress Meeting of the ANR PRC CLARA Project

A progress meeting of the ANR PRC CLARA project (Couplage Apprentissage et Vision pour Contrôle de Robots Aériens, i.e. coupling learning and vision for the control of aerial robots) will take place on 12 and 13 September in Dijon.

On this occasion, the project members will give presentations on Monday 12 September, starting at 14:30, in room 101 of I3M.

Here is the program:

14:30: Daniel Braun (ImViA / MIS)

Title: The Usage of Quadtrees in Deep Neural Networks to Represent Data for Navigation from a Monocular Camera

Abstract: Monocular depth prediction networks have demonstrated their ability to accurately predict the geometry of a scene. However, most methods focus on improving prediction accuracy at the expense of computation time, which makes them ill-suited to navigation. We propose exploring the use of a quadtree data structure to reduce the computational cost while focusing on the most significant parts of the image.
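To give an intuition of the data structure mentioned in the abstract, here is a minimal, hypothetical quadtree sketch (not code from the talk; the variance threshold, minimum cell size, and toy depth map are all assumptions made for the example). Homogeneous regions collapse into a few large cells, so downstream computation can concentrate on the detailed parts of the image.

```python
import numpy as np

def build_quadtree(img, x, y, size, min_size=4, thresh=5.0):
    """Recursively subdivide a square image region into quadtree leaves.

    A leaf (x, y, size, mean) is emitted when the region is homogeneous
    (low standard deviation) or has reached the minimum size; otherwise
    the region splits into four quadrants.
    """
    region = img[y:y + size, x:x + size]
    if size <= min_size or region.std() < thresh:
        return [(x, y, size, float(region.mean()))]  # homogeneous leaf cell
    half = size // 2
    leaves = []
    for dx, dy in [(0, 0), (half, 0), (0, half), (half, half)]:
        leaves += build_quadtree(img, x + dx, y + dy, half, min_size, thresh)
    return leaves

# Toy 64x64 "depth map": flat background with one detailed corner.
depth = np.full((64, 64), 10.0)
depth[:16, :16] = np.random.rand(16, 16) * 50
cells = build_quadtree(depth, 0, 0, 64)
print(len(cells), "cells instead of", 64 * 64, "pixels")
```

The flat background is covered by a handful of large cells, while only the noisy corner is subdivided down to small cells.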

15:30: Charles-Olivier Artizzu (I3S / ImViA)

Title: OMNI-CONV: Perspective Methods Transferred to Omnidirectional Images using Distortion-aware Convolutions

Abstract: Omnidirectional cameras are valuable for many computer vision tasks thanks to their ultra-wide Field-of-View (FoV). However, spherical images suffer from severe distortions, and current computer vision algorithms rely on Convolutional Neural Networks (CNNs), which are very sensitive to this domain shift. We therefore propose an easy-to-use spherical adaptation that improves their predictions without modifying the network architecture or requiring time-consuming additional training. To demonstrate the generality and simplicity of our method, we implement this adaptation for three commonly used visual modalities: semantic segmentation, optical flow, and depth prediction. In each case, the adapted network outperforms its baseline.
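To give an intuition of distortion-aware sampling, here is a hand-written sketch (not the authors' implementation; the exact mapping used by OMNI-CONV may differ). On an equirectangular image, the horizontal sampling offsets of a 3x3 kernel can be stretched by 1/cos(latitude) so that the receptive field follows the stretching of the projection toward the poles.

```python
import numpy as np

def distorted_offsets(row, height, kernel=3):
    """Horizontal sampling offsets of a `kernel`-wide convolution row
    on an equirectangular image of the given height.

    Near the equator the offsets match a standard convolution; toward
    the poles they widen by 1/cos(latitude), mimicking how pixels are
    stretched by the equirectangular projection. Illustrative only.
    """
    lat = (row + 0.5) / height * np.pi - np.pi / 2  # latitude in [-pi/2, pi/2]
    stretch = 1.0 / max(np.cos(lat), 1e-3)          # clamp to avoid blow-up at poles
    half = kernel // 2
    return [round(k * stretch) for k in range(-half, half + 1)]

height = 512
print(distorted_offsets(row=256, height=height))  # equator: [-1, 0, 1]
print(distorted_offsets(row=10, height=height))   # near pole: much wider offsets
```

The same idea applies to any perspective-trained CNN: only the sampling locations change, so the pretrained weights are reused as-is.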

16:30: Zongwei Wu (ImViA / I3S)

Title: Depth Attention Is (Not) All You Need but Helpful for Scene Understanding

Abstract: Deep learning models can nowadays teach a machine to perform a number of tasks, sometimes with better precision than human beings. Among all the modules of an intelligent machine, perception is the most essential: without it, the action modules cannot safely and precisely carry out the target task in complex scenes. Conventional perception systems are based on RGB images, which provide rich texture information about the 3D scene. However, the quality of RGB images depends heavily on environmental factors, which in turn affect the performance of deep learning models. In this thesis, we therefore aim to improve the performance and robustness of RGB models with complementary depth cues by proposing novel RGB-D fusion designs.

Traditionally, pixel-wise concatenation with addition and convolution is the most widely applied approach to RGB-D fusion. Inspired by the success of attention modules in deep networks, in this thesis we analyze and propose different depth-aware attention modules and demonstrate their effectiveness on basic segmentation tasks such as saliency detection and semantic segmentation. First, we leverage geometric cues and propose a novel depth-wise channel attention. We merge fine-grained details and semantic cues to constrain the channel attention to various local regions, improving the model's discriminability during feature extraction. Second, we investigate a depth-adapted offset, which serves as a local but deformable spatial attention for convolution. Our approach forces the network to take more relevant pixels into account with the help of the depth prior. Third, we improve contextual awareness within RGB-D fusion by leveraging transformer attention. We show that transformer attention can improve the model's robustness against feature misalignment. Last but not least, we focus on the fusion architecture by proposing an adaptive fusion design. We learn the trade-off between early and late fusion with respect to depth quality, yielding a more robust way to merge RGB-D cues in deep networks. Extensive comparisons on reference benchmarks validate the effectiveness of the proposed methods against other fusion alternatives.
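The adaptive-fusion idea in the last point can be sketched as a quality-driven gate between the two feature streams. This is an illustrative stand-in only: in the thesis the gate would be predicted by a learned subnetwork from the depth input, whereas here it is passed in by hand.

```python
import numpy as np

def gated_rgbd_fusion(rgb_feat, depth_feat, depth_quality):
    """Blend RGB and depth feature maps with a quality-driven gate.

    `depth_quality` in [0, 1] plays the role of the learned gate:
    noisy depth (low quality) pushes the fusion toward the RGB
    stream, approximating a trade-off between early and late fusion.
    """
    gate = depth_quality  # hand-set here; learned in a real network
    return gate * depth_feat + (1.0 - gate) * rgb_feat

rgb = np.ones((4, 4))          # toy RGB feature map
depth = np.full((4, 4), 3.0)   # toy depth feature map
fused = gated_rgbd_fusion(rgb, depth, depth_quality=0.25)
print(fused.mean())  # 0.25 * 3 + 0.75 * 1 = 1.5
```

With unreliable depth (quality near 0) the fused features fall back to the RGB stream, which is the robustness behavior the abstract describes.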

