2018 Results

Record of activity

The main activity for the year 2018 was the co-organization of the Graph Signal Processing Workshop (3rd edition), which took place in Lausanne in June over three days and gathered 80 participants. This workshop is a forum intended to disseminate the graph signal processing research field to a broader audience and to exchange ideas and experiences on the future path of this vibrant field.
We have also continued the work started during the visit of Mira Rizkallah in Nov-Dec 2017. This work has led to an article presented at the EUSIPCO 2018 conference, for which Mira Rizkallah received the Best Student Paper Award. In December 2018, we opened a new collaboration topic on the graph-based optimization of multi-source compression for interactive navigation. The problem that we propose to study is the joint compression of multiple sources (such as the omnidirectional images taken from different viewpoints in our dataset) while enabling user navigation among the sources. The joint work on this topic starts in December 2018 with the visit of three members of the SIROCCO team (Inria) to EPFL: Aline Roumy and Thomas Maugey (one week) and Mai-Q Pham (two weeks). Another visit by Thomas Maugey is planned in December for the PhD defense of Renata Khasanova; this will be an occasion to pursue this collaboration.

Scientific outcomes

Graph-based calibration of multiple omnidirectional cameras:
We have proposed a novel acquisition procedure for Free Viewpoint Television. This system relies on the synchronized capture of a scene by several dozen omnidirectional cameras. We have also proposed a calibration solution that estimates the position and orientation of each camera with respect to a common reference. This solution relies on a standard calibration of each individual camera, followed by a graph-based synchronization of all the estimated parameters. We have validated our solution and shown how this acquisition procedure has been used to capture videos serving as support for research on 6 degrees-of-freedom (6DoF) experiences.
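To give a flavour of the graph-based synchronization step, here is a minimal Python sketch (an illustration only, with hypothetical function and variable names, not the code of our calibration toolkit). It treats cameras as graph nodes and pairwise relative poses as edges, and composes the transforms along a spanning tree rooted at a reference camera; the actual solution may combine the pairwise estimates differently.

    # Illustrative graph-based pose synchronization (not the toolkit code).
    # Cameras are nodes; an edge (i, j) carries the relative pose estimated
    # from a shared view of the calibration chessboard.  Absolute poses are
    # obtained by composing transforms along a BFS spanning tree.
    import numpy as np
    import networkx as nx

    def synchronize_poses(n_cameras, relative_poses):
        """relative_poses: dict {(i, j): (R_ij, t_ij)} with x_j = R_ij @ x_i + t_ij.
        Returns absolute poses (R_i, t_i) expressed in the frame of camera 0."""
        g = nx.Graph()
        g.add_nodes_from(range(n_cameras))
        g.add_edges_from(relative_poses.keys())

        poses = {0: (np.eye(3), np.zeros(3))}      # camera 0 is the reference
        for i, j in nx.bfs_edges(g, source=0):     # traverse a spanning tree
            R_i, t_i = poses[i]
            if (i, j) in relative_poses:
                R_ij, t_ij = relative_poses[(i, j)]
            else:                                  # only (j, i) was stored: invert it
                R_ji, t_ji = relative_poses[(j, i)]
                R_ij, t_ij = R_ji.T, -R_ji.T @ t_ji
            poses[j] = (R_ij @ R_i, R_ij @ t_i + t_ij)
        return poses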

Omnidirectional video acquisition:
Based on these tools, we have built a complete dataset that we share on the following website: https://project.inria.fr/ftv360. The dataset is made of several captures, each involving the following steps. (i) We position a number of omnidirectional cameras (typically 40) in a scene, with a distance between neighboring cameras of 1 m to 3 m. (ii) We record one or several calibration sequences, in which a chessboard pattern is moved through the scene. The recorded videos are then used to estimate the calibration parameters with our proposed algorithm. (iii) We record several sequences with the same camera arrangement (and thus the same calibration parameters). In each sequence, a scene (1 min to 4 min) is acquired by all the synchronized cameras. Our dataset is made of two different captures, with 8 different sequences in total (each consisting of 40 synchronized videos). The calibration parameters are shared along with the calibration toolkit described above.
These data can serve for the development of graph-based view synthesis algorithms, in order to eventually enable full 6DoF navigation.
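For instance, building such graphs on the sphere from the equirectangular videos of the dataset starts with the standard mapping from pixel coordinates to unit-sphere directions; a minimal helper (illustrative only, not part of the released toolkit) is sketched below.

    # Map equirectangular pixel coordinates to 3D unit vectors on the sphere
    # (standard equirectangular parameterization; illustrative helper only).
    import numpy as np

    def equirectangular_to_sphere(u, v, width, height):
        lon = (u + 0.5) / width * 2.0 * np.pi - np.pi    # longitude in [-pi, pi)
        lat = np.pi / 2.0 - (v + 0.5) / height * np.pi   # latitude in [-pi/2, pi/2]
        return np.stack([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)], axis=-1)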

Graph-based compression of omnidirectional images:
Omnidirectional images are often mapped to a planar domain. A commonly used planar representation is the equirectangular one, which corresponds to a non-uniform sampling pattern on the spherical surface. This particularity is not exploited by traditional image compression schemes, which treat the input signal as a classical perspective image. In this work, we have built a graph-based coder adapted to the spherical surface, with a graph constructed directly on the sphere. Then, to keep the graph transforms computationally feasible, we have proposed a rate-distortion optimized graph partitioning algorithm that achieves an effective trade-off between the distortion of the reconstructed signals, the smoothness of the signal on each subgraph, and the cost of coding the graph partitioning description. Experimental results demonstrate that our method outperforms JPEG coding of planar equirectangular images.
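The trade-off that drives the partitioning can be illustrated with the simplified Lagrangian cost below (a sketch for exposition only; the exact cost, transform and coding scheme used in the paper differ, and the rate term here is a crude proxy).

    # Simplified illustration of the rate-distortion cost behind the graph
    # partitioning: each subgraph gets its own graph Fourier transform (GFT),
    # the coefficients are quantized, and a Lagrangian cost D + lambda * R
    # balances reconstruction distortion against the bits spent on the
    # coefficients and on describing the partition itself.
    import numpy as np

    def subgraph_gft(adjacency):
        """Eigenvectors of the combinatorial Laplacian = graph Fourier basis."""
        laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
        _, basis = np.linalg.eigh(laplacian)
        return basis

    def lagrangian_cost(signal, adjacency, q_step, partition_bits, lam):
        basis = subgraph_gft(adjacency)
        coeffs = basis.T @ signal
        quantized = np.round(coeffs / q_step) * q_step
        distortion = np.sum((signal - basis @ quantized) ** 2)
        rate = np.count_nonzero(quantized) + partition_bits   # crude rate proxy
        return distortion + lam * rate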

Geometry-aware convolutional filters for omnidirectional image representation:
Due to their wide field of view, omnidirectional cameras are frequently used by autonomous vehicles, drones and robots for navigation and other computer vision tasks. The images captured by such cameras are often analyzed and classified with techniques designed for planar images, which unfortunately fail to properly handle the native geometry of such images. This results in suboptimal performance and a lack of truly meaningful visual features. This work aims at improving popular deep convolutional neural networks so that they properly take into account the specific properties of omnidirectional data. In particular, we have proposed an algorithm that adapts convolutional layers, which often serve as a core building block of a CNN, to the properties of omnidirectional images. Thus, our filters have a shape and size that adapt to the location on the omnidirectional image. We have shown that our method achieves better results than existing deep neural network techniques for omnidirectional image classification. Finally, we have shown that our method is not limited to spherical surfaces and is able to incorporate knowledge about any kind of omnidirectional geometry inside the deep learning network.
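The idea of location-adaptive filters can be illustrated with the toy sketch below (for exposition only; the actual method defines the filters from the geometry of the sphere rather than by simply stretching a planar kernel). Near the poles of an equirectangular image, the horizontal sampling step of the kernel is widened by roughly 1/cos(latitude), so that the filter support covers an approximately constant area on the sphere at every image row.

    # Toy geometry-aware convolution on an equirectangular image: the kernel's
    # horizontal support is stretched by ~1/cos(latitude) so that it covers a
    # roughly constant area on the sphere at every row (illustration only).
    import numpy as np

    def adaptive_conv2d(image, kernel):
        height, width = image.shape
        kh, kw = kernel.shape
        out = np.zeros_like(image, dtype=float)
        for v in range(height):
            lat = np.pi / 2.0 - (v + 0.5) / height * np.pi
            step = max(1, int(round(1.0 / max(np.cos(lat), 1e-3))))  # wider near poles
            for u in range(width):
                acc = 0.0
                for dv in range(kh):
                    for du in range(kw):
                        vv = np.clip(v + dv - kh // 2, 0, height - 1)
                        uu = (u + (du - kw // 2) * step) % width     # wrap longitude
                        acc += kernel[dv, du] * image[vv, uu]
                out[v, u] = acc
        return out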

Visual distortions in 360-degree videos:
Current approaches for capturing, processing, delivering, and displaying 360-degree content present many open technical challenges and introduce several types of distortions in the visual signals. These distortions are specific to the nature of 360-degree images and often different from those encountered in the classical image communication framework. We have provided a first comprehensive review of the most common visual distortions that alter 360-degree signals undergoing state-of-the-art processing in common applications. While their impact on viewers' visual perception and on the immersive experience at large is still unknown, and thus remains an open research topic, this review serves the purpose of identifying the main causes of visual distortions in the end-to-end 360-degree content distribution pipeline. It is essential as a basis for benchmarking different processing techniques, allowing the effective design of new algorithms and applications. It is also necessary for the deployment of proper psychovisual studies to characterise the human perception of these new images in interactive and immersive applications.

Production

M. Rizkallah, F. De Simone, T. Maugey, C. Guillemot, P. Frossard, Rate Distortion Optimized Graph Partitioning for Omnidirectional Image Coding, EUSIPCO, Athens, Greece, Sept. 2018. Best Student Paper Award.

M. Rizkallah, F. De Simone, T. Maugey, C. Guillemot, P. Frossard, Rate Distortion Optimized Graph Partitioning for Omnidirectional Image Coding, Graph Signal Processing Workshop, Lausanne, Switzerland, Jun. 2018.

R. Ma, T. Maugey, P. Frossard, Optimized Data Representation for Interactive Multiview Navigation, IEEE Transactions on Multimedia, 2018.