[Apr 2019] New paper accepted at the ACM Multimedia Systems conference

Title: FTV360: a Multiview 360° Video Dataset with Calibration Parameters

Authors: Thomas Maugey, Laurent Guillo, Cédric Le Cam

Abstract: In this paper, we present a new dataset to support research on Free Viewpoint Television (FTV) and 6 degrees-of-freedom (6DoF) immersive communication. This dataset relies on a novel acquisition procedure consisting of the synchronized capture of a scene by 40 omnidirectional cameras. We have also developed a calibration solution that estimates the position and orientation of each camera with respect to a common reference. This solution relies on the regular calibration of each individual camera and a graph-based synchronization of all these parameters. The videos and the calibration solution are made publicly available.
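
As an illustration of the graph-based synchronization step, here is a minimal Python sketch that expresses every camera pose in the frame of a chosen reference by composing pairwise relative transforms along a spanning tree of the camera graph. All names are hypothetical, and the released calibration solution may refine or average these estimates rather than follow a single tree.

```python
import numpy as np
from collections import deque

def chain_poses(pairwise, n_cams, ref=0):
    """Hedged sketch of graph-based pose synchronization.

    pairwise maps (i, j) to a 4x4 transform T_ij such that a point
    expressed in camera j's frame becomes T_ij @ point in camera i's
    frame.  Every camera is expressed in the frame of `ref` by BFS
    composition along a spanning tree of the (assumed connected) graph.
    """
    adj = {k: [] for k in range(n_cams)}
    for (i, j), T in pairwise.items():
        adj[i].append((j, T))
        adj[j].append((i, np.linalg.inv(T)))   # edges are bidirectional
    world = {ref: np.eye(4)}                   # reference camera at identity
    queue = deque([ref])
    while queue:
        i = queue.popleft()
        for j, T_ij in adj[i]:
            if j not in world:
                world[j] = world[i] @ T_ij     # compose along the tree edge
                queue.append(j)
    return world                               # camera index -> 4x4 pose
```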

[Feb 2019] One ICASSP paper accepted

Authors: F. Nasiri, N. Mahmoudian-Bigdoli, F. Payan, T. Maugey
Title: A geometry-aware framework for compressing 3D mesh textures, accepted in IEEE ICASSP 2019
Abstract: In this paper, we propose a novel prediction tool for improving the compression performance of texture atlases. This algorithm, called Geometry-Aware (GA) intra coding, takes advantage of the topology of the associated 3D meshes in order to reduce the redundancies in the texture map. For texture processing, the general concept of conventional intra prediction, as used in video compression, has been adapted to utilize neighboring information on the 3D surface. We have also studied how this prediction tool can be integrated into a complete coding solution. In particular, a new block scanning strategy, as well as a graph-based transform for residual coding, have been proposed. Experimental results show that knowledge of the mesh topology can significantly improve the compression efficiency of texture atlases.
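
To make the graph-based residual transform concrete, below is a minimal sketch of a graph Fourier transform on a residual block. The paper derives its graph from the mesh topology; a uniform 4-connected pixel grid is used here purely to show the mechanics, and all names are illustrative.

```python
import numpy as np

def grid_gft(residual):
    """Project a residual block onto the eigenvectors of the Laplacian
    of its 4-connected pixel grid (a generic graph Fourier transform)."""
    h, w = residual.shape
    n = h * w
    L = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            u = r * w + c
            neighbours = []
            if c + 1 < w:
                neighbours.append(u + 1)       # right neighbour
            if r + 1 < h:
                neighbours.append(u + w)       # bottom neighbour
            for v in neighbours:
                L[u, u] += 1; L[v, v] += 1     # degree terms
                L[u, v] -= 1; L[v, u] -= 1     # adjacency terms
    _, U = np.linalg.eigh(L)                   # eigenvectors, low to high frequency
    coeffs = U.T @ residual.ravel()            # forward transform
    return coeffs, U                           # residual == U @ coeffs (up to fp error)
```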

[Oct 2018] Our dataset available

Our dataset for Free Viewpoint Television is now available.

For details, to download the videos, and to obtain the source code used for calibration, please visit our website: https://project.inria.fr/ftv360/


If you use any material (videos, code) shared on this website for research purposes, please cite the following paper:

Thomas Maugey, Laurent Guillo, Cédric Le Cam, FTV360: a Multiview 360° Video Dataset with Calibration Parameters, ACM Multimedia Systems Conference (MMSys), June 2019.

[Nov 2017] New journal paper accepted in IEEE TMM

Authors: R. Ma, T. Maugey, P. Frossard

Title: Optimized Data Representation for Interactive Multiview Navigation, accepted in IEEE Transactions on Multimedia, 2017

Abstract: Contrary to traditional media streaming services, where a unique media content is delivered to different users, interactive multiview navigation applications enable users to choose their own viewpoints and freely navigate in a 3-D scene. The interactivity brings new challenges in addition to the classical rate-distortion trade-off, which considers only compression performance and viewing quality. On the one hand, interactivity necessitates sufficient viewpoints for richer navigation; on the other hand, it requires low bandwidth and delay costs for smooth navigation during view transitions. In this paper, we formally describe the novel trade-offs posed by navigation interactivity and the classical rate-distortion criterion. Based on an original formulation, we look for the optimal design of the data representation by introducing novel rate and distortion models and practical solving algorithms. Experiments show that the proposed data representation method outperforms the baseline solution by providing lower resource consumption and higher visual quality in all navigation configurations, which confirms the potential of the proposed data representation in practical interactive navigation systems.
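
To make the interactivity/rate/distortion tension tangible, here is a toy selection problem: choose a subset of coded views that covers every navigation segment within a rate budget while minimizing total distortion. The exhaustive search and the data model are placeholders, not the paper's rate and distortion models.

```python
from itertools import combinations

def select_views(views, rate_budget):
    """views: list of (rate, distortion, covered_segments) tuples, where
    covered_segments is the set of navigation segments the view supports.
    Returns the lowest-distortion feasible subset (as indices) and its cost."""
    all_segments = set().union(*(v[2] for v in views))
    best, best_dist = None, float("inf")
    for size in range(1, len(views) + 1):
        for subset in combinations(range(len(views)), size):
            rate = sum(views[k][0] for k in subset)
            covered = set().union(*(views[k][2] for k in subset))
            if rate <= rate_budget and covered == all_segments:
                dist = sum(views[k][1] for k in subset)
                if dist < best_dist:
                    best, best_dist = subset, dist
    return best, best_dist

# e.g. select_views([(3, 1.0, {"A"}), (2, 2.0, {"A", "B"}), (4, 0.5, {"B"})], 6)
```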

[Jul 2017] New journal paper accepted in IEEE TIP

Authors: C. Verleysen, T. Maugey, C. De Vleeschouwer, P. Frossard

Title: Wide baseline image-based rendering based on shape prior regularisation, accepted in IEEE Transactions on Image Processing, 2017

Abstract: We consider the synthesis of intermediate views of an object captured by two widely spaced and calibrated cameras. This problem is challenging because foreshortening effects and occlusions induce significant differences between the reference images when the cameras are far apart, which makes the association or disappearance/appearance of their pixels difficult to estimate. Our main contribution lies in disambiguating this ill-posed problem by making the interpolated views consistent with a plausible transformation of the object silhouette between the reference views. This plausible transformation is derived from an object-specific prior that consists of a nonlinear shape manifold learned from multiple previous observations of this object by the two reference cameras. The prior is used to estimate how the epipolar silhouette segments observed in the reference views evolve between those views. This information directly supports the definition of epipolar silhouette segments in the intermediate views, and the synthesis of textures in those segments. It permits the reconstruction of the Epipolar Plane Images (EPIs) and the continuum of views associated with the Epipolar Plane Image Volume, obtained by aggregating the EPIs. Experiments on synthetic and natural images show that our method preserves the object topology in intermediate views and deals effectively with the self-occluded regions and the severe foreshortening effect associated with wide-baseline camera configurations.
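
For readers less familiar with EPIs: when views are captured along a straight, rectified camera path, stacking the same scanline of every view produces an Epipolar Plane Image in which each scene point traces a straight line whose slope encodes its depth. The sketch below builds such a slice from a dense rectified stack; the paper's contribution is precisely to recover this structure in the wide-baseline case, where such stacks are unavailable.

```python
import numpy as np

def epipolar_plane_image(views, row):
    """Stack scanline `row` of each view (ordered by camera position):
    the result is an N_views x W (x channels) slice of the EPI volume."""
    return np.stack([v[row] for v in views], axis=0)

# e.g. epi = epipolar_plane_image(views, row=120)
```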

[Jul 2017] MMSP paper accepted

Authors: Thomas Maugey, Olivier Le Meur, Zhi Liu
Title: Saliency-based navigation in omnidirectional image, accepted in IEEE MMSP 2017
Abstract: Omnidirectional images describe the color information at a given position from all directions. Affordable 360° cameras have recently been developed, leading to an explosion of 360° data shared on social networks. However, an omnidirectional image does not contain interesting content everywhere. Some parts of the image are indeed more likely to be looked at by users than others. Knowing these regions of interest can be useful for 360° image compression, streaming, retargeting or even editing. In this paper, a new approach based on 2D image saliency is proposed both to model user navigation within a 360° image, and to detect which parts of omnidirectional content might draw users’ attention.
Website: http://people.irisa.fr/Olivier.Le_Meur/publi/2017_MMSP/index.html
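
The modeling operates on planar viewports cut out of the sphere, to which 2D saliency tools can then be applied. Below is a hedged sketch of the standard gnomonic (perspective) viewport extraction from an equirectangular image; this is generic 360° geometry with nearest-neighbour sampling, not code from the paper, and the angle conventions are assumptions.

```python
import numpy as np

def viewport(equi, lon0, lat0, fov, out_h, out_w):
    """Extract an out_h x out_w perspective view centred at (lon0, lat0)
    radians from an equirectangular image equi (H x W x channels)."""
    H, W = equi.shape[:2]
    f = 0.5 * out_w / np.tan(0.5 * fov)               # pinhole focal length
    j, i = np.meshgrid(np.arange(out_w), np.arange(out_h))
    x = j - 0.5 * out_w                               # rays through each
    y = i - 0.5 * out_h                               # viewport pixel...
    d = np.stack([x, y, np.full_like(x, f)], axis=-1)
    d /= np.linalg.norm(d, axis=-1, keepdims=True)    # ...as unit vectors
    c, s = np.cos(lat0), np.sin(lat0)                 # tilt to lat0
    Rx = np.array([[1, 0, 0], [0, c, s], [0, -s, c]])
    c, s = np.cos(lon0), np.sin(lon0)                 # pan to lon0
    Ry = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    d = d @ (Ry @ Rx).T                               # rotate all rays
    lon = np.arctan2(d[..., 0], d[..., 2])            # back to sphere angles
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    u = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
    return equi[v, u]                                 # nearest-neighbour sample
```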

[Jun 2017] New MMSP submission

Authors: Thomas Maugey, Olivier Le Meur, Zhi Liu
Title: Saliency-based navigation in omnidirectional image, submitted to IEEE MMSP 2017

[Mar 2017] New paper accepted in IEEE TIP

Authors: Xin Su, Thomas Maugey, Christine Guillemot

Title: Rate-distortion optimized graph-based representation for multiview images with complex camera configurations, accepted in IEEE Transactions on Image Processing, 2017

Abstract: Graph-Based Representation (GBR) has recently been proposed for describing the color and geometry of multiview video content. The graph vertices represent the color information, while the edges represent the geometry information, i.e., the disparity, by connecting corresponding pixels in two camera views. In this paper, we generalize the GBR to multiview images with complex camera configurations. Compared with the existing GBR, the proposed representation can handle not only horizontal displacements of the cameras but also forward/backward translations, rotations, etc. However, contrary to the usual disparity, which is a 2-dimensional vector (denoting horizontal and vertical displacements), each edge in GBR is represented by a one-dimensional disparity. This quantity can be seen as the disparity along an epipolar segment. In order to have a sparse (i.e., easy to code) graph structure, we propose a rate-distortion model to select the most meaningful edges. Hence, the graph is constructed with "just enough" information for rendering the given predicted view. The experiments show that the proposed GBR allows high reconstruction quality at a lower or equivalent coding rate compared with traditional depth-based representations.
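
The one-dimensional disparity can be illustrated with textbook epipolar geometry: given the fundamental matrix, the correspondent of a pixel is constrained to a line in the other view, so a single abscissa along that line suffices to locate it. A minimal sketch follows; the parameterization (anchored at the line's closest point to the image origin) is an assumption, not necessarily the paper's exact convention.

```python
import numpy as np

def point_on_epipolar_line(F, p, t):
    """Given fundamental matrix F and pixel p=(x, y) in view 1, return the
    candidate match in view 2 at 1-D abscissa t along the epipolar line."""
    a, b, c = F @ np.array([p[0], p[1], 1.0])    # line: a*x + b*y + c = 0
    n2 = a * a + b * b
    anchor = np.array([-a * c, -b * c]) / n2     # closest point to the origin
    direction = np.array([b, -a]) / np.sqrt(n2)  # unit vector along the line
    return anchor + t * direction
```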

[Feb 2017] New SPL submission

Authors: Thomas Maugey, Xin Su, Christine Guillemot
Title: Reference camera selection for virtual view synthesis, submitted to IEEE Signal Processing Letters
Abstract: View synthesis using image-based rendering algorithms relies on one or more reference images, which have to be as close as possible to the virtual view being generated. The notion of “closeness” is straightforward when the virtual view is parallel to the reference ones: the geometrical transformation between the cameras is a simple translation, whose amplitude can naturally be measured by a norm. However, we show in this paper that when the camera trajectory becomes general (i.e., translation and rotation are involved), no intuitive distance metric exists. In that case, choosing the best reference camera for view synthesis becomes a difficult problem. Some similarity metrics have been proposed in the literature, but they depend on the scene and are thus complex to compute. In this paper, we propose a distance metric that relies only on the camera parameters and is thus very simple to compute. We then use this distance to formulate and solve a reference camera selection problem in a general camera configuration. The obtained results show that our distance leads to an efficient and accurate choice of reference views compared with a “naive” Euclidean distance between camera parameters.
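
As a toy illustration of the selection problem (deliberately not the metric proposed in the letter), one can score each reference camera by a weighted sum of the distance between optical centres and the geodesic angle between orientations, then keep the minimizer. The weight alpha and the world-to-camera [R|t] convention are assumptions.

```python
import numpy as np

def pose_distance(R1, t1, R2, t2, alpha=1.0):
    """Toy pose distance: optical-centre gap plus weighted rotation angle.
    R, t follow the world-to-camera convention, so the centre is -R.T @ t."""
    c1, c2 = -R1.T @ t1, -R2.T @ t2
    translation = np.linalg.norm(c1 - c2)
    cos_angle = (np.trace(R1 @ R2.T) - 1.0) / 2.0
    rotation = np.arccos(np.clip(cos_angle, -1.0, 1.0))  # geodesic on SO(3)
    return translation + alpha * rotation

def best_reference(virtual, references, alpha=1.0):
    """Index of the reference pose (R, t) closest to the virtual pose."""
    Rv, tv = virtual
    return min(range(len(references)),
               key=lambda k: pose_distance(Rv, tv, *references[k], alpha=alpha))
```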