Authors: Xin Su, Thomas Maugey, Christine Guillemot
Title: Rate-distortion optimized graph-based representation for multiview images with complex camera configurations, accepted in IEEE Transactions on Image Processing, 2017
Abstract: Graph-Based Representation (GBR) has recently been proposed for describing the color and geometry of multiview video content. The graph vertices represent the color information, while the edges represent the geometry information, i.e., the disparity, by connecting corresponding pixels in two camera views. In this paper, we generalize the GBR to multiview images with complex camera configurations. Compared with the existing GBR, the proposed representation can handle not only horizontal displacements of the cameras but also forward/backward translations, rotations, etc. However, unlike the usual disparity, which is a two-dimensional vector (denoting horizontal and vertical displacements), each edge in the GBR carries a one-dimensional disparity. This quantity can be seen as the disparity along an epipolar segment. In order to obtain a sparse (i.e., easy to code) graph structure, we propose a rate-distortion model to select the most meaningful edges. The graph is thus constructed with "just enough" information for rendering the given predicted view. The experiments show that the proposed GBR achieves high reconstruction quality at a lower or equivalent coding rate than traditional depth-based representations.
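To make the core idea concrete, here is a minimal, hypothetical sketch of a GBR on a single scanline: vertices carry reference colors and each edge carries a one-dimensional disparity used to warp the pixel into the predicted view. The 1-D scene, function names, and values are illustrative only, not the paper's implementation.

```python
# Toy Graph-Based Representation (GBR) sketch: vertices = colors,
# edges = one-dimensional disparities along an (assumed) epipolar segment.

def build_gbr(ref_row, disparities):
    """Vertices hold pixel colors of a reference scanline; each vertex's
    edge holds the 1-D disparity linking it to the predicted view."""
    return [{"color": c, "disparity": d}
            for c, d in zip(ref_row, disparities)]

def render_from_gbr(graph, width):
    """Warp reference colors into the predicted view via edge disparities."""
    target = [None] * width
    for x, vertex in enumerate(graph):
        x_target = x + vertex["disparity"]  # move along the epipolar segment
        if 0 <= x_target < width:
            target[x_target] = vertex["color"]
    return target  # positions with no incoming edge remain disoccluded (None)

ref = [10, 20, 30, 40]   # grey levels of one reference scanline
disp = [1, 1, 0, 0]      # 1-D disparities: foreground shifts by one pixel
graph = build_gbr(ref, disp)
print(render_from_gbr(graph, width=4))
```

Note how the warp leaves one position empty (a disocclusion) and lets a nearer pixel overwrite a farther one (an occlusion); handling such pixels efficiently is exactly why edge selection matters.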
Authors: Thomas Maugey, Xin Su, Christine Guillemot
Title: Reference camera selection for virtual view synthesis, submitted to IEEE Signal Processing Letters
Abstract: View synthesis using image-based rendering algorithms relies on one or more reference images. The latter have to be as close as possible to the virtual view that is generated. The notion of "closeness" is straightforward when the virtual view is parallel to the reference ones. Indeed, the geometric transformation between the cameras is then a simple translation, whose amplitude can naturally be measured by a norm. However, we show in this paper that when the camera trajectory becomes general (i.e., both translation and rotation are involved), no intuitive distance metric exists. In that case, choosing the best reference camera for view synthesis becomes a difficult problem. Some similarity metrics have been proposed in the literature, but they depend on the scene content and are thus costly to compute. In this paper, we propose a distance metric that relies only on the camera parameters and is therefore very simple to compute. We then use that distance to formulate and solve a reference camera selection problem in a general camera configuration. The obtained results show that our distance leads to an efficient and accurate choice of the reference views compared to a "naive" Euclidean distance between camera parameters.
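The reference-selection problem can be posed as a simple minimization over candidate cameras. The sketch below uses the "naive" Euclidean baseline the abstract argues against (the paper's own parameter-based metric is not reproduced here); the 4-parameter camera model (3-D center plus a yaw angle) is a hypothetical parameterization chosen for illustration.

```python
import math

# Naive baseline: Euclidean distance between stacked camera parameters.
# Illustrative only; the paper proposes a better parameter-based metric.

def naive_distance(cam_a, cam_b):
    """Euclidean distance between (x, y, z, yaw) parameter vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cam_a, cam_b)))

def select_reference(virtual_cam, candidate_cams, dist=naive_distance):
    """Pick the index of the candidate reference closest to the virtual view."""
    return min(range(len(candidate_cams)),
               key=lambda i: dist(virtual_cam, candidate_cams[i]))

virtual = (0.0, 0.0, 0.0, 0.0)        # (x, y, z, yaw in radians)
candidates = [(2.0, 0.0, 0.0, 0.0),   # translated, same orientation
              (0.5, 0.0, 0.0, 1.5)]   # nearby center but strongly rotated
print(select_reference(virtual, candidates))
```

Here the naive metric picks the strongly rotated camera (index 1) because its center is nearer, even though a rotation of 1.5 rad sees a largely different part of the scene; this is precisely the failure mode that motivates a geometry-aware distance.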
Two papers have been submitted to ICIP 2017:
Authors: Navid Mahmoudian Bidgoli, Thomas Maugey, Aline Roumy
Title: Correlation Model Selection for interactive video communication
Abstract: Interactive video communication has recently been proposed for multi-view videos. In this scheme, the server has to store the views as compactly as possible, while being able to transmit them independently to the users, who are allowed to navigate interactively among the views and hence request only a subset of them. To achieve this goal, the compression must rely on model-based coding, in which the correlation between the predicted view generated on the user side and the original view is modeled by a statistical distribution. In this paper, we propose a framework for lossless coding that selects, among a candidate set of models, the one incurring the lowest extra rate cost to the system. Moreover, when the depth image is available, we provide a method to estimate the correlation model.
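The model-selection criterion can be sketched as follows: for lossless coding, the rate cost of a candidate model is the empirical cross-entropy (bits per symbol) of the prediction residuals under that model, and the best model minimizes it. The candidate distributions (discretized Laplacian and Gaussian) and the residual values below are toy assumptions, not the paper's framework.

```python
import math

def cross_entropy_bits(residuals, pmf):
    """Average codelength (bits/symbol) when coding residuals with model `pmf`."""
    return -sum(math.log2(pmf(r)) for r in residuals) / len(residuals)

# Toy candidate models: continuous densities used as approximate pmfs
# over integer residuals (normalization over the integers is ignored here).
def laplacian_pmf(b):
    return lambda r: (1.0 / (2 * b)) * math.exp(-abs(r) / b)

def gaussian_pmf(sigma):
    return lambda r: math.exp(-r * r / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

residuals = [0, 0, 1, -1, 0, 2, 0, -1, 0, 0]   # sharply peaked around zero
candidates = {"laplacian": laplacian_pmf(b=0.8),
              "gaussian": gaussian_pmf(sigma=1.0)}
best = min(candidates, key=lambda name: cross_entropy_bits(residuals, candidates[name]))
print(best)   # the model with the lowest extra rate cost
```

On this peaked residual set the Laplacian model yields the shorter average codelength, which is why it would be selected.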
Authors: Xin Su, Mira Rizkallah, Thomas Maugey, Christine Guillemot
Title: Graph-based light fields representation and coding using geometry information (webpage)
Abstract: This paper describes a graph-based coding scheme for light fields (LF). It first adapts graph-based representations (GBR) to describe the color and geometry information of the LF. Graph connections describing the scene geometry capture inter-view dependencies. They are used as the support of a weighted Graph Fourier Transform (wGFT) to encode disoccluded pixels. The quality of the LF reconstructed from the graph is enhanced by adding extra color information to the representation for a subset of sub-aperture images. Experiments show that the proposed scheme yields rate-distortion gains compared with HEVC-based compression (directly compressing the LF as a video sequence with HEVC).
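For readers unfamiliar with the wGFT, here is a minimal sketch on a toy 4-node weighted graph: the transform basis is the eigenbasis of the graph Laplacian built from the edge weights. The graph, weights, and signal are illustrative assumptions, not the geometry-derived connections of the paper's LF pipeline.

```python
import numpy as np

# Minimal weighted Graph Fourier Transform (wGFT) on a toy path graph.
W = np.array([[0, 1, 0, 0],      # symmetric edge weights
              [1, 0, 2, 0],
              [0, 2, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))       # degree matrix
L = D - W                        # combinatorial graph Laplacian

# L is symmetric, so eigh returns an orthonormal eigenbasis: the wGFT basis.
eigvals, U = np.linalg.eigh(L)

signal = np.array([3.0, 3.1, 2.9, 3.0])   # smooth signal on the graph
coeffs = U.T @ signal                      # forward wGFT
reconstructed = U @ coeffs                 # inverse wGFT

# A smooth signal concentrates its energy in the low-frequency
# (small-eigenvalue) atoms, which is what makes the coefficients cheap to code.
print(np.allclose(reconstructed, signal))
```

The orthonormality of the basis makes the transform perfectly invertible, and the energy compaction on smooth graph signals is what the scheme exploits when coding disoccluded pixels.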
The following paper has been accepted for presentation at EUSIPCO 2016 in Budapest, Hungary.
Authors: M. Rizkallah, T. Maugey, C. Yaacoub, C. Guillemot
Title: Impact of Light Field Compression on Focus Stack and Extended Focus Images
Abstract: Light fields, capturing all light rays at every point in space and in all directions, contain very rich information about the scene. This rich description enables advanced image creation capabilities, such as refocusing or extended depth of field from a single capture. However, it yields a very high volume of data that needs compression. This paper studies the impact of light field compression on two key functionalities: refocusing and extended focus. The sub-aperture images forming the light field are compressed as a video sequence with HEVC. A focus stack and the scene depth map are computed from the compressed light field and are used to render an image with an extended depth of field (called the extended focus image). It is first observed that the light field can be compressed by a factor of up to 700 without significantly affecting the visual quality of either the refocused or the extended focus images. To further analyze the compression effect, a dedicated quality evaluation method based on contrast and gradient measurements is used to differentiate the natural geometric blur from the blur resulting from compression. The second part of the experiments shows that the texture distortion of the in-focus regions in the focus stacks is the main cause of quality degradation in the extended focus, and that depth errors do not impact the extended focus quality unless the light field is significantly distorted, at a compression ratio of around 2000:1.