[Jul 21] New TCSVT accepted

Title: Immersive Video Coding: Should Geometry Information be Transmitted as Depth Maps?

Authors: P. Garus, F. Henry, J. Jung, T. Maugey, C. Guillemot

Abstract: Immersive video often refers to multiple views with texture and scene geometry information, from which different viewports can be synthesized on the client side. To design efficient immersive video coding solutions, it is desirable to minimize bitrate, pixel rate and complexity. We investigate whether the classical approach of sending the geometry of a scene as depth maps is appropriate to serve this purpose. Previous work shows that bypassing depth transmission entirely and estimating depth at the client side improves the synthesis performance while saving bitrate and pixel rate. In order to understand whether the encoder-side depth maps contain information that is beneficial to transmit, we first explore a hybrid approach which enables partial depth map transmission using a block-based rate-distortion (RD) decision in the depth coding process. This approach reveals that partial depth map transmission may improve the rendering performance but does not offer a good compromise in terms of compression efficiency. This led us to address the remaining drawbacks of decoder-side depth estimation: complexity and depth map inaccuracy. We propose a novel system that takes advantage of high-quality depth maps at the server side by encoding them into lightweight features that support the depth estimator at the client side. These features reduce the amount of data that has to be handled during decoder-side depth estimation by 88%, which significantly speeds up the cost computation and the energy minimization of the depth estimator. Furthermore, average synthesis BD-rate gains of -46.0% and -37.9% are achieved compared to the classical approach with depth maps estimated at the encoder.
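As a rough illustration of the block-level decision mentioned above, the sketch below compares a Lagrangian cost with and without depth transmission for one block; every function used here (encode_depth, estimate_depth, synthesis_distortion) is a placeholder, and the paper's actual decision rule may differ.

    # Hypothetical per-block RD decision: transmit the coded depth block, or skip
    # it and rely on decoder-side depth estimation (rate of the skip mode ~ 0).
    def choose_depth_mode(block, lmbda, encode_depth, estimate_depth, synthesis_distortion):
        coded_depth, rate_bits = encode_depth(block)   # server-side depth, coded
        estimated_depth = estimate_depth(block)        # decoder-side estimate
        j_transmit = synthesis_distortion(coded_depth) + lmbda * rate_bits
        j_skip = synthesis_distortion(estimated_depth)
        return "transmit" if j_transmit < j_skip else "skip"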

[May 21] New TIP journal accepted

Title: Rate-Distortion Optimized Graph Coarsening and Partitioning for Light Field Coding

Authors: M. Rizkallah, T. Maugey, C. Guillemot

Abstract: Graph-based transforms are powerful tools for signal representation and energy compaction. However, their use for high dimensional signals such as light fields poses obvious problems of complexity. To overcome this difficulty, one can consider local graph transforms defined on supports of limited dimension, which may however not allow us to fully exploit long-term signal correlation. In this paper, we present methods to optimize local graph supports in a rate-distortion sense for efficient light field compression. A large graph support can be well adapted for compression efficiency, however at the expense of high complexity. In this case, we use graph reduction techniques to make the graph transform feasible. We also consider spectral clustering to reduce the dimension of the graph supports while controlling both rate and complexity. We derive the distortion and rate models which are then used to guide the graph optimization. We describe a complete light field coding scheme based on the proposed graph optimization tools. Experimental results show rate-distortion performance gains compared to the use of fixed graph supports. The method also provides competitive results when compared against HEVC-based and the JPEG Pleno light field coding schemes. We also assess the method against a homography-based low rank approximation and a Fourier disparity layer based coding method.
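For readers unfamiliar with graph-based transforms, here is a toy graph Fourier transform on a tiny support; it only illustrates the basic tool (Laplacian eigenvectors as a transform basis), not the paper's optimized supports, coarsening or partitioning.

    # Toy graph Fourier transform: the eigenvectors of the graph Laplacian form
    # the basis used for energy compaction on a local support.
    import numpy as np

    W = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)    # adjacency of a 4-pixel support
    L = np.diag(W.sum(axis=1)) - W               # combinatorial Laplacian
    evals, U = np.linalg.eigh(L)                 # GFT basis (columns of U)
    x = np.array([100.0, 102.0, 101.0, 103.0])   # pixel values on the support
    coeffs = U.T @ x                             # coefficients to quantize and code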

[Jan 21] New ICASSP paper accepted

Title: Rate-distortion optimized motion estimation for on-the-sphere compression of 360 videos

Authors: Alban Marie, Navid Mahmoudian Bidgoli, Thomas Maugey, Aline Roumy

Abstract: On-the-sphere compression of omnidirectional videos is a very promising approach. First, it saves computational complexity, as it avoids projecting the sphere onto a 2D map, as is classically done. Second, and more importantly, it achieves a better rate-distortion tradeoff, since neither the visual data nor its domain of definition are distorted. In this paper, on-the-sphere compression of omnidirectional still images is extended to videos. We first propose a complete review of existing spherical motion models. Then we propose a new one called tangent-linear+t. We finally propose a rate-distortion optimized algorithm to locally choose the best motion model for efficient motion estimation/compensation. For that purpose, we additionally propose a finer search pattern for the motion parameters, called spherical-uniform, which leads to a more accurate block prediction. The novel algorithm yields rate-distortion gains compared to methods based on a single motion model.
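The local model choice described above boils down to a per-block Lagrangian comparison; the sketch below shows that structure with placeholder motion-search routines (the candidate models and their rate/distortion measures are not spelled out here).

    # Hypothetical RD-based choice of the spherical motion model for one block:
    # run each candidate model's motion search and keep the lowest cost D + lambda*R.
    def select_motion_model(block, candidate_models, lmbda):
        best = None
        for model in candidate_models:                # e.g. rotation, tangent-linear+t, ...
            params, dist, rate = model.search(block)  # placeholder motion search
            cost = dist + lmbda * rate
            if best is None or cost < best[0]:
                best = (cost, model, params)
        return best[1], best[2]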


[Jan 21] New IEEE Com letter accepted

Title: Bit-Plane Coding in Extractable Source Coding: optimality, modeling, and application to 360° data

Authors: Fangping Ye, Navid Mahmoudian Bidgoli, Elsa Dupraz, Aline Roumy, Karine Amis, Thomas Maugey

Abstract: In extractable source coding, multiple correlated sources are jointly compressed but can be individually accessed in the compressed domain. Performance is measured in terms of storage and transmission rates. This problem has multiple applications in interactive video compression, such as Free Viewpoint Television or navigation in 360° videos. In this paper, we analyze and improve a practical coding scheme. We consider a binarized coding scheme, which ensures a low decoding complexity. First, we show that binarization does not impact the transmission rate and only slightly affects the storage rate with respect to a symbol-based approach. Second, we propose a Q-ary symmetric model to represent the pairwise joint distribution of the sources instead of the widely used Laplacian model. Third, we introduce a novel pre-estimation strategy, which allows inferring the symbols of some bit planes without any additional data and therefore reduces the storage and transmission rates. In the context of 360° images, the proposed scheme saves 14% and 34% bitrate in storage and transmission rates, respectively.
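To make the two main ingredients concrete, the snippet below shows a bit-plane decomposition and the Q-ary symmetric correlation model P(y|x) = 1 - p if y = x and p/(Q-1) otherwise; it is an illustration only, not the paper's coding scheme.

    import numpy as np

    def bit_planes(symbols, num_bits):
        # plane 0 holds the most significant bits
        return [(symbols >> (num_bits - 1 - b)) & 1 for b in range(num_bits)]

    def q_ary_symmetric(x, y, Q, p):
        # probability of observing y given x under the Q-ary symmetric model
        return 1.0 - p if x == y else p / (Q - 1)

    symbols = np.array([5, 3, 7, 0])                 # Q = 8, hence 3 bit planes
    planes = bit_planes(symbols, num_bits=3)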


[Sep 20] New IEEE SP Letter accepted

Title: Large Database Compression Based on Perceived Information

Authors: Thomas Maugey and Laura Toni

Abstract: Lossy compression algorithms trade bits for quality, aiming to reduce as much as possible the bitrate needed to represent the original source (or set of sources), while preserving the source quality. In this letter, we propose a novel paradigm of compression algorithms, aimed at minimizing the information loss perceived by the final user instead of the actual source quality loss, under compression rate constraints. As main contributions, we first introduce the concept of perceived information (PI), which reflects the information perceived by a given user experiencing a data collection, and which is evaluated as the volume spanned by the sources' features in a personalized latent space. We then formalize the rate-PI optimization problem and propose an algorithm to solve this compression problem. Finally, we validate our algorithm against benchmark solutions with simulation results, showing the gain of taking users' preferences into account while also maximizing the perceived information in the feature domain.
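The letter evaluates PI as the volume spanned by the source features in a personalized latent space; one common proxy for such a volume is the log-determinant of the features' Gram matrix, sketched below under that assumption (the weighting scheme and the exact PI definition here are illustrative, not the letter's).

    # Illustrative "volume spanned by features": log-determinant of the Gram
    # matrix of user-weighted latent features (a proxy, not the exact PI).
    import numpy as np

    def perceived_information(features, user_weights, eps=1e-9):
        F = features * user_weights                  # personalize the latent space (assumption)
        gram = F @ F.T
        return 0.5 * np.linalg.slogdet(gram + eps * np.eye(len(F)))[1]

    F = np.random.rand(5, 16)                        # 5 sources, 16-D latent features
    w = np.ones(16); w[:4] = 2.0                     # toy user preference weighting
    print(perceived_information(F, w))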


[Aug 20] New journal article accepted in Annals of Telecommunications

Title: Excess rate for model selection in interactive compression using Belief-propagation decoding

Authors: Navid Mahmoudian-Bidgoli, Thomas Maugey, Aline Roumy

Abstract: Interactive compression refers to the problem of compressing data while sending only the part requested by the user. In this context, the challenge is to perform the extraction directly in the compressed domain. Theoretical results exist, but they assume that the true distribution is known. In practical scenarios, however, the distribution must be estimated. In this paper, we first formulate the model selection problem for interactive compression and show that it requires estimating the excess rate incurred by mismatched decoding. Then, we propose a new expression to evaluate the excess rate of mismatched decoding in a practical case of interest: when the decoder is the belief-propagation algorithm. We also propose a novel experimental setup to validate this closed-form formula. We show a good match for practical interactive compression schemes based on fixed-length Low-Density Parity-Check (LDPC) codes. This new formula is of great importance for model and rate selection.
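For intuition only: with ideal lossless coding, compressing a source with distribution p using a mismatched model q costs an excess rate of D(p||q) bits per symbol; the paper's contribution is a different, belief-propagation-specific expression, which is not reproduced here.

    # Classical excess rate of mismatched lossless coding: D(p || q) bits/symbol.
    # (Ideal-coding baseline, not the paper's belief-propagation expression.)
    import numpy as np

    def excess_rate(p, q):
        p, q = np.asarray(p, float), np.asarray(q, float)
        mask = p > 0
        return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

    print(excess_rate([0.5, 0.5], [0.8, 0.2]))       # about 0.32 extra bits per symbol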

[Jun 20] New journal article accepted in IEEE TMM

Title: Fine granularity access in interactive compression of 360-degree images based on rate adaptive channel codes

Authors: N. Mahmoudian-Bidgoli, T. Maugey, A. Roumy

Abstract: In this paper, we propose a new interactive compression scheme for omnidirectional images. This requires two characteristics: efficient compression of the data, to lower the storage cost, and random access ability to extract the part of the compressed stream requested by the user, to reduce the transmission rate. For efficient compression, data needs to be predicted by a series of references that have been pre-defined and compressed, which contrasts with the spirit of random accessibility. We propose a solution to this problem based on incremental codes implemented by rate-adaptive channel codes. This scheme encodes the image while adapting to any user request and leads to an efficient coding that is flexible in extracting data depending on the information available at the decoder. Therefore, only the information needed for display at the user's side is transmitted during the user's request, as if the request had already been known at the encoder. The experimental results demonstrate that our coder achieves a better transmission rate than the state-of-the-art tile-based methods at a small cost in storage. Moreover, the transmission rate grows gradually with the size of the request and avoids a staircase effect, which shows the suitability of our coder for interactive transmission.
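The incremental behaviour described above can be pictured as sending stored syndrome chunks until the client's side information suffices for decoding; the loop below is only a sketch with placeholder functions, not the actual rate-adaptive channel code.

    # Sketch of rate-adaptive incremental transmission: pre-computed syndrome
    # chunks are sent until decoding succeeds, so the transmitted rate matches
    # the side information actually available at the client.
    def serve_request(stored_syndrome_chunks, side_information, try_decode):
        sent = []
        for chunk in stored_syndrome_chunks:         # computed once, stored on the server
            sent.append(chunk)
            decoded = try_decode(sent, side_information)
            if decoded is not None:                  # decoding succeeded at this rate
                return decoded, len(sent)
        raise RuntimeError("stored rate insufficient for this side information")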

[Jun 20] New journal article accepted in IEEE T-Com

Title: Optimal reference selection for random access in predictive coding schemes

Authors: M. Q. Pham, A. Roumy, T. Maugey, E. Dupraz, M. Kieffer

Abstract: Data acquired over long periods of time, like High Definition (HD) videos or records from a sensor over long time intervals, have to be efficiently compressed to reduce their size. The compression must also allow efficient access to random parts of the data upon request from the users. Efficient compression is usually achieved with prediction between data points at successive time instants. However, this creates dependencies between the compressed representations, which is contrary to the idea of random access. Prediction methods rely in particular on reference data points, used to predict other data points, and the placement of these references balances compression efficiency and random access. Existing solutions position the references with ad hoc methods. In this paper, we study this joint problem of compression efficiency and random access. We introduce the storage cost as a measure of the compression efficiency and the transmission cost for the random access ability. We show that the reference placement problem that trades off storage with transmission cost is an integer linear programming problem that can be solved by a standard optimizer. Moreover, we show that the classical periodic placement of the references is optimal when the encoding costs of each data point are equal and when requests of successive data points are made. In this particular case, a closed-form expression of the optimal period is derived. Finally, the proposed optimal placement strategy is compared with an ad hoc method, in which the references correspond to sources for which prediction does not significantly reduce the encoding cost. The proposed optimal algorithm shows a bit saving of 20% with respect to the ad hoc method.
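As a way to picture the optimization, the toy search below enumerates binary reference placements and keeps the one minimizing a weighted sum of storage and transmission costs; the paper instead solves an integer linear program and derives a closed-form optimal period, neither of which is reproduced here, and both cost functions are placeholders.

    # Toy exhaustive search over reference placements (mask[t] == 1 means data
    # point t is a reference); illustrative only, feasible for small n.
    from itertools import product

    def best_placement(n, storage_cost, transmission_cost, mu):
        best = None
        for mask in product([0, 1], repeat=n):
            if mask[0] != 1:                         # the first data point must be a reference
                continue
            cost = storage_cost(mask) + mu * transmission_cost(mask)
            if best is None or cost < best[0]:
                best = (cost, mask)
        return best[1]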

[Feb 19] New journal article accepted in IEEE TSIPN

Title: Incremental coding for extractable compression in the context of Massive Random Access

Authors: Thomas Maugey, Aline Roumy, Elsa Dupraz, Michel Kieffer

Abstract: In this paper, we study the problem of source coding with Massive Random Access (MRA). A set of correlated sources is encoded once for all and stored on a server, while a large number of clients access various subsets of these sources. Due to the number of concurrent requests, the server is only able to extract a bitstream from the stored data: no re-encoding can be performed before the transmission of the data requested by the clients.
First, we formally define the MRA framework and propose to model the constraints on the way subsets of sources may be accessed by a navigation graph. We introduce both storage and transmission costs to characterize the performance of MRA. We then propose an Incremental coding Based Extractable Compression (IBEC) scheme. We first show that this scheme is optimal in terms of achievable storage and transmission costs. Second, we propose a practical implementation of our IBEC scheme based on rate-compatible LDPC codes. Experimental results show that our IBEC scheme can almost reach the same transmission costs as in traditional point-to-point source coding schemes, while having a reasonable overhead in terms of storage cost.
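A toy way to see the storage/transmission trade-off in extractable coding, with made-up numbers: a source is stored at the rate needed for its worst-case reference in the navigation graph, while a given request only transmits the rate needed for the reference the client actually holds.

    # Toy storage vs transmission costs for decoding source B from either A or C.
    cond_rate = {                                    # bits needed to decode B given a reference
        ("A", "B"): 0.4,
        ("C", "B"): 0.7,
    }
    storage_B = max(cond_rate.values())              # stored once, must cover any allowed path
    transmission_B = cond_rate[("A", "B")]           # this client already holds A
    print(storage_B, transmission_B)                 # 0.7 stored vs 0.4 transmitted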