REP

Objectives

This lecture has two objectives. First, we will study the tools classically used to represent, process and edit images. At the crossroads of mathematics and computer science, these tools will be studied from the perspective of classical image processing problems such as image inpainting or denoising. Second, we will introduce the basics of deep learning for image processing. In this context, we will present computational models of visual perception (e.g. saliency models) that aim to automatically detect the most salient areas of an image.

To summarize, during this course, students will study how images are handled along the whole image/video processing chain: from the way they are represented to the way they are perceived.

Outline

  • Part 0 – Introduction (pdf)
  • Part 1 – Projection models for perspective and spherical cameras (pdf, exercises, paper)
  • Part 2 – Pixel Organisation and Representation (pdf, paper)
  • Part 3 – Color representation (pdf, paper)
  • Part 4 – Transforms and dictionaries (pdf, paper1, paper2)
  • Part 5 – Filtering (pdf, paper)
  • Part 6 – Advanced Filtering (pdf)
  • Part 7 – Basics of deep learning (pdf, paper, questions)
  • Part 8 – Inpainting, super-resolution (pdf, paper)
  • Part 9 – Diffusion models, Geometric Deep Learning (pdf)
  • Part 10 – Perception (pdf, paper)

Dates (Fall 2024)

For the rooms, see ADE.

  • Lecture 1: 16/09 (9h45-11h15)
  • Lecture 2: 16/09 (11h30-13h)
  • Lecture 3: 18/09 (9h45-11h15)
  • Lecture 4: 23/09 (9h45-11h15)
  • Lecture 5: 23/09 (11h30-13h)
  • Lecture 6: 25/09 (9h45-11h15)
  • Lecture 7: 30/09 (9h45-11h15)
  • Lecture 8: 30/09 (11h30-13h)
  • Lecture 9: 02/10 (9h45-11h15)
  • Lecture 10: 07/10 (9h45-11h15)
  • Lecture 11: 07/10 (11h30-13h)
  • Lecture 12: 16/10 (9h45-11h15)
  • Lecture 13: 21/10 (9h45-11h15)
  • Lecture 14: 21/10 (11h30-13h)

Evaluation (Fall 2024)

The final score will be composed of two marks:

  • Oral Exam: 21/10 (16h45-18h15) – a research paper to read, summarize and present
  • Written exam: 07/11 (16h45-18h15) – documents allowed

Research papers

  • HYPERSPECTRAL: Nus, L., Miron, S., Jaillais, B., Moussaoui, S., & Brie, D. (2020, May). A semi-supervised rank tracking algorithm for on-line unmixing of hyperspectral images. ICASSP 2020.
  • HYPERSPECTRAL: Rodarmel, C., Shan, J. (2002). Principal component analysis for hyperspectral image classification. Surveying and Land Information Science, 62(2), 115-122.
  • DEEP: Moosavi-Dezfooli, S. M., Fawzi, A., Fawzi, O., & Frossard, P. (2017). Universal adversarial perturbations. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1765-1773)
  • RETARGETING: Greisen, P., Lang, M., Heinzle, S., & Smolic, A. (2012, June). Algorithm and VLSI architecture for real-time 1080p60 video retargeting. In Proceedings of the Fourth ACM SIGGRAPH/Eurographics Conference on High-Performance Graphics (pp. 57-66)
  • DCT+DEEP: Nash, C., Menick, J., Dieleman, S., & Battaglia, P. W. (2021). Generating images with sparse representations. arXiv preprint arXiv:2103.03841.
  • DCT+COLOR: Mukherjee, J., & Mitra, S. K. (2008). Enhancement of color images by scaling the DCT coefficients. IEEE Transactions on Image processing, 17(10), 1783-1794.
  • GRAPH: Huang, W., Bolton, T. A., Medaglia, J. D., Bassett, D. S., Ribeiro, A., & Van De Ville, D. (2018). A graph signal processing perspective on functional brain imaging. Proceedings of the IEEE, 106(5), 868-885.
  • GRAPH + POINT CLOUD: Zeng, J., Cheung, G., Ng, M., Pang, J., & Yang, C. (2019). 3D point cloud denoising using graph Laplacian regularization of a low dimensional manifold model. IEEE Transactions on Image Processing, 29, 3474-3489.
  • GRAPH + WAVELET: Zeng, J., Cheung, G., & Ortega, A. (2017). Bipartite approximation for graph wavelet signal decomposition. IEEE Transactions on Signal Processing, 65(20), 5466-5480.
  • GRAPH + FILTERING: Yang, C., Cheung, G., & Stankovic, V. (2017). Estimating heart rate and rhythm via 3D motion tracking in depth video. IEEE Transactions on Multimedia, 19(7), 1625-1636.
  • GRAPH: Chen, S., Varma, R., Sandryhaila, A., & Kovacevic, J. (2015). Discrete signal processing on graphs: Sampling theory. IEEE Transactions on Signal Processing, 63(24), 6510-6523.
  • GRAPH: Thanou, D., Chou, P. A., & Frossard, P. (2016). Graph-based compression of dynamic 3D point cloud sequences. IEEE Transactions on Image Processing, 25(4), 1765-1778.
  • GRAPH: Kovnatsky, A., Bronstein, M. M., Bronstein, A. M., Glashoff, K., & Kimmel, R. (2013, May). Coupled quasi-harmonic bases. In Computer Graphics Forum (Vol. 32, No. 2pt4, pp. 439-448). Oxford, UK: Blackwell Publishing Ltd.
  • LIGHT FIELD: Hog, M., Sabater, N., & Guillemot, C. (2017). Superrays for Efficient Light Field Processing. IEEE Journal of Selected Topics in Signal Processing, 11(7), 1187-1199.
  • SALIENCY+DEEP: Tavakoli, H. R., Borji, A., Rahtu, E., & Kannala, J. (2019). DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction. arXiv preprint arXiv:1905.10693.
  • HDR+DEEP: Endo, Y., Kanamori, Y., & Mitani, J. (2017). Deep reverse tone mapping. ACM Trans. Graph., 36(6), 177-1.
  • 360+DEEP: Monroy, R., Lutz, S., Chalasani, T., & Smolic, A. (2018). Salnet360: Saliency maps for omni-directional images with cnn. Signal Processing: Image Communication, 69, 26-34.
  • SPARSITY: Benoît, L., Mairal, J., Bach, F., & Ponce, J. (2011, June). Sparse image representation with epitomes. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on (pp. 2913-2920). IEEE.
  • GRAPH: Rotondo, I., Cheung, G., Ortega, A., & Egilmez, H. (2015). Designing sparse graphs via structure tensor for block transform coding of images. APSIPA ASC, Hong Kong, China.
  • GRAPH: Egilmez, H. E., Chao, Y. H., Ortega, A., Lee, B., & Yea, S. (2016, September). GBST: Separable transforms based on line graphs for predictive video coding. In Image Processing (ICIP), 2016 IEEE International Conference on (pp. 2375-2379). IEEE.
  • LIGHT FIELD: Frigo, O., & Guillemot, C. (2017, September). Epipolar Plane Diffusion: An Efficient Approach for Light Field Editing. In British Machine Vision Conference (BMVC).
  • FOURIER: Ng, R. (2005, July). Fourier slice photography. In ACM transactions on graphics (TOG) (Vol. 24, No. 3, pp. 735-744). ACM.
  • GRAPH: Zhang, C., Florêncio, D., & Chou, P. A. (2015). Graph signal processing-a probabilistic framework. Microsoft Res., Redmond, WA, USA, Tech. Rep. MSR-TR-2015-31.
  • LIGHT FIELD: Tao, M. W., Hadap, S., Malik, J., & Ramamoorthi, R. (2013). Depth from combining defocus and correspondence using light-field cameras. In Proceedings of the IEEE International Conference on Computer Vision (pp. 673-680).
  • DICTIONARY: Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11(Jan), 19-60.
  • OBJECT DETECTION: Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on (Vol. 1, pp. I-I). IEEE.
  • TEXTURE: Karacan, L., Erdem, E., & Erdem, A. (2013). Structure-preserving image smoothing via region covariances. ACM Transactions on Graphics (TOG), 32(6), 176.
  • SALIENCY+DEEP: Cornia, M., Baraldi, L., Serra, G., & Cucchiara, R. (2016, December). A deep multi-level network for saliency prediction. In Pattern Recognition (ICPR), 2016 23rd International Conference on (pp. 3488-3493). IEEE.
  • SALIENCY+DEEP: Kümmerer, M., Wallis, T. S., & Bethge, M. (2016). DeepGaze II: Reading fixations from deep features trained on object recognition. arXiv preprint arXiv:1610.01563.