Inria research scientist
Learning grammars and applications to linguistic modelling of biological sequences
Keywords: Grammatical Inference, Machine Learning, Functions and Structures of Protein Sequences, DNA…
News and highlights:
- New: publication of PPalign, Potts to Potts alignment of protein sequences taking into account residue coevolution, with Hugo Talibart, in BMC Bioinformatics (supplementary material, GitHub)
- Deep learning languages: a key fundamental shift from probabilities to weights? A short “position” paper submitted to the ACL 2019 workshop Deep Learning and Formal Languages: Building Bridges. If you know of publications or experiments on the fundamental differences between learning weights and probabilities, or simply want to discuss this, please don’t hesitate to drop me a line…
- Slides of my talk at ICGI’18 on “Learning local substitutable context-free languages from positive examples in polynomial time and data by reduction” (with Jacques Nicolas), introducing a general definition of grammars in Reduced Normal Form (RNF) and ReGLiScore, a new algorithm that learns them efficiently by reduction, with nice theoretical properties.
- Presentation of my team from an Artificial Intelligence perspective (slides for Artificial Intelligence days organized by Inria RBA and IRISA)
- The chapter Learning the Language of Biological Sequences has been published, alongside other nice chapters, in the book Topics in Grammatical Inference, edited by Jeffrey Heinz and José Sempere. Following a talk given for the 10th anniversary of ICGI (ICGI’10), it reviews advances in modeling biological sequences, from Pattern/Motif Discovery to Grammatical Inference, and tries to build intuition with practical examples. Feedback is welcome!