Inria research scientist
Research topic: computational learning of linguistic models of biological sequences
Keywords: Grammatical Inference, Machine Learning, Functions and Structures of Protein Sequences, DNA…
News and highlights:
- New: publication of PPalign, a Potts-to-Potts alignment of protein sequences that takes residue coevolution into account, with Hugo Talibart, in BMC Bioinformatics (supplementary material, github)
- Deep learning languages: a key fundamental shift from probabilities to weights? A short “position” paper submitted to the Deep Learning and Formal Languages: Building Bridges workshop at ACL 2019. If you know of publications or experiments on the fundamental differences between learning weights and learning probabilities, or simply want to discuss this, please don’t hesitate to drop me a line…
- Slides of my talk at ICGI’18, “Learning local substitutable context-free languages from positive examples in polynomial time and data by reduction” (with Jacques Nicolas), introducing a general definition of grammars in Reduced Normal Form (RNF) and ReGLiScore, a new reduction-based algorithm that learns them efficiently and enjoys nice theoretical properties.
- Presentation of my team from an Artificial Intelligence perspective (slides from the Artificial Intelligence days organized by Inria RBA and IRISA)
- The chapter Learning the Language of Biological Sequences has been published, alongside other nice chapters, in the book Topics in Grammatical Inference edited by Jeffrey Heinz and José Sempere. Following a talk given for the 10th anniversary of ICGI (ICGI’10), it reviews advances in modeling biological sequences, from Pattern/Motif Discovery to Grammatical Inference, using practical examples to help intuition. Feedback is welcome!