Below is a list of my publications, organized by year. They are also listed on my Google Scholar profile.
2024
- Self-Compositional Data Augmentation for Scientific Keyphrase Generation.
Maël Houbre, Florian Boudin, Béatrice Daille, Akiko Aizawa.
Joint Conference on Digital Libraries (JCDL).
[arXiv]
- Unsupervised Domain Adaptation for Keyphrase Generation using Citation Contexts.
Florian Boudin, Akiko Aizawa.
Conference on Empirical Methods in Natural Language Processing (EMNLP) - Findings.
[paper, bib, arXiv, code, dataset]
- Automatically Suggesting Diverse Example Sentences for L2 Japanese Learners Using Pre-Trained Language Models.
Enrico Benedetti, Akiko Aizawa, Florian Boudin.
Association for Computational Linguistics (ACL): Student Research Workshop.
[paper, bib, code]
- CASIMIR: A Corpus of Scientific Articles enhanced with Multiple Author-Integrated Revisions.
Léane Jourdan, Florian Boudin, Nicolas Hernandez, Richard Dufour.
Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING).
[paper, bib, arXiv, dataset]
- A Survey of Pre-trained Language Models for Processing Scientific Text.
Xanh Ho, Anh Khoa Duong Nguyen, An Tuan Dao, Junfeng Jiang, Yuki Chida, Kaito Sugimoto, Huy Quoc To, Florian Boudin, Akiko Aizawa.
[github, arXiv]
2023
- Text revision in Scientific Writing Assistance: A Review.
Léane Jourdan, Florian Boudin, Nicolas Hernandez, Richard Dufour.
International Workshop on Bibliometric-enhanced Information Retrieval (BIR).
[paper, arXiv]
- CASIMIR: un Corpus d’Articles Scientifiques Intégrant les ModIfications et Révisions des auteurs.
Léane Jourdan, Florian Boudin, Nicolas Hernandez, Richard Dufour.
Atelier sur l’Analyse et Recherche de Textes Scientifiques (ARTS).
[paper, bib]
- Classification de relation pour la génération de mots-clés absents.
Maël Houbre, Florian Boudin, Béatrice Daille.
Atelier sur l’Analyse et Recherche de Textes Scientifiques (ARTS).
[paper, bib]
- Projet NaviTerm: navigation terminologique pour une montée en compétence rapide et personnalisée sur un domaine de recherche.
Florian Boudin, Richard Dufour, Béatrice Daille.
Atelier sur l’Analyse et Recherche de Textes Scientifiques (ARTS).
[paper, bib]
- Analyse et indexation de textes scientifiques.
Florian Boudin.
Habilitation à Diriger les Recherches (HDR).
2022
- A large-scale dataset for biomedical keyphrase generation.
Maël Houbre, Florian Boudin, Béatrice Daille.
International Workshop on Health Text Mining and Information Analysis (LOUHI).
[paper, bib, code, dataset]
- Cross-lingual and Cross-domain Transfer Learning for Automatic Term Extraction from Low Resource Data.
Amir Hazem, Mérieme Bouhandi, Florian Boudin, Béatrice Daille.
Language Resources and Evaluation Conference (LREC).
[paper, bib]
- From Fundamentals to Recent Advances: A Tutorial on Keyphrasification.
Rui Meng, Debanjan Mahata, Florian Boudin.
Half-day tutorial at the European Conference on Information Retrieval (ECIR).
[website]
- Extraction and evaluation of formulaic expressions used in scholarly papers.
Kenichi Iwatsuki, Florian Boudin, Akiko Aizawa.
Expert Systems with Applications.
[paper]
2021
- Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness.
Florian Boudin, Ygor Gallina.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL).
[paper, bib, arXiv, code, dataset]
- ACM-CR: A Manually Annotated Test Collection for Citation Recommendation.
Florian Boudin.
Joint Conference on Digital Libraries (JCDL).
[arXiv, dataset]
2020
- Keyphrase Generation for Scientific Document Retrieval.
Florian Boudin, Ygor Gallina, Akiko Aizawa.
Association for Computational Linguistics (ACL).
[paper, bib, video, code]
- Large-Scale Evaluation of Keyphrase Extraction Models.
Ygor Gallina, Florian Boudin, Béatrice Daille.
Joint Conference on Digital Libraries (JCDL).
[paper, arXiv, code, dataset]
- The DELICES Project: Indexing Scientific Literature Through Semantic Expansion.
Florian Boudin, Béatrice Daille, Evelyne Jacquey, Jian-Yun Nie.
Joint Conference of the Information Retrieval Communities in Europe (CIRCLE).
[paper]
- An Evaluation Dataset for Identifying Communicative Functions of Sentences in English Scholarly Papers.
Kenichi Iwatsuki, Florian Boudin, Akiko Aizawa.
Language Resources and Evaluation Conference (LREC).
[paper, bib, dataset]
- TermEval 2020: TALN-LS2N System for Automatic Term Extraction.
Amir Hazem, Mérieme Bouhandi, Florian Boudin, Béatrice Daille.
6th International Workshop on Computational Terminology (CompuTerm).
[paper, bib]
2019
- KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents.
Ygor Gallina, Florian Boudin, Béatrice Daille.
International Conference on Natural Language Generation (INLG).
[paper, bib, arXiv, dataset]
- DeFT 2019: Auto-encodeurs, Gradient Boosting et combinaisons de modèles pour l’identification automatique de mots-clés.
Mérième Bouhandi, Florian Boudin, Ygor Gallina.
Défi Fouille de Textes (DEFT).
[paper, bib]
2018
- Unsupervised Keyphrase Extraction with Multipartite Graphs.
Florian Boudin.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL).
[paper, bib, arXiv, code]
2017
- Modélisation à base de graphe pour l’indexation en domaines de spécialité.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
Recherche d’information, document et web sémantique.
[paper, bib]
- Présentation et résultats du défi fouille de textes DEFT 2016.
Béatrice Daille, Sabine Barreaux, Adrien Bougouin, Florian Boudin, Damien Cram, Amir Hazem.
Recherche d’information, document et web sémantique.
[paper, bib]
2016
- How Document Pre-processing affects Keyphrase Extraction Performance.
Florian Boudin, Hugo Mougard, Damien Cram.
Workshop on Noisy User-generated Text (WNUT).
[paper, bib, arXiv, code, dataset]
- pke: an open source python-based keyphrase extraction toolkit.
Florian Boudin.
International Conference on Computational Linguistics (COLING), demonstration papers.
[paper, bib, code]
- Keyphrase Annotation with Graph Co-Ranking.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
International Conference on Computational Linguistics (COLING).
[paper, bib]
- TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation.
Adrien Bougouin, Sabine Barreaux, Laurent Romary, Florian Boudin, Béatrice Daille.
Language Resources and Evaluation Conference (LREC).
[paper, bib, dataset]
- Modélisation unifiée du document et de son domaine pour une indexation par termes-clés libre et contrôlée.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
- Indexation d’articles scientifiques : Présentation et résultats du défi fouille de textes DEFT 2016.
Béatrice Daille, Sabine Barreaux, Florian Boudin, Adrien Bougouin, Damien Cram, Amir Hazem.
Défi Fouille de Textes (DEFT).
[paper]
- TopicRank en domaines de spécialité : participation du LINA à DEFT 2016.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
Défi Fouille de Textes (DEFT).
[paper]
2015
- Concept-based Summarization using Integer Linear Programming: From Concept Pruning to Multiple Optimal Solutions.
Florian Boudin, Hugo Mougard, Benoit Favre.
Conference on Empirical Methods in Natural Language Processing (EMNLP).
[paper, bib]
- Reducing Over-generation Errors for Automatic Keyphrase Extraction using Integer Linear Programming.
Florian Boudin.
Workshop on Novel Computational Approaches to Keyphrase Extraction.
[paper, bib]
- LINA: Identifying Comparable Documents from Wikipedia.
Emmanuel Morin, Amir Hazem, Florian Boudin, Elizaveta Loginova-Clouet.
Eighth Workshop on Building and Using Comparable Corpora (BUCC).
[paper, bib]
2014
- TopicRank : ordonnancement de sujets pour l’extraction automatique de termes-clés.
Adrien Bougouin, Florian Boudin.
Traitement Automatique des Langues.
[paper]
- De quoi parle ce Tweet? Résumer Wikipédia pour contextualiser des microblogs.
Romain Deveaud, Florian Boudin.
The Information - Intelligence - Interaction (I3) Journal.
[paper]
- Label Pre-annotation for Building Non-projective Dependency Treebanks for French.
Ophélie Lacroix, Denis Béchet, Florian Boudin.
Conference on Intelligent Text Processing and Computational Linguistics (CICLing).
[paper]
- Influence des domaines de spécialité dans l’extraction de termes-clés.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
2013
- TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction.
Adrien Bougouin, Florian Boudin, Béatrice Daille.
International Joint Conference on Natural Language Processing (IJCNLP).
[paper, bib]
- A Comparison of Centrality Measures for Graph-Based Keyphrase Extraction.
Florian Boudin.
International Joint Conference on Natural Language Processing (IJCNLP).
[paper, bib]
- Keyphrase Extraction for N-best Reranking in Multi-Sentence Compression.
Florian Boudin, Emmanuel Morin.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL).
[paper, bib, dataset, code]
- TALN Archives : une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue.
Florian Boudin.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib, dataset]
- Construction d’un large corpus écrit libre annoté morpho-syntaxiquement en français.
Nicolas Hernandez, Florian Boudin.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
- Contextualisation automatique de Tweets à partir de Wikipédia.
Romain Deveaud, Florian Boudin.
Conférence en Recherche d’Information et Applications (CORIA).
[paper]
- Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization.
Romain Deveaud, Florian Boudin.
INitiative for the Evaluation of XML Retrieval (INEX).
[paper]
2012
- Using a Medical Thesaurus to Predict Query Difficulty.
Florian Boudin, Jian-Yun Nie, Martin Dawes.
European Conference on Information Retrieval (ECIR).
[paper, bib]
- Détection et correction automatique d’erreurs d’annotation morpho-syntaxique du French TreeBank.
Florian Boudin, Nicolas Hernandez.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
- LIA/LINA at the INEX 2012 Tweet Contextualization track.
Romain Deveaud, Florian Boudin.
INitiative for the Evaluation of XML Retrieval (INEX).
[paper]
- Participation du LINA à DEFT 2012.
Florian Boudin, Amir Hazem, Nicolas Hernandez, Prajol Shrestha.
Défi Fouille de Textes (DEFT).
[paper, bib]
2011
- A Graph-based Approach to Cross-language Multi-document Summarization.
Florian Boudin, Stéphane Huet, Juan-Manuel Torres-Moreno.
Conference on Intelligent Text Processing and Computational Linguistics (CICLing).
[paper]
- Utilisation d’un score de qualité de traduction pour le résumé multi-document cross-lingue.
Stéphane Huet, Florian Boudin, Juan-Manuel Torres-Moreno.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
- Correction de césures et enrichissement de requêtes pour la recherche de livres.
Romain Deveaud, Florian Boudin, Eric SanJuan, Patrice Bellot.
Conférence en Recherche d’Information et Applications (CORIA).
[paper]
- LIA at INEX 2010 Book Track.
Romain Deveaud, Florian Boudin, Patrice Bellot.
INitiative for the Evaluation of XML Retrieval (INEX).
[paper, bib]
2010
- Combining classifiers for robust PICO element detection.
Florian Boudin, Jian-Yun Nie, Joan Bartlett, Roland Grad, Pierre Pluye, Martin Dawes.
BMC Medical Informatics and Decision Making.
[paper, ris]
- Positional Language Models for Clinical Information Retrieval.
Florian Boudin, Jian-Yun Nie, Martin Dawes.
Conference on Empirical Methods in Natural Language Processing (EMNLP).
[paper, bib, dataset]
- Clinical Information Retrieval using Document and PICO Structure.
Florian Boudin, Jian-Yun Nie, Martin Dawes.
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL).
[paper, bib, dataset]
- Improving Medical Information Retrieval with PICO Element Detection.
Florian Boudin, Lixin Shi, Jian-Yun Nie.
European Conference on Information Retrieval (ECIR).
[paper, bib]
- Deriving a test collection for clinical information retrieval from systematic reviews.
Florian Boudin, Jian-Yun Nie, Martin Dawes.
Data and Text Mining in Biomedical Informatics (DTMBIO).
[paper, dataset]
2009
- Résumé automatique multi-document et indépendance de la langue : une première évaluation en français.
Florian Boudin, Juan-Manuel Torres-Moreno.
Traitement Automatique des Langues Naturelles (TALN).
[paper, bib]
- A Maximization-Minimization Approach for Update Text Summarization.
Florian Boudin, Juan-Manuel Torres-Moreno.
Current Issues in Linguistic Theory: Recent Advances in Natural Language Processing.
[paper]
2008
- A Scalable MMR Approach to Sentence Scoring for Multi-Document Update Summarization.
Florian Boudin, Marc El-Bèze, Juan-Manuel Torres-Moreno.
International Conference on Computational Linguistics (COLING).
[paper, bib]
- Mixing Statistical and Symbolic Approaches for Chemical Names Recognition.
Florian Boudin, Juan-Manuel Torres-Moreno, Marc El-Bèze.
Conference on Intelligent Text Processing and Computational Linguistics (CICLing).
[paper]
- An Efficient Statistical Approach for Automatic Organic Chemistry Summarization.
Florian Boudin, Juan-Manuel Torres-Moreno, Patricia Velázquez-Morales.
International Conference on Natural Language Processing (GoTAL).
[paper]
- The LIA Update Summarization system at TAC-2008.
Florian Boudin, Marc El-Bèze, Juan-Manuel Torres-Moreno.
Text Analysis Conference (TAC).
[paper]
- Exploration d’approches statistiques pour le résumé automatique de texte.
Florian Boudin.
Laboratoire Informatique d’Avignon – Université d’Avignon.
[PhD thesis]
2007
- A Cosine Maximization Minimization approach for User Oriented Multi-Document Update Summarization.
Florian Boudin, Juan-Manuel Torres-Moreno.
Recent Advances in Natural Language Processing (RANLP).
[paper]
- NEO-CORTEX: A Performant User-Oriented Multi-Document Summarization System.
Florian Boudin, Juan-Manuel Torres-Moreno.
Conference on Intelligent Text Processing and Computational Linguistics (CICLing).
[paper]
- The LIA-Thales summarization system at DUC-2007.
Florian Boudin, Benoit Favre, Frederic Béchet, Marc El-Bèze, Laurent Gillard, Juan-Manuel Torres-Moreno.
Document Understanding Conference (DUC).
[paper]
2006
- The LIA-Thales summarization system at DUC-2006.
Benoit Favre, Frederic Béchet, Patrice Bellot, Florian Boudin, Marc El-Bèze, Laurent Gillard, Guy Lapalme, Juan-Manuel Torres-Moreno.
Document Understanding Conference (DUC).
[paper]