A Survey on evaluation of summarization methods

Liana Ermakova; Jean-Valère Cossu; Josiane Mothe

doi:10.1016/j.ipm.2019.04.001

Journal article. Information Processing and Management. Year: 2019

Liana Ermakova

Role: Author
PersonId: 19494
IdHAL: liana-ermakova
ORCID: 0000-0002-7598-7474
IdRef: 224767305

Héritages et Constructions dans le Texte et l'Image

Jean-Valère Cossu

Role: Author
PersonId: 957209

Laboratoire Informatique d'Avignon

Josiane Mothe

Role: Author
PersonId: 735149
IdHAL: josianemothe
ORCID: 0000-0001-9273-2193
IdRef: 087097222

Systèmes d’Informations Généralisées

Abstract

The increasing volume of textual information on any topic requires its compression to allow humans to digest it. This implies detecting the most important information and condensing it. These challenges have led to new developments in the area of Natural Language Processing (NLP) and Information Retrieval (IR) such as narrative summarization and evaluation methodologies for narrative extraction. Despite some progress over recent years with several solutions for information extraction and text summarization, the problems of generating consistent narrative summaries and evaluating them are still unresolved. With regard to evaluation, manual assessment is expensive, subjective and not applicable in real time or to large collections. Moreover, it does not provide re-usable benchmarks. Nevertheless, commonly used metrics for summary evaluation still imply substantial human effort since they require a comparison of candidate summaries with a set of reference summaries. The contributions of this paper are three-fold. First, we provide a comprehensive overview of existing metrics for summary evaluation. We discuss several limitations of existing frameworks for summary evaluation. Second, we introduce an automatic framework for the evaluation of metrics that does not require any human annotation. Finally, we evaluate the existing assessment metrics on a Wikipedia data set and a collection of scientific articles using this framework. Our findings show that the majority of existing metrics based on vocabulary overlap are not suitable for assessment based on comparison with a full text and we discuss this outcome.
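
As a rough illustration of what a metric "based on vocabulary overlap" computes, here is a minimal sketch of clipped unigram recall in the style of ROUGE-1. The function name `unigram_recall` and the lowercased whitespace tokenization are assumptions made for this example; this is not the implementation evaluated in the paper.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Clipped unigram recall of a candidate summary against one reference."""
    # Assumption: lowercased whitespace tokenization. Real toolkits usually
    # add stemming and stop-word handling on top of this.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    # A reference token is matched at most as many times as it occurs
    # in the candidate (clipped counts).
    overlap = sum(
        min(count, cand_counts[token]) for token, count in ref_counts.items()
    )
    return overlap / sum(ref_counts.values())


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
score = unigram_recall(candidate, reference)
print(round(score, 3))
```

Passing the full source text as `reference`, rather than a human-written summary, corresponds to the annotation-free setting for which the abstract reports that most overlap-based metrics are not suitable.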

Keywords

automatic summarization; text compression; evaluation campaigns; assessment metrics; extraction; extractive summarization; ROUGE

Domains

Literature; Humanities and Social Sciences; Computer Science [cs]

Main file

S0306457318306241.pdf (1.97 MB)

Origin: Files produced by the author(s)

Elsevier CCSD agreement

https://hal.univ-brest.fr/hal-02130700

Submitted on: Monday, October 25, 2021, 12:34:30

Last modified on: Monday, November 20, 2023, 11:44:23

Long-term archiving on: Wednesday, January 26, 2022, 19:51:14

Dates and versions

hal-02130700, version 1 (25-10-2021)

License

Attribution - NonCommercial

Identifiers

HAL Id: hal-02130700, version 1
DOI: 10.1016/j.ipm.2019.04.001

Cite

Liana Ermakova, Jean-Valère Cossu, Josiane Mothe. A Survey on evaluation of summarization methods. Information Processing and Management, 2019, 56 (5), pp.1794-1814. ⟨10.1016/j.ipm.2019.04.001⟩. ⟨hal-02130700⟩

Collections

UNIV-BREST UNIV-AVIGNON UNIV-TLSE2 CNRS HCTI UT1-CAPITOLE LIA IBSHS IRIT IRIT-SIG IRIT-GD IRIT-UT2J TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP
