TH1.R1.4

The Likelihood Gain of a Language Model as a Metric for Text Summarization

Dana Levin, Alon Kipnis, Reichman University, Israel

Session:
Language Models

Track:
8: Machine Learning

Location:
Ballroom II & III

Presentation Time:
Thu, 11 Jul, 10:45 - 11:05

Session Chair:
Homa Esfahanizadeh

Abstract
Consider the gain in logarithmic loss (LLG) of a text under a language model (LM) when the text's summary is provided as context, compared to when no summary is in the context. This gain has been proposed as a reference-free index of similarity for evaluating the relevance of the summary to the text. We justify this similarity index by showing that it describes the reduction in expected binary codelength when the summary is provided as side information to a lossless text compression system consisting of the LM and an entropy encoder. Consequently, under proper normalization, this similarity index leads to a form of the well-studied Normalized Compression Distance (NCD) and thus adheres to a universal measure of information distance. Empirical results show that LLG-based NCD correlates better with human annotations than gzip-based NCD, although the two are themselves fairly strongly correlated with each other. Finally, we show empirically that LLG is affected almost exclusively by tokens associated with the text's content rather than tokens associated with its structure. This observation supports LLG as a natural and useful index of similarity for evaluating and designing text summarization methods.
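
To make the quantities in the abstract concrete, the sketch below computes an LLG-style score with an off-the-shelf causal LM via the Hugging Face transformers library. The model choice (gpt2), the plain concatenation of summary and text, and the codelength-based NCD variant at the end are illustrative assumptions, not the authors' exact setup. Codelength here is the negative base-2 log-likelihood, i.e., the idealized number of bits an entropy coder driven by the LM would spend on the text; the classical NCD form is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), and the paper's precise normalization of LLG may differ.

```python
"""Minimal sketch of a log-loss gain (LLG) score with a causal LM.

Assumptions (not from the paper): gpt2 as the LM, plain concatenation
of summary and text, and the standard compression-distance formula.
"""
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # illustrative choice; the paper's LM may differ
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def codelength_bits(text: str, context: str = "") -> float:
    """Codelength (in bits) of `text`, optionally conditioned on `context`.

    Under an ideal entropy coder, -log2 p(text | context) is the number
    of bits needed to losslessly encode `text` given the context.
    """
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids if context else None
    txt_ids = tokenizer(text, return_tensors="pt").input_ids
    if ctx_ids is not None:
        input_ids = torch.cat([ctx_ids, txt_ids], dim=1)
        n_ctx = ctx_ids.shape[1]
    else:
        input_ids = txt_ids
        n_ctx = 0
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability of each token given its prefix.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = input_ids[:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the tokens belonging to `text` (skip the context prefix).
    # Note: without context, the first text token has no prediction and is skipped.
    text_lp = token_lp[:, max(n_ctx - 1, 0):]
    return -text_lp.sum().item() / math.log(2)  # nats -> bits


def llg(text: str, summary: str) -> float:
    """Log-loss gain: reduction in codelength of `text` when `summary`
    is provided as side information (context)."""
    return codelength_bits(text) - codelength_bits(text, context=summary)


def lm_ncd(x: str, y: str) -> float:
    """Classical NCD computed with LM codelengths in place of a compressor."""
    cx, cy = codelength_bits(x), codelength_bits(y)
    cxy = codelength_bits(x + " " + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    doc = "The city council approved the new budget after a lengthy debate over school funding."
    summ = "City council approves new budget."
    print("LLG (bits):", llg(doc, summ))
    print("LM-based NCD:", lm_ncd(doc, summ))
```

A relevant summary should yield a positive LLG (the text gets cheaper to encode given the summary) and a smaller LM-based NCD, whereas an unrelated summary should leave the codelength essentially unchanged.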