Automatic text summarization is an important tool for enhancing users’ ability to make decisions in the face of overwhelming amounts of data. Making this technology useful and practical requires high-performing systems that work across a variety of texts and settings. However, existing systems are usually developed and tested on standard research benchmarks based on news texts. In this talk, I discuss how current systems exploit biases in these benchmark tasks in order to perform well without deeply understanding the contents of the input. In particular, they heavily exploit the fact that important sentences tend to appear near the beginning of news articles. I present our lab’s extractive summarization system, BanditSum, which frames summarization as a contextual bandit problem, and our efforts to induce BanditSum to consider both the position of a sentence and its contents when making content selection decisions, leading to improved summarization performance. Next, I argue that effective summarization requires advances in abstractive summarization, which analyzes the contents of the source texts in order to generate novel summary sentences. However, existing datasets neither require nor support learning the type of reasoning and generalization that would demonstrate abstraction’s utility. I discuss ongoing work in my lab in this direction, both from the perspective of analyzing and improving existing abstractive approaches, and from the perspective of developing new datasets and tasks in which abstraction is necessary.
For more info, visit our page:
#SAIT (Samsung Advanced Institute of Technology): http://smsng.co/sait
Source: Samsung Mobile YouTube