In our lab meeting tomorrow, Anoop will talk about a new perspective on summarizing a large amount of text and how we can visualize it. Here’s the abstract of the talk:
In natural language processing, the summarization of information in a large amount of text has typically been viewed as a type of natural language generation problem, e.g. “produce a 250 word summary of some documents based on some input query”. An alternative view, which will be the focus of this talk, is to use natural language parsing to extract facts from a collection of documents and then use information visualization to provide an interactive summarization of these facts.
The first step is to extract detailed facts about events from natural language text using a predicate-centered view of events (who did what to whom, when and how). We exploit semantic roles in order to create a predicate-centric ontology for entities, which is used to create a knowledge base of facts about entities and their relationships with other entities.
The next step is to use information visualization to provide a summarization of the facts in this automatically extracted knowledge base. The user can interact with the visualization to find summaries that have different granularities. This makes it easy to discover even extremely uncommon facts.
We have used this methodology to build an interactive visualization of events in human history by machine reading Wikipedia articles. I will demo the visualization and describe the results of a user study that evaluates this interactive visualization for a summarization task.
Thursday, Oct 26, 10-11 AM, Location: TASC1 9408.