No. 057 Towards Explanation Production Combining Natural Language Processing and Logical Reasoning


NII Shonan Meeting Seminar 057

Shonan-EXPCOLL 2014 Presentations

November 27 (Thu) 11:00-12:00

Robert Kowalski
Computational Logic, the Language of Thought and Natural Language
Formal Logic is a natural candidate for representing computer-intelligible
knowledge extracted from natural language texts on the WWW. I will argue
that the logic of natural language texts is normally not visible, but is
hidden beneath the surface, and that it can be uncovered more easily by
studying texts that are designed to be as clear and easy to understand as
possible.
I will support my argument in two ways: by giving examples of English
language texts and their hidden logic, and by interpreting guidelines for
English writing style in computational logic terms. I will also argue that
the kind of logic that is most useful for representing natural language is both
simpler and richer than classical logic. It is simpler because it has a
simpler syntax in the form of conditionals, and it is more powerful because
it distinguishes between the logic of beliefs and the logic of goals.
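
As a minimal illustration in the spirit of the abstract (our example, not
necessarily one from the talk), the emergency notice "Press the alarm button
to alert the driver" hides a conditional, and the belief/goal distinction can
be made explicit:

\[ \textit{belief:}\quad \mathit{alerted}(\mathit{driver}) \leftarrow \mathit{pressed}(\mathit{alarm\_button}) \]
\[ \textit{goal:}\quad \mathit{press}(\mathit{alarm\_button}) \leftarrow \mathit{notice}(\mathit{fire}) \]

Both are simple conditionals, but they play different roles: the belief is
used to predict consequences of actions, while the goal generates actions in
response to observations.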
Keywords: Computational Logic, English writing style, Knowledge
representation
Slides

Bart Verheij
Arguments for Structured Hypotheses: A Logico-Probabilistic Perspective

Some questions have answers with a simple structure. For instance, the
question “What is Vincent van Gogh’s country of birth?” has the answer “The
Netherlands”. Other questions require answers with a more elaborate
structure. For instance, although the question “Is the suspect guilty of a
crime?” can be answered with a simple “yes” or “no”, additional structure in
the form of arguments for and against the legally relevant scenarios is
needed. Each scenario provides a hypothetical answer to the guilt question.
Some scenarios are better supported by the evidence than others. In the
talk, a theory of arguments for structured hypotheses is discussed that uses
classical logic and probability theory as a normative framework. Possible
answers to questions take the form of structured hypotheses. The
logico-probabilistic argumentation framework sheds new light on the formal
semantics of argumentation, in a way that combines logic-based knowledge
technology with probability-based data analysis.
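
As a toy illustration of the normative framework (the scenarios and numbers
below are our assumptions, not part of the theory), classical probability can
rank two hypothetical scenarios by how well a piece of evidence supports them:

# Toy sketch: which of two scenario hypotheses is better supported by the
# evidence, using Bayes' rule as the normative yardstick. All numbers are
# invented for illustration.
priors = {"guilty_scenario": 0.30, "innocent_scenario": 0.70}
# P(evidence | scenario), e.g. a fingerprint match
likelihoods = {"guilty_scenario": 0.80, "innocent_scenario": 0.05}

marginal = sum(priors[s] * likelihoods[s] for s in priors)
posteriors = {s: priors[s] * likelihoods[s] / marginal for s in priors}

print(max(posteriors, key=posteriors.get))   # guilty_scenario (posterior ~0.87)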
Keywords: Argumentation, Inference to the Best Explanation, Artificial
Intelligence and Law, Combining Logic and Probability Theory
Slides

November 27 (Thu) 13:30-15:30

Bernardo Magnini, Ido Dagan, Guenter Neumann, Sebastian Pado
EXCITEMENT: EXploring Customer Interactions through Textual EntailMENT

EXCITEMENT (http://www.excitement-project.eu) is a 3-year research project
funded by the European Commission. The main topic of the project is
identifying semantic inferences between text units, a major language
processing task needed in practically all text understanding applications.
On the industrial side, EXCITEMENT is focused on the text analytics market
and follows the increasing demand for automatically analyzing customer
interactions. A major result of the project is the release of the EXCITEMENT
Open Platform (EOP). The platform aims to automatically check for the
presence of entailment relations among texts. It is based on a modular
architecture and provides support for the development of algorithms that are
language independent to a high degree. The result is an ideal software
environment for experimenting with and testing innovative approaches to
textual inferences. The EOP is distributed as open source software
(http://hltfbk.github.io/Excitement-Open-Platform/).
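
The EOP itself is implemented in Java, so the following is only a hedged
Python sketch of the text/hypothesis interface that entailment components
share, using a trivial lexical-overlap heuristic rather than any actual EOP
algorithm:

# Toy entailment check: decide ENTAILMENT when most hypothesis words are
# covered by the text. Real EOP components use much richer evidence.
def entails(text: str, hypothesis: str, threshold: float = 0.7) -> bool:
    t_words = set(text.lower().split())
    h_words = set(hypothesis.lower().split())
    coverage = len(h_words & t_words) / len(h_words)
    return coverage >= threshold

print(entails("The customer complained about a late delivery",
              "The delivery was late"))   # True: 3 of 4 hypothesis words covered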
Keywords: semantic inferences, textual entailment, inference platform
Slides

Bernardo Magnini
Decomposing Semantic Inferences
Textual Entailment (TE) has been proposed as an applied framework to capture
major semantic inference needs across applications in Computational
Linguistics. We think that crucial progress may derive from a focus on
decomposing the complexity of the TE task into basic phenomena and on their
combination. In this talk, we carry out a deep analysis of TE data sets,
investigating the relation between two relevant aspects of semantic
inferences: the logical dimension, i.e. the capacity of the inference to
prove the conclusion from its premises, and the linguistic dimension, i.e.
the linguistic devices used to accomplish the goal of the inference. We
propose a decomposition approach over TE pairs, where single linguistic
phenomena are isolated in atomic inference pairs, and we show that at this
granularity level the actual correlation between the linguistic and the
logical dimensions of semantic inferences emerges and can be empirically
observed.
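
A hedged sketch of the decomposition idea (the phenomena and judgements below
are illustrative assumptions): a TE pair is broken into a chain of atomic
pairs, each isolating one linguistic phenomenon, and the full pair is judged
positive only if every atomic step holds:

# Each atomic pair isolates a single phenomenon of the original T-H pair.
atomic_chain = [
    # (text, hypothesis, phenomenon, atomic judgement)
    ("Mozart was born in Salzburg", "Mozart was born in Austria",
     "geographic knowledge", True),
    ("Mozart was born in Austria", "The composer was born in Austria",
     "coreference / lexical substitution", True),
]

pair_entails = all(judgement for *_, judgement in atomic_chain)
print(pair_entails)   # True: every atomic step holds, so the full pair does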
Keywords: semantic inferences, textual entailment, linguistic phenomena
Slides

Ido Dagan
Natural Language Knowledge Graphs: Open-IE meets Knowledge Representation

Formal knowledge representation schemes are typically limited to pre-defined
structures and predicates. Conversely, Open Information Extraction (Open-IE)
represents arbitrary propositions occurring in text, but lacks a
consolidating canonical structure and is limited to simple
predicate-argument tuples. I will outline a proposal towards a more powerful
open knowledge representation scheme, which could cover knowledge beyond the
typical scope of pre-specified knowledge representation. First, we propose
extracting complex and implied propositions and abstracting
semantically-relevant information. Second, we propose adding a structure
over the set of extracted propositions via relevant semantic relationships.
We first focus on the textual entailment relation, which consolidates
semantically equivalent propositions and induces a useful
specific-to-general hierarchical structure. I will review initial research
activities along the abovementioned goals.
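
A minimal sketch of the proposed structure (our illustration, not the
authors' implementation): nodes are Open-IE style propositions, and directed
entailment edges induce a specific-to-general hierarchy:

# Each entailment edge points from a proposition to a more general one.
entails_edge = {
    "aspirin lowers fever":  "aspirin treats fever",
    "aspirin treats fever":  "a drug treats a symptom",
}

def generalizations(prop: str) -> list[str]:
    """Follow entailment edges upward, collecting more general propositions."""
    chain = []
    while prop in entails_edge:
        prop = entails_edge[prop]
        chain.append(prop)
    return chain

print(generalizations("aspirin lowers fever"))
# ['aspirin treats fever', 'a drug treats a symptom']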
Keywords: natural language processing, knowledge representation, textual
inference, textual entailment, open information extraction
Slides

Randy Goebel
Identifying the Tradeoffs in Textual Entailment: Deep Representation
versus Shallow Entailment

Much research on natural language understanding and processing (“NLP”) has
focused on how to transform natural language to formal logics, in which case
the problem of text entailment becomes that of logical entailment. Despite a
variety of approaches to the transformation of language to logic, even the
most sophisticated (e.g., Montague’s higher order intensional logics or
Steedman’s combinatory categorial grammar) leave unresolved foundational
challenges like context and dialogue. And though these transformations are
tightly coupled with formal mechanisms for inference, those methods
themselves are often difficult to implement. Current text entailment
research focuses on building or learning models of verb cases, concept
identification, summarization, and information extraction. We consider some
alternative measures for these tradeoffs, and ask whether they are
necessarily empirical or can exploit some foundational principles of NLP
representation theory.
Keywords: entailment, formal, representation, inference
Slides
Summary

November 27 (Thu) 16:00-17:00

Nguyen Le Minh (joint work with Akira Shimazu)
Learning to Parse Legal Sentences to Logical Form Representations
In this talk, we would like to present our framework for dealing with the
problems of understanding legal sentences. Our framework is divided into
three major steps: logical parts recognition, logical part grouping, and
semantic parsing. In the first phase, we model the problem of recognizing
logical parts in law sentences as a multi-layer sequence-learning problem,
and present a CRF-based model to recognize them. In the second phase, we
propose a graph-based method to group logical parts into logical structures.
We formulate this as the problem of finding a set of complete subgraphs in
an edge-weighted complete graph, where each node corresponds to a logical
part and a complete subgraph corresponds to a logical structure. For the
final step, we report on our recent work and on state-of-the-art semantic
parsing models for general domains. We also discuss the potential of
exploiting current semantic parsing models for simple law sentences.
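
A toy sketch of the grouping phase (the part names and weights are our
assumptions, not the authors' data): logical parts are nodes of an
edge-weighted complete graph, and a logical structure corresponds to a
complete subgraph; here we simply score every candidate subgraph and keep
the best:

from itertools import combinations

parts = ["requisite", "effect", "exception"]       # recognized logical parts
weight = {frozenset(p): w for p, w in [
    (("requisite", "effect"),     0.9),   # strongly related parts
    (("requisite", "exception"), -0.4),   # negative weights discourage grouping
    (("effect", "exception"),    -0.3),
]}

def subgraph_score(nodes):
    return sum(weight[frozenset(e)] for e in combinations(nodes, 2))

candidates = [c for r in (2, 3) for c in combinations(parts, r)]
best = max(candidates, key=subgraph_score)
print(best, subgraph_score(best))   # ('requisite', 'effect') 0.9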
Keywords: Legal text processing, semantic parsing, logical parts recognition
Slides

Yuji Matsumoto
Parsing Long and Complex Natural Language Sentences
While syntactic analysis of natural language sentences has shown remarkable
progress in the past decades, there are still some hindrances to
further improvement. Sentences in scientific or legal areas often have very
complex structures, mainly due to long coordinate structures and/or complex
sentence patterns. Although most of the current machine learning-based
syntactic parsers use local features to decide phrase structures or
word-to-word dependencies, global features or resources that make use of
long distance dependencies are necessary to handle complicated linguistic
phenomena. In this talk, I will introduce our on-going project to develop
methods and linguistic resources for complex structures, such as
coordination structures and complex syntactic patterns.
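
As a toy illustration of why global information helps (our example, not the
project's actual method): conjuncts tend to be structurally similar, so the
scope of a coordination can be scored by comparing candidate conjunct
patterns:

# Score candidate conjunct spans by similarity of their POS patterns;
# real systems use richer features and alignment over long distances.
def pos_pattern_score(a: list[str], b: list[str]) -> float:
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

left       = ["ADJ", "NOUN"]        # "tall trees"
cand_short = ["ADJ"]                # "short"
cand_full  = ["ADJ", "NOUN"]        # "short shrubs"
print(pos_pattern_score(left, cand_short))   # 0.5
print(pos_pattern_score(left, cand_full))    # 1.0 -> prefer "short shrubs"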
Keywords: natural language parsing, coordination structures, complex
sentence patterns
Slides

November 28 (Fri) 09:00-10:30

Chitta Baral
Explanation Producing Combination of NLP and Logical Reasoning through
Translation of Text to KR Formalisms

Our approach to combine NLP and logical reasoning so that it can produce
explanations is based on translating natural language text to appropriate
knowledge representation formalisms. In this talk we will discuss two
approaches. In the first approach, we will present our semantic parser,
available at http://kparser.org, that translates English text to a
knowledge graph that includes ontological and domain knowledge from various
sources. The second approach addresses the concern that depending on
applications one may want or need translations of natural language text to
different knowledge representation formalisms. We will present our NL2KR
platform (available at http://nl2kr.engineering.asu.edu) that allows the
development of translation systems by giving examples of translations (i.e.,
a training set) and an initial dictionary of words and their meanings given
as lambda calculus expressions. The work that will be presented is done by
Chitta Baral and his students.
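
A minimal sketch of meaning composition in the lambda-calculus style that
NL2KR builds on (the lexicon entries below are illustrative assumptions, not
entries from the actual NL2KR dictionary):

# Word meanings: "John" denotes a constant, "sleeps" a one-place predicate.
lexicon = {
    "John":   "john",
    "sleeps": lambda x: f"sleeps({x})",
}

def translate(sentence: str) -> str:
    subject, verb = sentence.rstrip(".").split()
    return lexicon[verb](lexicon[subject])   # apply verb meaning to subject

print(translate("John sleeps."))   # sleeps(john)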
Keywords: Natural Language Processing, Semantic Parser, Lambda Calculus
Slides

Kentaro Inui
Modeling “Reading between the Lines” Based on Scalable and Trainable
Abduction and Large-scale Knowledge Acquisition

The idea of modeling semantic and discourse analysis based on logical
abduction goes back to Hobbs et al.'s (1993) influential work: Interpretation
as Abduction. While the approach has many potential advantages, no prior
work has successfully built abductive models applicable to real-life
problems, chiefly due to the lack of knowledge and high computational costs.
However, the game is changing drastically. Recent advances in large-scale
knowledge acquisition from the Web may resolve the knowledge bottleneck. We
show that the computational cost of first-order abduction can be
considerably reduced with linear programming techniques, which in turn
enables the supervised training of abduction. Given these developments, we
believe that a number of intriguing issues will emerge from resuming the
study of abduction-based modeling of NLP tasks with a fast reasoner
and large-scale knowledge resources. We will present recent insights gained
from experiments on the Winograd Schema Challenge.
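
A toy sketch of the underlying search problem (our simplification; the talk
casts it as integer linear programming over first-order abduction): choose
the cheapest set of assumptions that explains the observation:

# Background rules map an observation to alternative sets of assumable
# antecedents; each assumption carries an invented cost.
rules = {"wet_grass": [("rained",), ("sprinkler_on",)]}
cost = {"rained": 1.0, "sprinkler_on": 1.5}

def explanations(obs):
    for antecedents in rules.get(obs, []):
        yield antecedents, sum(cost[a] for a in antecedents)

best = min(explanations("wet_grass"), key=lambda e: e[1])
print(best)   # (('rained',), 1.0): the cheapest abductive explanation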
Keywords: discourse analysis, abduction, knowledge acquisition, coreference
resolution, Winograd Schema Challenge
Slides

Ken Satoh
Towards Explanation Production of Yes/No Questions in Multiple Choice
Bar Exam

We will present an approach toward explanation production for the
multiple-choice bar exam, in which we explain why a given choice follows
from the articles or precedents of civil law. For the reasoning part we use
PROLEG (a PROLOG-based LEGal reasoning system), which we developed for
reasoning with the ultimate fact theory, and we try to connect PROLEG
predicates with elements in the parse trees of bar exam sentences produced
by NLP.
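
A heavily hedged sketch of the intended NLP-to-PROLEG connection (the
phrases and predicate names below are invented for illustration and are not
actual PROLEG predicates):

# Map phrases recognized in the parse tree of an exam sentence to
# candidate legal predicates that the reasoner can consume.
phrase_to_predicate = {
    "concluded a contract of sale": "contract_of_sale(Buyer, Seller, Object)",
    "delivered the object":         "delivery(Seller, Object)",
}

sentence_phrases = ["concluded a contract of sale", "delivered the object"]
literals = [phrase_to_predicate[p] for p in sentence_phrases]
print(literals)   # PROLEG-style literals extracted from the sentence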
Keywords: Legal Reasoning, PROLEG, logic programming, parsed tree
Slides

November 28 (Fri) 13:30-15:30

Yuzuru Tanaka
How to Promote the R&D on Open Source Watson-like Systems based on
the Combination of Natural Language Processing and Logical Reasoning?

In 2013, JST launched the CREST program on big data applications. It has
selected six projects during the first two years, and will select another
couple of projects next year. As one of the focused research areas for the
call for proposals next year, the program is now planning to include
NLP-based knowledge acquisition from published research papers in logical
representation, the construction of a large scale knowledge base, and the
automatic reasoning with this knowledge base for knowledge discovery and
question answering. This research area was stimulated by the success of
IBM Watson in “Jeopardy!”. This research direction may be considered a
scalable extension of the dream of the Japanese Fifth Generation Computer
project of the 1990s, now with an NLP interface. As the program officer of
the JST CREST program, I would like to ask how to promote the R&D on
open source Watson-like systems based on the combination of natural language
processing and logical reasoning.
Keywords: big data applications, open-source Watson-like systems, research
promotion, knowledge discovery from published papers
Slides

Akiko Aizawa
Math Formula Search
Mathematical formulae in natural language text sometimes represent
formalized concepts and relations that can be processed by computers.
However, in actual documents, most formulae are expressed as noisy,
ambiguous, and insufficient representations. In the past, we explored how to
deal with such ‘informality’ of formalized and abstracted relations for
efficient semantic search. This turns out to require elaborate analysis of
the surrounding natural language text as well as efficient approximate tree
search enhanced with variable unification. In this presentation, we briefly
introduce up-to-date techniques for math formula search and also explore
further research directions to connect such efforts to the manipulation of
semantic structure embedded in natural language text.
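
A minimal sketch of approximate matching with variable unification (our
illustration, not the presented system): formulas are trees, and a query
variable may bind to any subexpression, so x^2 + y matches a^2 + b:

def unify(query, formula, binding):
    """Match a query tree against a formula tree; lowercase strings in the
    query are variables that bind consistently to subexpressions."""
    if isinstance(query, str) and query.islower():
        if binding.get(query, formula) != formula:
            return None                      # inconsistent variable binding
        binding[query] = formula
        return binding
    if (isinstance(query, tuple) and isinstance(formula, tuple)
            and query[0] == formula[0] and len(query) == len(formula)):
        for q, f in zip(query[1:], formula[1:]):
            if unify(q, f, binding) is None:
                return None
        return binding
    return binding if query == formula else None

query   = ("plus", ("pow", "x", "2"), "y")   # x^2 + y
formula = ("plus", ("pow", "a", "2"), "b")   # a^2 + b
print(unify(query, formula, {}))             # {'x': 'a', 'y': 'b'}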
Keywords: natural language processing, math formula search, description
extraction, construction of dependency graph
Slides

Yusuke Miyao
Fact Validation by Recognizing Textual Entailment
We will introduce recent research activities on fact validation, which is
also known as true-or-false question answering. Fact validation aims to
prove whether or not a given statement is true, according to a prespecified
set of texts, such as textbooks and Wikipedia, that are supposed to describe
true facts. A shared task on fact validation has been organized in NTCIR,
and its organization scheme and the results of participating systems are
introduced. We will also describe experiments on applying a logic-based
textual entailment system to fact validation, and discuss its advantages and
difficulties.
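
A hedged sketch of such a pipeline (our simplification, not the NTCIR
systems): retrieve the corpus sentence closest to the statement, then apply
a crude entailment proxy to decide true or false:

# Invented two-sentence "textbook"; real systems search Wikipedia-scale text.
corpus = [
    "Vincent van Gogh was born in the Netherlands in 1853.",
    "Van Gogh moved to Arles in 1888.",
]

def overlap(text: str, statement: str) -> float:
    t, s = set(text.lower().split()), set(statement.lower().split())
    return len(t & s) / len(s)

def validate(statement: str) -> bool:
    best = max(corpus, key=lambda sent: overlap(sent, statement))
    return overlap(best, statement) >= 0.6   # crude stand-in for entailment

print(validate("Van Gogh was born in the Netherlands"))   # True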
Keywords: fact validation, true-or-false question answering, textual
entailment recognition
Slides

Akihiro Yamamoto (Joint work with Madori Ikeda)
Identifying Appropriate Concepts for Unknown Words with Formal Concept
Analysis

In natural language processing, extending thesauri is time-consuming. In
order to overcome this problem, we propose a method that uses a corpus. We
regard extending a thesaurus as inserting unknown words into it by finding
appropriate concepts. We treat the task as classification and use a concept
lattice for it. The method reduces the time cost by avoiding feature
selection for each pair of a set of known words and a set of unknown words.
More precisely, a concept lattice is generated from only a set of known
words, and each formal concept is given a score with respect to the set. By
experiments using practical thesauri and corpora, we show that our method
assigns more accurate concepts to unknown words than the k-nearest neighbor
algorithm.
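
A small formal-concept sketch (the context below is a toy example, not the
paper's data): objects are known words, attributes are corpus features, and
an unknown word is assigned the formal concept generated by its observed
features:

# Formal context: known word -> set of observed corpus features.
context = {
    "cat": {"animal", "pet"},
    "dog": {"animal", "pet"},
    "oak": {"plant"},
}

def concept_for(features: set) -> tuple:
    """Return the (extent, intent) of the concept generated by the features."""
    extent = {w for w, f in context.items() if features <= f}
    intent = set.intersection(*(context[w] for w in extent)) if extent else set()
    return extent, intent

unknown_features = {"animal", "pet"}        # e.g. observed for "hamster"
print(concept_for(unknown_features))        # ({'cat', 'dog'}, {'animal', 'pet'})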
Keywords: extending thesauri, classification, formal concept analysis
Slides

November 29 (Sat) 09:30-10:30

Erich Schweighofer
On the Way to Semantic Legal Knowledge Systems
paper
Keywords: Semantics, Legal Ontologies, Logic
Slides

Sadao Kurohashi
Knowledge-Intensive Structural NLP in the Era of Big Data
Texts are the basis of human knowledge representation, including data
analysis results and interpretation by experts, criticism and opinions,
procedures and instructions. We have been working on the realization of
knowledge-intensive structural NLP which can extract truly valuable
knowledge for human beings from an ever growing volume of texts, known
recently as Big Data. This talk introduces several of our on-going projects
concerning knowledge-intensive structural NLP: synonymous expression, case
frame and event relation acquisition from 15G parsed sentences, ellipsis
resolution considering exophora and author/reader information, an open
search engine infrastructure TSUBAKI, and an information analysis system
WISDOM.
Keywords: Knowledge-Intensive Structural NLP, Case Frame Acquisition, Event
Relation Acquisition
Slides

November 30 (Sun) 09:00-10:30

Guenter Neumann
Interactive Text Exploration
Today’s Web search is still dominated by a document perspective: a user
enters keywords that represent the information of interest and receives a
ranked list of documents. This technology has been shown to be very
successful, because it very often delivers concrete web pages that contain
the information the user is interested in.
If the user only has a vague idea of the information in question or just
wants to explore the information space, the current search engine paradigm
does not provide enough assistance. The user has to read through the
documents and eventually reformulate the query for finding new information.
Seen this way, current search engines are best suited for “one-shot
search” and do not support content-oriented interaction. In my talk, I will
present and discuss our efforts in building highly dynamic, scalable,
interactive, and intelligent text content exploration strategies that
support both the computer and the human in interactively “talking about
something”.
Keywords: information search, open information extraction, interactive text
exploration
Slides

Satoshi Tojo
Agent Communication and Belief Change
Communication between agents is not simple message passing. A rational
agent should send contents that are logically consistent in the situation.
In addition, there must be a communication channel between agents, e.g., an
address for the message recipient. Furthermore, the message can be publicly
announced, i.e., there can be simultaneous multiple recipients; otherwise
the message passing is a personal communication. Finally, the message
recipient must adequately maintain the consistency of their beliefs; that
is, as a result of message passing, the recipient must revise his/her
beliefs to be logically consistent. In this talk, I give an overview of research
concerning logical representation of communication and belief change,
especially in terms of modal logic, where belief change is realized by the
restriction of accessibility to some possible worlds. Thereafter, I show
some applications of the formalization, such as logical puzzles.
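
For concreteness, here is the textbook public-announcement update, in which
belief change is exactly the restriction of a model to the worlds satisfying
the announced formula (the standard definition, which may differ in detail
from the talk's formalism). Announcing \varphi in a Kripke model
\mathcal{M} = (W, R, V) yields:

\[
\mathcal{M}|\varphi = (W', R', V'), \qquad
W' = \{\, w \in W \mid \mathcal{M}, w \models \varphi \,\}, \qquad
R'_a = R_a \cap (W' \times W'), \qquad
V'(p) = V(p) \cap W'
\]
\[
\mathcal{M}, w \models [\varphi]\psi
\iff
\big(\mathcal{M}, w \models \varphi \;\Rightarrow\; \mathcal{M}|\varphi, w \models \psi\big)
\]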
Keywords: agent, communication, belief, dynamic epistemic logic
Slides

Yu Asano
Explanation Production with Open Data: Approach for Querying RDF Data by
Using Natural Language

The governments of many nations publish vast amounts of open data, such as
statistics and white papers. We introduce these open data activities and
discuss possibilities for producing explanations using open data. For
example, an answer from a question answering system such as “The population
is growing” can be supported by showing observed statistical data. From this
point of view, connecting a natural language expression to its evidence data
is a key method of explanation production. As a first step toward this
connection, we propose a method to query structured data by using
natural-language-like query expressions.
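
A minimal sketch of the target behaviour (our toy mapping and invented data,
not the proposed system): a natural-language-like question is reduced to a
SPARQL pattern over RDF statistics, here using the rdflib package:

# Build a tiny RDF graph of invented statistics and query it.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/stat/")
g = Graph()
g.add((EX.Tokyo, EX.population, Literal(14000000)))   # invented figure

# "What is the population of Tokyo?" -> pattern (ex:Tokyo ex:population ?v)
q = """
PREFIX ex: <http://example.org/stat/>
SELECT ?v WHERE { ex:Tokyo ex:population ?v }
"""
for row in g.query(q):
    print(row.v)   # 14000000: the evidence value behind the answer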
Keywords: Open Data, Explanation Extraction, Resource Description
Framework, Query Language, Natural Language
Slides