HITIQA

Description

This project is part of the three-phase AQUAINT Program. We are developing automated question-answering technology to assist intelligence analysts in their daily activities. Visit the official HITIQA website to follow the project’s progress.

HITIQA partners include:

University at Albany (USA)

For more information, please contact:

Prof. Tomek Strzalkowski, Principal Investigator
University at Albany, SUNY
ILS Room 262B, Social Science Building
Albany, NY 12222.
Email: tomek [at] albany [dot] edu
Phone: 518-442-2608; Fax: 518-442-2606

Executive Summary

National security depends critically on accurate, high-quality information being available at the right time to support important policy decisions. A key element of this process is the work of the intelligence analyst, who must (often quickly) produce the right information from a potentially enormous number of sources, reports, documents and databases. Today’s information retrieval technology provides some help, but clearly more is needed.

High-Quality Interactive Question-Answering (HITIQA) technology will allow analysts and other users of information systems to pose questions in natural language and obtain relevant, factual answers, or the assistance they require in order to perform their tasks. For example, the question “How long does it take to fly from New York to Paris on a Concorde?” would be expected to generate the answer “The Concorde service to Paris is still suspended – the usual flight time is 3.5 hours.” Similarly, a request such as “What recent disasters occurred in tunnels used for transportation?” would produce a list or a table of appropriate facts organized according to the analyst’s instructions, while “What was Russia’s reaction to U.S. bombing of Kosovo?” would result in a comprehensive report on the issue.

These exchanges will not happen in isolation; in most cases the system must engage the analysts in a dialogue to clarify their intentions and goals, while they navigate visually through multidimensional information space. The information necessary to answer analysts’ requests may be available to the system (although this is by no means guaranteed), but its exact format is not known a priori: it could be a database record, a short text passage, or it could be scattered among many documents; it could be stated explicitly or it may have to be inferred.

The analysts, of course, could find the answers they require by searching the available data using other access means, e.g., using a document retrieval system or database search with structured queries; however, none of these would quite match the convenience and directness of HITIQA.

In HITIQA, information delivered to users is not only relevant but also useful and tailored to the tasks they are performing. Moreover, the information is of the highest quality possible relative to the user’s task and needs: it is as timely, reliable, trustworthy, and accurate as it can be, and has a degree of confidence attached.

This project aims to make significant advances in the state of the art of automated question answering by focusing on the following key research issues:

  • Question Semantics: how the system “understands” user requests
  • Human-Computer Dialogue: how the user and the system negotiate this understanding
  • Information Quality Metrics: what makes some information better than other information
  • Information Fusion: how to assemble the answer that fits user needs

The project will involve several cycles of experiments with users performing a variety of tasks. Empirical data gathered from these experiments will be analyzed to induce models for automated assessment of information quality and for optimizing information fusion. These models will be embedded in the evolving HITIQA prototype. The concept is illustrated in Figure 1. The emerging system will undergo a series of formal evaluations: user-centered evaluations based on the above experiments, and Government-mandated quantitative evaluations.

HITIQA Concept & components

This project covers aspects of three technical areas identified in the AQUAINT Program:

    1. Question Understanding and Interpretation. Our project covers: methods of determining the exact meaning of analysts’ questions, interacting with the user, refining and clarifying question context, and providing feedback on the system’s “understanding” of analysts’ questions through Q&A sessions.
    2. Determining the Answer. Our project covers: advanced retrieval, extraction and fusion of information from multiple sources and “documents”; dealing with incompatible and contradictory information and weighting credibility of sources through information quality assessment.
    3. Formulating and Presenting the Answer. Our project covers methods of fusing information pieces from disparate sources into the sort of coherent, well-formed answer that a human would be expected to compose.

HITIQA addresses substantial, unsolved questions specifically relevant to the larger, more broadly defined goals of the AQUAINT Program, as well as the transferability of our algorithms and technical approach to the other four data dimensions in the call. By focusing on the interface with the user, and building adaptive models of the user’s short-term goals and long-term preferences, we have an approach that can be transferred to other media, and to multiple collections, with little additional work. In addition, we are attacking the three environmental factors, as described: scalability, analysis and synthesis across multiple documents, and dealing with extreme data situations.

Summary of Project Accomplishments

Intellectual Accomplishments

The AQUAINT program provides a unique environment to understand the information and technology needs of intelligence analysts and to capture this understanding in the system design. We believe HITIQA captures the dynamics of the analytical process to a greater extent than any existing factoid QA system. This is because HITIQA does not simply attempt to find a literal “answer” (whatever that may be); instead, it conducts a multi-modal interactive dialogue with users to help them rapidly assemble comprehensive reports addressing complex analytical problems.

The HITIQA project has produced significant intellectual and scientific advances, including:

  • Analytical question answering for complex, multi-faceted intelligence problems
  • Multi-modal, problem-solving human-computer dialogue
  • Rapid extraction and structuring of event information from text
  • Automated classification of content by aspects and information quality
  • Novel methods for evaluating advanced technology and introducing it to the intelligence community (IC)

System Capabilities

HITIQA is a new analytical question answering system that has been built entirely during the AQUAINT program. It is a fully automated interactive question answering system, which combines language-based dialogue and visual navigation of the information space.

HITIQA answers complex analytical questions, explanatory questions, and hypothetical questions – capabilities that are not available in factoid QA systems built before the AQUAINT program began. The answer is delivered in the form of a compact draft report, which includes excerpts of original text sources organized into topically labeled electronic folders.

HITIQA conducts multi-modal dialogue with the user, mediating between the analyst’s information need and the available data in order to converge on the desired answer. The system anticipates and explores related, adjunct information, allowing the analysts to rapidly assemble comprehensive reports. Robust, data-driven semantics for natural language questions allows the system to operate efficiently in both open and specialized domains.

We have developed novel methods for abstract visualization of open-domain information, in such a way as to support the user in efficient exploration of the answer space. Furthermore, the visualization has been integrated with the language-based dialogue, allowing for multi-modal interaction (Figure 1).

HITIQA Multi-modal Interaction Panel

In addition to returning topical information, HITIQA assesses the quality of information sources and alerts the analyst if the data might be questionable in some respects. For complex topics, the system also classifies information into aspectual facets such as political, military, financial, etc., to further speed up answer comprehension.

HITIQA is a fully implemented prototype system, now available in version 2 (HITIQA-2).

More information about HITIQA can be found in some 20 publications, including a recent paper presented at the International Conference on Intelligence Analysis, cited below.

Strzalkowski, T., S. Small, H. Hardy, B. Yamrom, T. Liu, P. Kantor, K.B. Ng, N. Wacholder. HITIQA: A Question Answering Analytical Tool. Proceedings of the International Conference on Intelligence Analysis, McLean, VA, May 2-4, 2005.

IARPA