TREC Knowledge Base Acceleration

What is a knowledge base and why accelerate it?

Here is an example of an entity from Wikipedia that illustrates several properties that make KBA research interesting:

KBA is related to several existing research activities in text analytics and text retrieval, including entity linking, relation extraction, knowledge base population, and topic detection & tracking. KBA combines elements of these lines of thinking by asking researchers to invent systems to participate in the human-driven process of assimilating information into a knowledge base (KB), like Wikipedia (WP) or Freebase (FB).

Incoming streams of new information are so large that even if a content routing engine perfectly connected each piece of inbound content with the appropriate human curators and KB nodes, the humans would still fall behind. A routing system must therefore run open-loop, without human feedback, for extended periods of time while it accumulates evidence about entities in the KB.
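To make the open-loop idea concrete, here is a minimal sketch of a router that reads a time-stamped stream and accumulates evidence per KB entity without any human feedback. It is illustrative only and is not the KBA task definition or any participant's system: the `StreamItem` type, the `accumulate_evidence` function, the sample entity, and the naive surface-name matching are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StreamItem:
    timestamp: int  # seconds since epoch (hypothetical field)
    text: str       # document text (hypothetical field)


def accumulate_evidence(stream, entity_names):
    """Group stream items by the KB entities they appear to mention.

    Runs open-loop: no curator feedback is consulted while processing.
    """
    evidence = defaultdict(list)
    # Process items in time order, as a stream would deliver them.
    for item in sorted(stream, key=lambda i: i.timestamp):
        lowered = item.text.lower()
        for entity, names in entity_names.items():
            # Naive surface-form match (an assumption); a real system
            # would need proper entity linking and disambiguation.
            if any(name.lower() in lowered for name in names):
                evidence[entity].append(item)
    return dict(evidence)


# Hypothetical stream and hypothetical KB entity with two surface names.
stream = [
    StreamItem(1325376000, "Alice Example opened a new lab in Boston."),
    StreamItem(1325462400, "Weather report: snow expected this weekend."),
    StreamItem(1325548800, "A. Example was awarded a research prize."),
]
entity_names = {"Alice_Example": ["Alice Example", "A. Example"]}

evidence = accumulate_evidence(stream, entity_names)
for entity, items in evidence.items():
    print(entity, len(items))
```

The accumulated items per entity are what a curator would eventually review; the point of the sketch is only that the grouping happens continuously, before any human looks at it.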

The KBA track is a forum for examining issues related to creating such systems, including:

To begin studying these issues, we are generating a time-stamped corpus of Web, news, and social media content spanning a multi-month period. We are creating training and evaluation data by manually labeling this corpus with passage selections associated with KB nodes. In future years, we may consider other knowledge bases and streams, such as the stream of new articles in PubMed and KBs about proteins.

For the first year of KBA (TREC 2012), we conducted a simple filtering task, and 11 teams submitted runs from 43 algorithmic approaches. KBA is continuing in TREC 2013.