Data

Temporal Dependency Trees in Children’s Stories

As part of the TERENCE project, we annotated a small corpus of children’s fables with temporal dependency trees, in which each story’s timeline is annotated as a tree-structured graph of temporal links between events; a rough sketch follows the references below. More details about the annotation process and temporal dependency parsing models are available in:

  • Steven Bethard, Oleksandr Kolomiyets, Marie-Francine Moens (2012): Annotating Story Timelines as Temporal Dependency Structures. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp. 2721–2726, 2012.
  • Oleksandr Kolomiyets, Steven Bethard, Marie-Francine Moens (2012): Extracting Narrative Timelines as Temporal Dependency Structures. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 88–97, Association for Computational Linguistics, Jeju Island, Korea, 2012.
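
Here is that sketch, in Python: every event is linked to a single parent, so following parent links always reaches the root. The event names and relation labels are placeholders rather than the actual annotation scheme, and this is not the corpus file format; both are defined in the papers above.

    # Sketch only: event names and relation labels are placeholders;
    # see the papers above for the real label inventory and file format.
    timeline = {
        # child event: (parent event, relation of child to parent)
        "saw":    ("ROOT",   "DEPENDS-ON"),
        "begged": ("saw",    "AFTER"),
        "gave":   ("begged", "AFTER"),
    }

    def path_to_root(event):
        """Follow parent links from an event up to the timeline's root."""
        path = [event]
        while event in timeline:
            event = timeline[event][0]
            path.append(event)
        return path

    print(path_to_root("gave"))  # ['gave', 'begged', 'saw', 'ROOT']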

Re-annotated TempEval 2010 Time Expressions

In working with the TempEval 2010 time expression recognition task, we found that a few time expressions, such as 23-year, a few days, and third-quarter, were missed by the annotators in the official test data. Our re-annotated version of the TempEval time expression test data can be used as a drop-in replacement for the original data; a sketch of the kinds of fixes involved follows the reference below. More details about the re-annotation process and the reasons to use this data instead of the original are available in:

  • Oleksandr Kolomiyets, Steven Bethard, Marie-Francine Moens (2011): Model-Portability Experiments for Textual Temporal Analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 271–276, 2011.
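
Here is that sketch: the type and value normalizations are my own TimeML-style guesses, not values copied from the released files, which keep the original TempEval format.

    # Sketch only: the type/value normalizations are guesses in the style
    # of TimeML, not values copied from the released files.
    missed_timexes = [
        {"text": "23-year",       "type": "DURATION", "value": "P23Y"},
        {"text": "a few days",    "type": "DURATION", "value": "PXD"},
        {"text": "third-quarter", "type": "DATE",     "value": "XXXX-Q3"},
    ]
    for timex in missed_timexes:
        print("{text}: {type} ({value})".format(**timex))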

Conjoined-Event Temporal and Causal Relations

We annotated a small corpus of conjoined-event temporal-causal relations. For example, given the sentence:

Fuel tanks had leaked and contaminated the soil.

we annotated the relations (leaked BEFORE contaminated) and (leaked CAUSED contaminated). The corpus includes 1000 pairs of events taken from the Wall Street Journal, with each event pair assigned both a temporal and a causal relation; a sketch of the pair structure follows the references below. More details about the annotation process and models for predicting such relations are available in:

  • Steven Bethard, William Corvey, Sara Klingenstein, James H. Martin (2008): Building a Corpus of Temporal-Causal Structure. In: Language Resources and Evaluation Conference (LREC), pp. 908–915, 2008.
  • Steven Bethard, James H. Martin (2008): Learning Semantic Links from a Corpus of Parallel Temporal and Causal Relations. In: Proceedings of ACL-08: HLT, Short Papers, pp. 177–180, 2008.
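
Here is that sketch: each of the 1000 pairs carries two parallel labels. The field names are my own, and the label inventories in the comments are abbreviated (the papers above give the full sets).

    # Sketch of a single annotated pair; field names are illustrative,
    # not the corpus format.
    pair = {
        "sentence": "Fuel tanks had leaked and contaminated the soil.",
        "events":   ("leaked", "contaminated"),
        "temporal": "BEFORE",  # e.g. BEFORE / AFTER / NO-REL
        "causal":   "CAUSAL",  # e.g. CAUSAL / NO-REL
    }

    # Because every pair carries both labels, the two layers can also be
    # read as one joint label, e.g. for training a single classifier:
    print("{}-{}".format(pair["temporal"], pair["causal"]))  # BEFORE-CAUSAL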

Verb-Clause Temporal Relations

We annotated a small corpus of verb-clause temporal relations. For example, given the sentence:

International Business Machines Corp. and Compaq Computer Corp. say the bugs will delay products.

we annotated the temporal relation (say BEFORE delay). The corpus includes 895 such pairs, taken from the Wall Street Journal section of the TimeBank; a sketch of the pair structure follows the reference below. More details about the data and annotation process are available in:

  • Steven Bethard, James H. Martin, Sara Klingenstein (2007): Finding Temporal Structure in Text: Machine Learning of Syntactic Temporal Relations. In: International Journal of Semantic Computing (IJSC), 1 (4), pp. 441–458, 2007.
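
Here is that sketch; the relation holds between a verb and the head verb of its clausal complement, and the field names are again my own rather than the distribution format.

    # Sketch of one verb-clause pair; field names are illustrative.
    example = {
        "sentence": ("International Business Machines Corp. and Compaq "
                     "Computer Corp. say the bugs will delay products."),
        "verb":        "say",     # the governing verb
        "clause_head": "delay",   # head verb of the clausal complement
        "relation":    "BEFORE",  # the paper gives the full label set
    }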

Opinions and Opinion Holders

We annotated opinions and opinion holders for verbs with clausal complements in both FrameNet data and PropBank data. For example, given the sentence:

Still, Vista officials realize they’re relatively fortunate.

we annotated Vista officials as the opinion holder and they’re relatively fortunate as the opinion; a sketch follows the reference below. More details about the data and the models we built from it are available in:

  • Steven Bethard, Hong Yu, Ashley Thornton, Vasileios Hatzivassiloglou, Dan Jurafsky (2005): Extracting opinion propositions and opinion holders using syntactic and lexical cues. In: Shanahan, James; Qu, Yan; Wiebe, Janyce (Ed.): Computing Attitude and Affect in Text: Theory and Applications, pp. 125–141, Springer, 2005.
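
Here is that sketch, with field names of my own; in the actual data the spans are aligned to the underlying FrameNet and PropBank annotations.

    # Sketch of one opinion annotation; field names are illustrative.
    # The holder and opinion are spans anchored on a verb ("realize")
    # that takes a clausal complement.
    annotation = {
        "sentence": "Still, Vista officials realize they're relatively fortunate.",
        "verb":     "realize",
        "holder":   "Vista officials",
        "opinion":  "they're relatively fortunate",
    }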

The corpora on which this annotation was performed have changed somewhat since this work was done, but I have mostly managed to re-align the data. The PropBank data should be almost identical to that used in the paper, while the FrameNet data is about 50 sentences smaller, since some of the sentences we annotated are missing from the newer versions of FrameNet.