Book chapter

Improving Text Retrieval Efficiency with Pattern Structures on Parse Thickets

P. 6-21.

We develop a graph representation and learning technique for the parse structures of paragraphs of text. We introduce the Parse Thicket (PT) as a sum of syntactic parse trees augmented by a number of arcs for inter-sentence word-word relations such as co-reference and taxonomic relations. These arcs are also derived from other sources, including Speech Act and Rhetoric Structure theories. The operation of generalizing logical formulas is extended to parse trees and then to parse thickets in order to compute similarity between texts. We provide a detailed illustration of how PTs are built from parse trees and then generalized. The proposed approach is given a preliminary evaluation in the product search domain of eBay.com, where user queries include product names, features and expressions of user needs, and where query keywords occur in different sentences of an answer. We demonstrate that search relevance is improved by PT generalization.

In book

Edited by: S. Kuznetsov, C. Carpineto, A. Napoli. Vol. 977. M.: CEUR Workshop Proceedings, 2013.