Book chapter

Text Classification into Abstract Classes Based on Discourse Structure

P. 201-207.
Galitsky B., Ilvovsky D., Kuznetsov S.

The problem of classifying a text as a document or a metadocument is formulated, and its application areas are proposed. An algorithm is proposed for document classification tasks where word counts are insufficient to differentiate between such abstract classes of text as metalanguage and object-level. We extend the parse tree kernel method from the level of individual sentences to the level of paragraphs, based on anaphora, rhetorical structure relations, and communicative actions linking phrases in different sentences. A tree kernel learning technique is applied to these extended trees to leverage the additional discourse-related information. We evaluate our approach in the domain of action-plan documents.

In book

Edited by: G. Angelova, K. Boncheva, R. Mitkov. Hissar: 2015.