Проблемы обработки естественного языка в диалоговых системах
Nowadays, a field of dialogue systems and conversational agents is one of the rapidly growing research areas in artificial intelligence applications. Business and industry are showing increasing interest in implementing intelligent conversational agents into their products. Many recent studies has tended to focus on possibility of developing task-oriented systems which are able to have long and free social chats that occur naturally in social human interactions. In order to better understand the user’s expression, and then feedback the correct information, natural language understanding plays an extremely important role. Despite progress made in solving NLP problems, it remains very challenging today in the field of dialogue systems. In this paper, we review the recent progress in developing dialogue systems, its current architecture features and further prospects. We focus on the natural language understanding tasks which are key for building a good conversational agent, and than we are summarizing NLP methods and frameworks, in order to allow researchers to study the potential improvements of the state-of-the-art dialogue systems. Additionally, we consider the dialogue concept in context of human-machine interaction, and briefly describe dialogue evaluation metrics.
The paper describes the structure and possible applications of the theory of K-representations (knowledge representations) in bioinformatics and in the development of a Semantic Web of a new generation. It is an original theory of designing semantic-syntactic analyzers of natural language (NL) texts with the broad use of formal means for representing input, intermediary, and output data. The current version of the theory is set forth in a monograph by V. Fomichov (Springer, 2010). The first part of the theory is a formal model describing a system consisting of ten operations on conceptual structures. This model defines a new class of formal languages – the class of SK-languages. The broad possibilities of constructing semantic representations of complex discourses pertaining to biology are shown. A new formal approach to developing multilingual algorithms of semantic-syntactic analysis of NL-texts is outlined. This approach is realized by means of a program in the language PYTHON.
The book contains the proceedigs of the 18th International Conference on Automatic Processing of Natural Langage (France, Montpellie, 27th June - 1st July 2011).
This paper presents an algorithm that allows the user to issue a query pattern, collects multi-word expressions (MWEs) that match the pattern, and then ranks them in a uniform fashion. This is achieved by quantifying the strength of all possible relations between the tokens and their features in the MWEs. The algorithm collects the frequency of morphological categories of the given pattern on a unified scale in order to choose the stable categories and their values. For every part of speech, and for all of its categories, we calculate a normalized Kullback-Leibler divergence between the category’s distribution in the pattern and its distribution in the corpus overall. Categories with the largest divergence are considered to be the most significant. The particular values of the categories are sorted according to a frequency ratio. As a result, we obtain morphosyntactic profiles of a given pattern, which includes the most stable category of the pattern, and their values.
This paper is an overview of the current issues and tendencies in Computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. The modern approaches to the traditional NLP domains such as pos-tagging, syntactic parsing, machine translation are discussed. The highlights of automated information extraction, such as fact extraction, opinion mining are also in focus. The main tendency of modern technologies in Computational linguistics is to accumulate the higher level of linguistic analysis (discourse analysis, cognitive modeling) in the models and to combine machine learning technologies with the algorithmic methods on the basis of deep expert linguistic knowledge.
This paper concerns discourse-new mention detection in Russian. This might be helpful for different NLP applications such as coreference resolution, protagonist identification, summarization and different tasks of information extraction to detect the mention of an entity newly introduced into discourse. In our work, we are dealing with the Russian where there is no grammatical devices, like articles in English, for the overt marking a newly introduced referent. Our aim is to check the impact of various features on this task. The focus is on specific devices for introducing a new discourse prominent referent in Russian specified in theoretical studies. We conduct a pilot study of features impact and provide a series of experiments on detecting the first mention of a referent in a non-singleton coreference chain, drawing on linguistic insights about how a prominent entity introduced into discourse is affected by structural, morphological and lexical features.
A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and evaluation results are presented.
This workshop is about major challenges in the overall process of MWE treatment, both from the theoretical and the computational viewpoint, focusing on original research related to the following topics:Manually and automatically constructed resources Representation of MWEs in dictionaries and ontologies MWEs in linguistic theories like HPSG, LFG and minimalism MWEs and user interaction Multilingual acquisition Multilingualism and MWE processing Models of first and second language acquisition of MWEs Crosslinguistic studies on MWEs The role of MWEs in the domain adaptation of parsers Integration of MWEs into NLP applications Evaluation of MWE treatment techniques Lexical, syntactic or semantic aspects of MWEs
Due to evolutionary increase of the software complexity, graphical user interfaces have been becoming more and more complex, and often this process encounters the restricted possibilities of reflecting graphics on a display. This paper describes an original approach to designing natural language interfaces for the action-based applications as the complements and alternatives to traditional graphical interfaces. The principal peculiarity of this approach is that it is based on a formal model of natural language structured meanings. The first part of the paper outlines the main properties of the class of SK-languages (standard knowledge languages) which is used for building semantic representations of user commands. The class of SK-languages can be employed for representing the meanings of input texts from practically interesting sublanguages of Russian, English, French, German and other natural languages. The theory of K-representations (knowledge representations), introducing the class of SK-languages, was developed by V. A. Fomichov and described in his monograph published by Springer in 2010. The second part presents an application of this theory to the design of a natural language interface to an action-based software system, it is called NLC-1 (Natural Language Commander – Version 1).
We consider certain spaces of functions on the circle, which naturally appear in harmonic analysis, and superposition operators on these spaces. We study the following question: which functions have the property that each their superposition with a homeomorphism of the circle belongs to a given space? We also study the multidimensional case.
We consider the spaces of functions on the m-dimensional torus, whose Fourier transform is p -summable. We obtain estimates for the norms of the exponential functions deformed by a C1 -smooth phase. The results generalize to the multidimensional case the one-dimensional results obtained by the author earlier in “Quantitative estimates in the Beurling—Helson theorem”, Sbornik: Mathematics, 201:12 (2010), 1811 – 1836.
We consider the spaces of function on the circle whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C1 -smooth phase.
I give the explicit formula for the (set-theoretical) system of Resultants of m+1 homogeneous polynomials in n+1 variables