Abstracting concepts from text documents by using an ontology
The taxonomy of common northern nudibranch molluscs of the genus Dendronotus in the vast cold regions of Eurasia remains largely unknown. Abundant material collected in many localities from the Barents Sea, via the Arctic region, to the north-west Pacific was analysed for the first time. An integrated approach combining morphological and ontogenetic data with molecular four-gene (COI, 16S, H3, and 28S) analysis reveals seven species, including three previously undescribed. Dendronotus frondosus (Ascanius, 1774) and Dendronotus dalli Bergh, 1879 were commonly considered as amphiboreal species; however, according to this study they are restricted to the North Atlantic and the North Pacific, respectively. In the north-west Pacific two new species were discovered, Dendronotus kamchaticus sp. nov. and Dendronotus kalikal sp. nov., that are externally similar to D. frondosus, but that show significant distance according to molecular analysis and are considerably different in radular morphology. In the North Atlantic a new species Dendronotus niveus sp. nov., sibling to North Pacific D. dalli, is revealed. The separate status of North Atlantic Dendronotus lacteus (Thompson, 1840) is confirmed, including considerable range extension. The essential similarity of early ontogenetic stages of radular development common for species with disparate adult radular morphology (such as D. frondosus and D. dalli) is shown, and its importance for taxonomy is discussed.
A two-step approach to taxonomy construction is presented. In the first step, the frame of the taxonomy is built manually from representative educational materials. In the second step, the frame is refined using the Wikipedia category tree and articles. Since the structure of Wikipedia is rather noisy, a procedure for cleaning the Wikipedia category tree is suggested. A string-to-text relevance score, based on annotated suffix trees, is used in three places: 1) to clear noise from the Wikipedia data; 2) to assign Wikipedia categories to taxonomy topics; 3) to decide whether a category should be assigned to a taxonomy topic or stay on an intermediate level. The resulting taxonomy consists of three parts: the manually set upper levels, the adopted Wikipedia category tree, and the Wikipedia articles as leaves. In addition, a set of so-called descriptors is assigned to every leaf; these are phrases explaining aspects of the leaf topic. The method is illustrated by its application to two domains: a) "Probability theory and mathematical statistics" and b) "Numerical analysis" (both in Russian).
A suffix-tree based method for measuring the similarity of a key phrase to an unstructured text is proposed. The measure involves less computation and does not depend on the length of the text or the key phrase. It applies to the following tasks in semantic text analysis:
Finding interrelations between key phrases over a set of texts;
Annotating a research article by topics from a taxonomy of the domain;
Clustering relevant topics and mapping clusters on a domain taxonomy.
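As an illustration of how a key phrase can be scored against a text with an annotated suffix tree, here is a minimal Python sketch. It is not the authors' implementation. It assumes one common formulation: each node of a character-level suffix trie is annotated with the number of times its fragment occurs, each suffix of the key phrase is scored by the average conditional probability along its longest match, and those scores are averaged over all suffixes. The names `build_ast`, `relevance` and the `max_len` cut-off are hypothetical choices made for this example.

```python
class Node:
    """Trie node annotated with the frequency of the fragment it represents."""

    def __init__(self):
        self.children = {}
        self.count = 0


def build_ast(text, max_len=10):
    """Build an annotated suffix trie over `text` (assumed formulation).

    Each suffix is truncated to `max_len` characters, an illustrative
    cut-off that keeps the trie small.
    """
    root = Node()
    for i in range(len(text)):
        node = root
        for ch in text[i:i + max_len]:
            node = node.children.setdefault(ch, Node())
            node.count += 1
    # The root is annotated with the total frequency of its children.
    root.count = sum(child.count for child in root.children.values())
    return root


def _suffix_score(root, suffix):
    """Average conditional probability along the longest match of `suffix`."""
    node = root
    probs = []
    for ch in suffix:
        child = node.children.get(ch)
        if child is None:
            break
        probs.append(child.count / node.count)
        node = child
    return sum(probs) / len(probs) if probs else 0.0


def relevance(root, phrase):
    """Mean suffix score of `phrase`; a value between 0 and 1."""
    if not phrase:
        return 0.0
    scores = [_suffix_score(root, phrase[i:]) for i in range(len(phrase))]
    return sum(scores) / len(scores)


tree = build_ast("abracadabra")
print(relevance(tree, "abra") > relevance(tree, "abzz"))
```

Because each suffix score is an average over the matched characters and the total is an average over suffixes, the value stays between 0 and 1 whatever the lengths of the text and the phrase, which is one way to read the length-independence claim above.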
Any company consists of business processes, and enterprise management is in practice performed by managing them. Every process requires proper information support and cannot be managed successfully without it; hence neither can the company as a whole. However, creating such a support system is a serious problem. Content analysis based on the use of corporate taxonomies is a simple but very effective means of enterprise knowledge management. It provides serious competitive advantages and new levers for business process optimization. Combined with an entropy-based approach, it yields remarkable results in real-time online business process monitoring. The general idea is to extend Shannon's well-known calculation of message entropy, which is defined over a standard alphabet, to a wider alphabet made up of business process instances. The result is a procedure for obtaining a set of taxonomies that form the basis of a business process alphabet, and for calculating the actual entropy of business process transactions, which in turn reflects the stability and correctness of business process execution. This is thus the simplest and least expensive method in BPM practice.
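The entropy calculation referred to above can be sketched as follows. This is a minimal illustration of Shannon's formula H = sum over symbols of p * log2(1/p), applied to a sequence of observed process events. It assumes that each transaction has already been mapped to a symbol of the business process alphabet; the mapping itself, the function name `process_entropy` and the event labels are hypothetical and are not taken from the source.

```python
from collections import Counter
from math import log2


def process_entropy(events):
    """Shannon entropy, in bits, of a sequence of process-event symbols.

    `events` is any sequence of hashable labels, one per observed
    transaction (an assumption made for this sketch).
    """
    total = len(events)
    if total == 0:
        return 0.0
    counts = Counter(events)
    # H = sum over symbols of p * log2(1 / p)
    return sum((n / total) * log2(total / n) for n in counts.values())


# Hypothetical event labels, used only for illustration.
stable_run = ["approve", "approve", "approve", "approve"]
mixed_run = ["approve", "reject", "escalate", "rework"]
print(process_entropy(stable_run), process_entropy(mixed_run))
```

Under this reading, a run in which every transaction follows the same path has zero entropy, while a run spread evenly over many paths has high entropy, so a change in the value over time could serve as a monitoring signal.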
Species of the genus Dendronotus are among the most common nudibranchs in the Northern Hemisphere. However, their distribution and composition in the north-west Pacific remain poorly explored. In the present study, we examined the composition of Dendronotus in the northwestern part of the Sea of Japan using an integrative approach that included morphological and molecular phylogenetic analyses and molecular species delimitation methods. These multiple methods revealed high cryptic diversity within the genus. Two specimens of Dendronotus frondosus were found in Amursky Bay, and its amphiboreal status was therefore confirmed. In three locations in the Sea of Japan we found specimens that are externally very similar to D. frondosus but show significant distance according to molecular analysis. We show that these specimens belong to a new species, Dendronotus dudkai sp. n. This species is sister to D. frondosus according to morphological and molecular data; the question of their sympatric coexistence is therefore discussed. Dendronotus kamchaticus is recorded in the Sea of Japan for the first time, and updated information on intraspecific variation in this species is provided.