The Concept of Information in Big Data Processing
The need to transform existing algorithms in Big Data Systems is considered. The transformation must allow independent and parallel processing of separate fragments of data. The characteristic aspects of a well-organized intermediate compact form of information and its natural algebraic properties are studied and an illustrative example is provided.
In big data problems, the data are usually collected at many sites, have a huge volume, and are constantly supplemented by new pieces of data. It is often impossible to collect all the data needed for a research project on one computer, and it would be impractical in any case, since one computer could not process them in a reasonable time. An appropriate data analysis algorithm, working in parallel on many computers, should extract from each set of raw data some intermediate compact “information”, gradually combine and update it, and finally use the accumulated information to produce the result. When new data appear, the algorithm must extract information from them, add it to the accumulated information, and eventually update the result. We consider several examples of a suitable transformation of processing algorithms and discuss specific features of the emerging information spaces, in particular their algebraic properties. We also show that the information space can often be equipped with an order relation that reflects the "quality" of the information.
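The extract, combine, and produce-result scheme described above can be sketched for one simple task, computing the mean and variance of numbers spread over several fragments. This is a minimal illustration, not the authors' construction: the names `Info`, `extract`, `combine`, and `result` are hypothetical, and the choice of (count, sum, sum of squares) as the compact "information" is an assumption made for the example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Info:
    """Hypothetical compact 'information' for one data fragment."""
    n: int = 0        # number of values seen
    s: float = 0.0    # sum of the values
    ss: float = 0.0   # sum of the squared values


def extract(fragment: Iterable[float]) -> Info:
    """Reduce one raw fragment to its compact information (runs per site)."""
    xs = list(fragment)
    return Info(len(xs), sum(xs), sum(x * x for x in xs))


def combine(a: Info, b: Info) -> Info:
    """Merge two pieces of information; associative and commutative,
    with Info() as the neutral element."""
    return Info(a.n + b.n, a.s + b.s, a.ss + b.ss)


def result(info: Info) -> Tuple[float, float]:
    """Produce (mean, population variance) from accumulated information."""
    mean = info.s / info.n
    return mean, info.ss / info.n - mean * mean


# Three fragments that could live on three different machines.
fragments = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
total = reduce(combine, map(extract, fragments), Info())
mean, variance = result(total)
```

Because `combine` is associative and commutative, the fragments can be processed in any order and grouping, which is what allows independent parallel processing. The sum-of-squares formula is used here only for brevity; it can lose precision on badly scaled data.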
The article defines the concept, structure and contents of the intellectual potential of society and specifies the limits of the information space in which various crimes infringe on this potential. It also outlines the range of these crimes and describes ways to enhance the efficiency of criminal law in counteracting them. The author emphasizes the role of university scholarship in augmenting this potential and in the innovative development of the economy, as well as in the protection of creative workers' rights and lawful interests.
The article tracks the preservation and development of the Russian-language information space as a social and cultural phenomenon. The author studies various ways to overcome its disintegration and looks at issues connected with the necessity of harmonizing two principles: freedom of information and the securing of intellectual property rights. He presents proposals for methods of legal regulation of these problems.
The data in “big data” sets, as a rule, have a huge volume, are distributed among numerous sites and are constantly replenished. As a result, even the simplest analysis of big data faces serious difficulties. To apply traditional processing, all the relevant data have to be collected in one place and arranged in the form of convenient structures. Only then does the corresponding algorithm process these structures and produce the result of the analysis. In the case of big data, it can be simply impossible to collect all the relevant data on one computer, and it would be impractical in any case, since one computer could not process them in a reasonable time. An appropriate data analysis algorithm, working in parallel on many computers, should extract from each set of raw data some intermediate compact “information”, gradually combine and update it, and finally use the accumulated information to produce the result. Upon arrival of new pieces of data, it should be able to add them to the accumulated information and eventually renew the result. We will discuss specific features of such a well-arranged intermediate form of information, reveal its natural algebraic properties, and present several examples. We will also see that in many important data processing problems the appropriate information space can be equipped with an ordering which reflects the “quality” of the information. It turns out that such an intermediate form of information representation in some sense reflects the very essence of the information contained in the data. This leads us to a completely new, ‘practical’ approach to the notion of information.
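The idea of an ordering that reflects the “quality” of information, together with updating on arrival of new data, can be illustrated with a small sketch. It assumes a setting not taken from the text: independent noisy measurements of one unknown scalar with known noise variances, combined by standard inverse-variance weighting. The class `MeasurementInfo` and the functions `extract_measurement` and `estimate` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MeasurementInfo:
    """Hypothetical accumulated information about one unknown scalar."""
    precision: float = 0.0  # sum of 1 / variance over all measurements
    weighted: float = 0.0   # sum of value / variance over all measurements

    def __add__(self, other: "MeasurementInfo") -> "MeasurementInfo":
        # Combining information is plain addition of both components.
        return MeasurementInfo(self.precision + other.precision,
                               self.weighted + other.weighted)

    def __le__(self, other: "MeasurementInfo") -> bool:
        # Assumed 'quality' order: more total precision is better,
        # because it yields a smaller variance of the estimate.
        return self.precision <= other.precision


def extract_measurement(value: float, variance: float) -> MeasurementInfo:
    """Turn one raw measurement into its compact information."""
    return MeasurementInfo(1.0 / variance, value / variance)


def estimate(info: MeasurementInfo) -> Tuple[float, float]:
    """Return (estimate, variance of the estimate)."""
    return info.weighted / info.precision, 1.0 / info.precision


accumulated = extract_measurement(10.0, 4.0)
# A new measurement arrives: add its information, then renew the result.
updated = accumulated + extract_measurement(12.0, 4.0)
value, variance = estimate(updated)
```

In this sketch adding information never lowers quality: `accumulated <= updated` always holds, since precision is a sum of positive terms. Whether the ordering studied in the paper takes this form is not stated in the text.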
This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems”, which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the-art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.
The papers in this book comprise the proceedings of the 46th International Conference on Parallel Processing Workshops (ICPPW 2017), held on 14 August 2017 in Bristol, United Kingdom.
The author investigates issues related to the methodology and technology of applying a knowledge management system in an organization and describes the types of infrastructure needed to successfully carry out a complex project of introducing an organizational, social and technological system of knowledge management.
A form for an unbiased estimate of the coefficient of determination of a linear regression model is obtained. It is calculated by using a sample from a multivariate normal distribution. This estimate is proposed as an alternative criterion for a choice of regression factors.