Evaluating Array DBMS Compression Techniques for Big Environmental Datasets
Immense volumes of geospatial arrays are generated daily. Examples include satellite imagery, numerical simulation output, and the resulting avalanche of derivative data. Array DBMSs are among the prominent tools for working with large geospatial arrays. The arrays usually originate as raster files. ChronosDB is a novel distributed, file-based, geospatial array DBMS (chronosdb.gis.land). ChronosDB operates directly on raster files, delegates array processing to existing, elaborate command-line tools, and outperforms SciDB by up to 75× on average. This demonstration will showcase three new components of ChronosDB that let users interact with the system and appreciate its benefits: (i) a Web GUI (edit and submit queries and get the output), (ii) an execution plan explainer (investigate the generated DAG), and (iii) a dataset visualizer (display ChronosDB arrays on an interactive web map).
With the emergence of huge amounts of big scientific data that need to be stored and processed, support for large multidimensional arrays has gained close attention in the database world. Devising dedicated database engines that support the array data model became a pressing issue. Developing a well-organized database management system built on such an uncommon data model required the following tasks: formally defining the data model, building a formal algebra that operates on objects of that model, and devising optimization rules first at the logical level and then at the physical level. These tasks have already been completed by the creators of several array databases. This paper reviews array formalization, core algebra, and optimization techniques using AML, RasDaMan, and SciDB as examples: three existing array database management systems with different algebras and optimization approaches.
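To make the three tasks concrete, the following is a minimal sketch, not the formalism of AML, RasDaMan, or SciDB. It assumes a toy data model (a dense array over a box-shaped integer domain), two hypothetical algebra operators (`subarray` and `map_cells`), and one logical-level rewrite rule: a subarray taken after a cell-wise map can be pushed below the map. All names here are illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Tuple

Index = Tuple[int, ...]


@dataclass(frozen=True)
class Array:
    """Toy data model: a box-shaped domain plus one value per index."""
    lo: Index                  # inclusive lower corner of the domain
    hi: Index                  # exclusive upper corner of the domain
    cells: Dict[Index, float]  # value stored at every index in the domain


def build(lo: Index, hi: Index, fn: Callable[[Index], float]) -> Array:
    """Create an array by evaluating fn at every index in [lo, hi)."""
    indices = product(*(range(a, b) for a, b in zip(lo, hi)))
    return Array(lo, hi, {i: fn(i) for i in indices})


def subarray(a: Array, lo: Index, hi: Index) -> Array:
    """Algebra operator: restrict a to the box [lo, hi), clipped to its domain."""
    lo = tuple(max(x, y) for x, y in zip(lo, a.lo))
    hi = tuple(min(x, y) for x, y in zip(hi, a.hi))
    return build(lo, hi, lambda i: a.cells[i])


def map_cells(a: Array, fn: Callable[[float], float]) -> Array:
    """Algebra operator: apply fn independently to every cell value."""
    return Array(a.lo, a.hi, {i: fn(v) for i, v in a.cells.items()})


# A 4x4 input array whose cell (r, c) holds 4*r + c.
a = build((0, 0), (4, 4), lambda i: float(i[0] * 4 + i[1]))

calls = []  # records every cell the mapped function touches


def double(v: float) -> float:
    calls.append(v)
    return 2 * v


# Plan 1: map over the whole array, then cut out the 2x2 window.
naive = subarray(map_cells(a, double), (1, 1), (3, 3))
naive_calls = len(calls)

# Plan 2 (rewritten): cut out the window first, then map only those cells.
calls.clear()
pushed = map_cells(subarray(a, (1, 1), (3, 3)), double)
pushed_calls = len(calls)

print(naive == pushed, naive_calls, pushed_calls)
```

Both plans produce the same array, but the rewritten plan evaluates the function on 4 cells instead of 16. A logical-level optimization rule captures this kind of equivalence; physical-level choices such as chunking or storage layout are outside the scope of this sketch.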