What Is Data Availability?
Data availability is a term used by computer storage manufacturers and storage service providers (SSPs) to describe the products and services that keep data accessible at a required level of performance.
- Data availability is a term used by computer storage manufacturers and storage service providers (SSPs) to describe products and services that ensure data continues to be available at a required level of performance in situations ranging from normal operation through a complete failure ("crash"). In general, data availability is achieved through redundancy both in where data is stored and in how it can be reached. Some vendors describe the need for a data center with a storage-centric rather than a server-centric philosophy and environment.
- In large enterprise computer systems, computers typically access data through high-speed optical fibre connections to storage devices. Among the best-known systems for this kind of access are ESCON and Fibre Channel. Storage devices are usually organized as a redundant array of independent disks (RAID). The flexibility to add and reconfigure a storage system, and to switch automatically to a backup or failover environment, is provided by programmable or manually controlled switches, often called controllers.
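To illustrate the basic redundancy idea behind such configurations, the following toy Python sketch (an assumption-laden illustration, not a description of any actual RAID or controller implementation) mirrors every write to two independent stores so that data stays readable if one copy is lost:

```python
# Toy illustration of mirroring (the idea behind RAID-1-style redundancy):
# every block is written to two independent stores and can be read back
# as long as at least one copy survives. Purely illustrative.

class MirroredStore:
    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, block_id: int, data: bytes) -> None:
        self.primary[block_id] = data
        self.replica[block_id] = data

    def read(self, block_id: int) -> bytes:
        # Fall back to the replica if the primary copy is lost.
        if block_id in self.primary:
            return self.primary[block_id]
        return self.replica[block_id]

store = MirroredStore()
store.write(7, b"payroll-records")
del store.primary[7]   # simulate losing the primary copy
print(store.read(7))   # data is still available from the replica
```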
- Two increasingly popular ways to provide data availability are the storage area network (SAN) and network-attached storage (NAS). Data availability can be measured in terms of the percentage of time the data is accessible (one vendor, for example, promises 99.999% availability) and how much data can flow at once (the same vendor promises a rate of 3200 megabytes per second). [1]
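As a rough illustration of what an availability percentage implies in practice, the short Python sketch below (purely illustrative, not part of the cited definition) converts figures such as 99.999% into the maximum downtime they allow per year:

```python
# Illustrative sketch: translate an availability percentage into allowed downtime.
# The 99.999% ("five nines") figure comes from the vendor example above;
# everything else is an assumption for illustration.

MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years for simplicity

def allowed_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year consistent with the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction

for nines in (99.9, 99.99, 99.999):
    print(f"{nines}% availability -> at most "
          f"{allowed_downtime_minutes(nines):.1f} min of downtime per year")
```

At 99.999% ("five nines"), the allowed downtime works out to roughly five minutes per year.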
- Researchers generally believe that the availability of data can be examined from five aspects: data consistency, accuracy, completeness, timeliness, and entity identity; their specific definitions are given in [2].
- Researchers in China and abroad have also done a great deal of work on data availability assessment; this work is typically introduced and analyzed along the same five dimensions of data consistency, accuracy, completeness, timeliness, and entity identity [2].
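To make these five dimensions more concrete, the hypothetical Python sketch below computes naive indicators for completeness, timeliness, and consistency over a small tabular dataset; the field names, thresholds, and formulas are assumptions made for illustration and are not taken from the cited work:

```python
# Hypothetical illustration of simple per-dimension data quality indicators.
# Field names and thresholds are assumptions, not from the cited literature.
from collections import Counter
from datetime import datetime, timedelta

records = [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "updated": datetime(2024, 1, 10)},
    {"id": 2, "name": "Bob",   "email": None,                "updated": datetime(2021, 6, 1)},
    {"id": 2, "name": "Bobby", "email": "bob@example.com",   "updated": datetime(2023, 3, 5)},
]

def completeness(rows, fields):
    """Fraction of non-missing values over the listed fields."""
    total = len(rows) * len(fields)
    present = sum(1 for r in rows for f in fields if r.get(f) is not None)
    return present / total

def timeliness(rows, max_age_days, now=datetime(2024, 6, 1)):
    """Fraction of records updated within the allowed age."""
    fresh = sum(1 for r in rows if now - r["updated"] <= timedelta(days=max_age_days))
    return fresh / len(rows)

def consistency(rows, key="id"):
    """Fraction of key values that map to exactly one record (no conflicting duplicates)."""
    counts = Counter(r[key] for r in rows)
    return sum(1 for c in counts.values() if c == 1) / len(counts)

print("completeness:", completeness(records, ["name", "email"]))
print("timeliness:  ", timeliness(records, max_age_days=365))
print("consistency: ", consistency(records))
```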
- Ensuring data availability is a difficult task. Given the four characteristics of big data (huge volume, high generation speed, complex and varied data types, and high overall value but low value density), ensuring the availability of big data becomes even more difficult. To address these characteristics, the following five challenging research questions on big data availability need to be solved [3].
- Theory and technology of high-quality big data acquisition and integration
- The acquisition of high-quality data is an important prerequisite for ensuring data availability. Massive data comes from many sources (such as cyber-physical systems, the Internet of Things, and data resources on the Internet), its modalities vary widely (relational data, XML data, graph data, stream data, scalar data, and vector data), its quality is uneven, and its processing and integration are difficult. These problems are particularly serious in the context of today's rapid advances in sensor networks, cyber-physical systems, and the Internet of Things, and of the big data they generate. We therefore need to solve the following challenging problems: controlling quality at the data acquisition stage, exploring theories and methods for effectively obtaining high-quality big data from multiple sources such as cyber-physical systems, studying efficient data filtering methods, and establishing theories and algorithms for multimodal big data fusion computing, so as to achieve high-quality data acquisition and precise integration and then discover the laws of data evolution.
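As a toy illustration of quality control at the acquisition stage, the sketch below filters incoming sensor readings against simple validity constraints before integration; the source, field names, and bounds are assumptions made for the example, not part of the cited research agenda:

```python
# Toy acquisition-stage filter: drop readings that violate simple validity rules.
# Source names, fields, and bounds are illustrative assumptions.

raw_readings = [
    {"sensor": "s1", "temperature_c": 21.5,   "timestamp": 1700000000},
    {"sensor": "s2", "temperature_c": -999.0, "timestamp": 1700000003},  # sentinel/garbage value
    {"sensor": "s1", "temperature_c": 22.1,   "timestamp": None},        # missing timestamp
]

def is_valid(reading) -> bool:
    """Keep only readings with a plausible temperature and a timestamp."""
    t = reading.get("temperature_c")
    return (
        reading.get("timestamp") is not None
        and t is not None
        and -50.0 <= t <= 60.0
    )

clean_readings = [r for r in raw_readings if is_valid(r)]
print(f"kept {len(clean_readings)} of {len(raw_readings)} readings")
```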
- A complete theoretical system of big data availability
- Research on data availability must answer the following questions: How can data availability be expressed formally? How can the availability of data be judged in theory? How can data availability be assessed quantitatively? What is the theoretical basis for automatic detection and repair of data errors? What is the theoretical basis for the fused management of data and data quality (referred to as quantitative and qualitative fusion management)? How does data evolve? Without a complete theory of data availability, these questions cannot be answered. We therefore need to establish a unified framework and propose a complete theoretical system of data availability, solving the following challenging problems: a theoretical model of big data availability; formal systems and inference mechanisms for big data availability; theories and algorithms for assessing big data availability; theories and algorithms for the quantitative and qualitative management of big data; the evolution mechanism of big data; the complexity theory of big data availability; and new methods of algorithm design and analysis.
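One naive way to make quantitative assessment of data availability tangible is to combine per-dimension quality scores into a single availability score, as in the sketch below; the weighting scheme is purely an assumption for illustration and is not a model proposed in the cited work:

```python
# Illustrative only: combine per-dimension quality scores into one availability score.
# The dimensions follow the five aspects named above; the weights are assumptions.

dimension_scores = {
    "consistency": 0.98,
    "accuracy": 0.95,
    "completeness": 0.90,
    "timeliness": 0.85,
    "entity_identity": 0.99,
}

weights = {
    "consistency": 0.25,
    "accuracy": 0.25,
    "completeness": 0.20,
    "timeliness": 0.15,
    "entity_identity": 0.15,
}

availability_score = sum(dimension_scores[d] * weights[d] for d in dimension_scores)
print(f"overall availability score: {availability_score:.3f}")
```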
- Theory and technology of automatic data error detection and repair
- Existing methods and systems for data availability lack a solid theoretical foundation and cannot detect and repair data errors automatically. To realize automatic detection and repair of data errors, the following challenging problems need to be solved on the basis of the data availability theory system: a computability theory for the automatic detection and repair of big data errors; a computational complexity theory for the automatic detection and repair of big data errors; a credibility theory for the automatic detection and repair of big data errors; and efficient, practical algorithms for the automatic detection and repair of big data errors.
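As a very small illustration of the flavor of rule-based error detection and repair (real work in this area rests on formal dependency and repair theories), the sketch below detects violations of a simple functional-dependency-style rule and repairs them by majority vote; the rule and the repair policy are assumptions chosen for the example:

```python
# Toy constraint-based repair: within each zip code, the city value should be unique
# (a functional-dependency-style rule). Violations are repaired by majority vote.
# The rule and the repair policy are illustrative assumptions.
from collections import Counter, defaultdict

rows = [
    {"zip": "100080", "city": "Beijing"},
    {"zip": "100080", "city": "Beijing"},
    {"zip": "100080", "city": "Bejing"},    # likely an error
    {"zip": "200030", "city": "Shanghai"},
]

# Group city values by zip code and pick the most common value as the repair target.
by_zip = defaultdict(list)
for r in rows:
    by_zip[r["zip"]].append(r["city"])

repairs = {z: Counter(cities).most_common(1)[0][0] for z, cities in by_zip.items()}

for r in rows:
    if r["city"] != repairs[r["zip"]]:
        print(f"repairing {r['city']!r} -> {repairs[r['zip']]!r} for zip {r['zip']}")
        r["city"] = repairs[r["zip"]]
```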
- Theories and techniques for approximate calculations on weakly available data
- When errors in data cannot be completely repaired, the data is called weakly available. Performing approximate computations directly on weakly available data, so as to meet a given accuracy requirement, is a meaningful option. Unfortunately, existing theories and algorithms cannot support approximate computation on weakly available data. We therefore need to solve the following challenging problems: a feasibility theory of approximate computation on weakly available big data; a computational complexity theory of approximate computation on weakly available big data; a theory for evaluating the quality of approximate results computed on weakly available big data; and approximate computation methods for weakly available big data.
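To illustrate the general idea of producing an approximate answer together with a quality bound when some values are unreliable, the sketch below estimates a mean while bracketing the effect of suspect values; the data, the plausible-value range, and the bounding rule are assumptions and do not reflect any specific method from the cited literature:

```python
# Illustration: approximate mean over data where some values are flagged as suspect.
# We report both a point estimate (ignoring suspect values) and an interval that
# brackets the result under worst-case assumptions about them. Purely illustrative.

values = [10.2, 9.8, 10.1, 250.0, 10.0, 9.9]     # 250.0 is a suspected error
suspect = [False, False, False, True, False, False]
value_range = (0.0, 20.0)  # assumed plausible range for a true value

trusted = [v for v, s in zip(values, suspect) if not s]
point_estimate = sum(trusted) / len(trusted)

# Worst-case bounds: replace every suspect value by the extremes of the plausible range.
n = len(values)
lo = (sum(trusted) + suspect.count(True) * value_range[0]) / n
hi = (sum(trusted) + suspect.count(True) * value_range[1]) / n

print(f"approximate mean: {point_estimate:.2f}, bounded within [{lo:.2f}, {hi:.2f}]")
```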
- Mechanism of knowledge discovery and evolution on weakly available data
- The usability problem of big data inevitably leads to a usability problem for the knowledge derived from that data. Even when data is fully available, discovering knowledge from correct big data and exploring how knowledge evolves as data evolves is already difficult; when data is only weakly available, knowledge discovery and its evolution mechanisms become harder still. We need to solve the following challenging problems: theories and methods for assessing the availability of knowledge derived from weakly available data; a correlation theory relating data availability to knowledge availability; computational complexity theory and algorithm design and analysis methods for knowledge discovery on weakly available big data; theories and methods for verifying and correcting knowledge derived from weakly available data; and the mechanism of knowledge evolution on weakly available data. In summary, the availability of big data raises serious and challenging research issues at every level of basic theory, algorithms, and engineering technology. Research on big data availability has only just begun and has touched on only a few of these aspects; a large number of scientific and technical problems remain to be solved, bringing new challenges and also providing new opportunities.
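As a final toy illustration of the link between data availability and the availability of derived knowledge, the sketch below discounts the confidence of a mined rule by the estimated availability of the data it was mined from; the discounting formula is an assumption made for this example, not an established result:

```python
# Illustrative: discount the confidence of a discovered rule by data availability.
# The discounting formula is an assumption, not an established result.

def effective_confidence(rule_confidence: float, data_availability: float) -> float:
    """Naively treat rule confidence as trustworthy only insofar as the data is available."""
    return rule_confidence * data_availability

mined_rules = [
    ("buys(diapers) -> buys(beer)", 0.72),
    ("age>60 -> owns(landline)",    0.90),
]

data_availability = 0.85  # e.g. an overall availability score as sketched earlier

for rule, conf in mined_rules:
    adjusted = effective_confidence(conf, data_availability)
    print(f"{rule}: raw confidence {conf:.2f}, availability-adjusted {adjusted:.2f}")
```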