What Is Offline Computation?
Offline computation (batch computation) is computation performed on the premise that all input data is known before the computation starts, that the input data does not change while the computation runs, and that the result is produced once the whole problem has been solved. Within big data, it belongs to the data-computation side, where the counterpart of offline computation is real-time computation. Offline computation typically has the following characteristics (a minimal sketch follows the list):
- 1) Huge data volumes and long storage times;
- 2) Complex batch operations performed on large amounts of data;
- 3) All data is in place before the computation starts and does not change;
- 4) The results of the batch computation can be queried conveniently.
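To make these characteristics concrete, here is a minimal sketch of an offline computation in plain Java: a word count over a file that is assumed to be complete and immutable before the run starts. The file name `input.txt` is a placeholder used only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class OfflineWordCount {
    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = new HashMap<>();

        // All input is read from a dataset that is fully in place and
        // will not change during the computation.
        for (String line : Files.readAllLines(Paths.get("input.txt"))) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1L, Long::sum);
                }
            }
        }

        // The complete result is only available once the batch finishes;
        // it can then be queried as often as needed.
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

The key point is not the word counting itself but the shape of the computation: a fixed input, one batch pass, and a result that only exists after the pass completes.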
Hadoop
The core design of the Hadoop framework consists of HDFS and MapReduce: HDFS provides distributed storage for massive data sets, while MapReduce provides distributed computation over them.
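The MapReduce side of that design is usually introduced with the canonical word-count job. The condensed sketch below follows the standard Hadoop MapReduce API; the input and output paths are taken from the command line rather than from anything in the text above.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every token in the input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts emitted for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The job is a textbook offline computation: the input sitting in HDFS is fixed when the job is submitted, and the word counts appear in the output directory only after all map and reduce tasks have finished.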
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but it also differs from them in important ways. HDFS is a highly fault-tolerant system designed for deployment on inexpensive machines. It provides high-throughput access to data, which makes it well suited to applications with large data sets. HDFS relaxes some POSIX constraints in order to allow streaming access to file system data. It was originally built as the infrastructure for the Apache Nutch search engine project and is now part of the Apache Hadoop Core project.
In short, HDFS is fault-tolerant and designed to be deployed on low-cost hardware. It provides high throughput for access to application data, which suits applications with large data sets, and it relaxes some POSIX requirements to allow streaming access to the data in the file system. [7]
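As an illustration of how an application talks to HDFS, the following sketch uses the Hadoop `FileSystem` Java API to write a small file and stream it back. The NameNode URI `hdfs://namenode:9000` and the path `/demo/hello.txt` are placeholders assumed for the example, not values from the text.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode URI; a real deployment would take this
        // from its cluster configuration.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

        Path file = new Path("/demo/hello.txt");

        // Write a file: HDFS favours large, streaming writes over
        // low-latency random updates.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read it back as a stream, the access pattern HDFS is optimised for.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }

        fs.close();
    }
}
```

The same operations can also be performed from the command line with `hdfs dfs -put` and `hdfs dfs -cat`.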