What Is File Classification?
A file is a named collection of related elements, with the name assigned by its creator. Files can be divided into structured and unstructured files: a structured file consists of a number of related records, whereas an unstructured file is treated as a character stream. File classification means that users or the system group files according to their type or purpose. There are many criteria for classifying files, and the choice generally depends on the requirements of the application. The main purpose of classification is to make files easier to manage.
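The difference between a structured file and a character stream can be illustrated with a minimal Python sketch. The record layout used here (a 4-byte student ID followed by a 16-byte name) and the function names are assumptions made for illustration only; they do not come from any particular file system.

```python
import struct

# Hypothetical record layout: 4-byte unsigned ID + 16-byte name, no padding.
RECORD = struct.Struct("<I16s")

def pack_records(students):
    """Serialize (id, name) pairs into fixed-size records."""
    return b"".join(
        RECORD.pack(sid, name.encode("ascii")) for sid, name in students
    )

def read_structured(data):
    """Interpret the bytes as a sequence of related records."""
    return [
        (sid, raw.rstrip(b"\x00").decode("ascii"))
        for sid, raw in RECORD.iter_unpack(data)
    ]

def read_unstructured(data):
    """Interpret the same bytes as an undifferentiated stream of byte values."""
    return list(data)

data = pack_records([(1, "Ann"), (2, "Bob")])
records = read_structured(data)   # two (id, name) records
stream = read_unstructured(data)  # a flat sequence of byte values
```

The same bytes are read twice: once as records with fields, once as a flat stream. Which view applies depends on how the file is interpreted, not on the bytes themselves.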
- A file is the largest unit of data in the file system and describes a set of objects. For example, the student records for a class can be treated as a file. A file must have a file name, which usually consists of a string of ASCII characters, Chinese characters, or both. The permitted name length varies from system to system: some systems limit names to 8 characters, while others allow 14. Users access a file by its name. In addition, a file has its own attributes, which can include:
- (1) File type. File types can be defined from different perspectives, such as source files, object files, and executable files.
- (2) File length. The file length is the current length of the file, measured in bytes, words, or blocks; it may also refer to the maximum length the file is allowed to reach.
- (3)
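As a rough illustration of the name, type, and length attributes, the following Python sketch reads them for a temporary file. The helper `describe`, the sample file name, and the use of the file-name suffix as a type hint are assumptions for this example; real systems may record the file type differently.

```python
import os
import tempfile

def describe(path):
    """Return the file name, a suffix-based type hint, and the current length in bytes."""
    name = os.path.basename(path)
    suffix = os.path.splitext(name)[1]
    return {
        "name": name,
        "type_hint": suffix,                     # assumption: type inferred from suffix
        "length_bytes": os.stat(path).st_size,   # current length, in bytes
    }

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "class_records.txt")
    # Binary mode avoids newline translation, so the byte count is predictable.
    with open(path, "wb") as f:
        f.write(b"Ann,90\nBob,85\n")
    info = describe(path)
```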
Classification of files by purpose
- According to their nature and purpose, files can be divided into three categories:
- (1) System files. These are the files that make up the system software. Users are generally allowed only to call most system files, not to read or modify them; some system files are not directly accessible to users at all.
- (2) User files. These are files consisting of a user's source code, object files, executable files, or data. The user entrusts these files to the system for safekeeping.
- (3) Library files. These are files consisting of standard subroutines and commonly used routines. Users may call them but may not modify them.
Classification of files by form
- By form, files can also be divided into three categories:
- (1) Source files. These are files made up of source programs and data, typically entered from a terminal or another input device. They usually consist of ASCII or Chinese characters.
- (2) Object files. These are files of object code that a compiler has produced from the source language but that a linker has not yet linked. They are binary files, and their names usually carry the suffix ".obj".
- (3) Executable files. These are files produced when the linker links the object code generated by compilation.
Classification of files by access attributes
- Files can be divided into three categories based on the access control attributes specified by the system administrator or user:
- (1) Execute-only files. These files may only be executed by authorized users; they may be neither read nor written.
- (2) Read-only files. These files may be read, but not written, by the file owner and authorized users.
- (3) Read-write files. These files may be both read and written by the owner and authorized users. [1]
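The three access categories above can be sketched in Python. This example assumes POSIX-style owner permission bits as exposed by the standard `stat` module; the text does not name a specific system, and the function `access_class` and the fallback category "other" are illustrative only.

```python
import stat

def access_class(mode):
    """Map owner permission bits to the three categories described above."""
    can_read = bool(mode & stat.S_IRUSR)
    can_write = bool(mode & stat.S_IWUSR)
    can_exec = bool(mode & stat.S_IXUSR)
    if can_exec and not can_read and not can_write:
        return "execute-only"
    if can_read and not can_write:
        return "read-only"
    if can_read and can_write:
        return "read-write"
    return "other"  # assumption: anything outside the three categories

# Octal modes: execute only, read only, read + write, no access.
classes = [access_class(m) for m in (0o100, 0o400, 0o600, 0o000)]
```

Only the owner bits are inspected here; a fuller check would also consider group and other bits, or whatever access control lists the system provides.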
Classification of files by organization and processing method
- Files can be divided into three categories based on how they are organized and how the system handles them:
- (1) Ordinary file: A file made up of ASCII characters or binary code. Source program files, data files, and object code files created by ordinary users, as well as the operating system's own code files, library files, and utility files, are all ordinary files. They are usually stored on external storage devices.
- (2) Directory file: A system file made up of file directories, used to manage the file system and implement its functions. Information about other files can be retrieved through a directory file. Because a directory file also consists of a character sequence, the same file operations that apply to ordinary files can be applied to it.
- (3) Special file: This refers to the various I/O devices in the system. To allow unified management, the system treats all input/output devices as files and presents them to users in file form. Operations such as directory lookup and permission verification work much as they do for ordinary files, but operations on these files are closely tied to the device drivers, and the system translates them into operations on the specific device. Depending on the unit of data exchanged with the device, special files are divided into block device files and character device files. The former are used for I/O on block devices such as disks, optical discs, or tapes, while the latter are used for I/O on character devices such as terminals and printers.
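On POSIX-like systems, this distinction is visible in the mode reported by `os.stat`. The sketch below is one possible way to map that mode onto the categories above using Python's standard `stat` module; the function name `file_kind` and the returned labels are assumptions chosen to match the wording of this article.

```python
import os
import stat
import tempfile

def file_kind(mode):
    """Classify a stat mode value as ordinary, directory, or special."""
    if stat.S_ISREG(mode):
        return "ordinary"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISBLK(mode):
        return "special (block device)"
    if stat.S_ISCHR(mode):
        return "special (character device)"
    return "other"  # e.g. pipes, sockets, symbolic links

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"\x00")
    # Classify the regular file and the directory that contains it.
    kinds = (file_kind(os.stat(path).st_mode), file_kind(os.stat(tmp).st_mode))
```

Device files are not created here because doing so normally requires elevated privileges. On a Unix-like system, calling `file_kind(os.stat("/dev/null").st_mode)` would be expected to report a character device.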