What Is a Control Character?
A control character is a character that appears in a text or data stream to signal a control function rather than to represent a printable symbol.
- In computing, a control character or non-printing character is a code point (a number) in a character set that does not itself represent a written symbol. All entries in the ASCII table below code 32 are of this type, including:
- BEL (causes an audible signal at the receiving terminal);
- SYN (a synchronization signal);
- ENQ (requests a response from the receiver to verify that it is still present).
- The Unicode standard adds many new non-printing characters, such as the zero-width non-joiner [1].
- An ASCII-based keyboard has a key labeled "Control" or "Ctrl" (sometimes "Cntl") that is used much like the Shift key: it is held down while another letter or symbol key is pressed. Used this way, the Control key forces the two leftmost bits of the 7-bit ASCII code generated by the other key to 0, producing one of the 32 ASCII control codes. For example, pressing Ctrl and the letter G (71 in decimal, 1000111 in binary) produces code 7 (the bell character BEL, 0000111 in binary).
- Some individual keys on the keyboard can generate control codes. For example, the key labeled "Backspace" usually produces code 8, "Tab" is code 9, and "Enter" or "Return" is code 13 ("Enter" on some keyboards may be code 10).
- Some keys on current keyboards, such as the cursor control (arrow) keys and word-processing function keys, have no corresponding ASCII character or control character. Keyboards communicate these keys to the attached computer by one of three methods: assigning new meanings to otherwise unused control characters, using an encoding other than ASCII, or using multi-character control sequences. Keyboards attached to stand-alone personal computers usually use one or both of the first two methods, while "dumb" terminals usually use control sequences.
- The control characters were designed to fall into a few groups: printing and display control, data structuring, transmission control, and miscellaneous uses.
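The Ctrl-key behaviour described above amounts to clearing the two leftmost bits of the 7-bit code, which is the same as keeping only the low five bits. A minimal Python sketch follows; `ctrl` is a hypothetical helper name chosen for illustration, not part of any standard library.

```python
def ctrl(key: str) -> int:
    """Return the ASCII control code produced by Ctrl plus the given key.

    Hypothetical helper: clears the two leftmost bits of the 7-bit code,
    i.e. keeps only the low five bits.
    """
    code = ord(key)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    return code & 0x1F


print(ctrl("G"))   # 7  (BEL)
print(ctrl("H"))   # 8  (backspace)
print(ctrl("I"))   # 9  (horizontal tab)
print(ctrl("M"))   # 13 (carriage return)
```

Because only the low five bits survive, upper-case and lower-case letters give the same control code in this sketch.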
Printing and display control
- Printing control characters were first used to control the physical mechanism of printers, the earliest output devices. Carriage return (CR) places the next character at the edge of the paper where printing begins (it may or may not also advance to the next line). Line feed (LF) places the next character on the following line (it may or may not also return to the beginning of the line). Vertical and horizontal tab (VT and HT) ask the printer to move the print head to the next tab stop in the reading direction. Form feed (FF) starts a new sheet of paper. Backspace (BS) moves the print position back by one character so that the printer can overprint to produce special characters. For example, early character printers underlined text by printing the text first, backspacing over it, and then printing underscores. Shift in (SI) and shift out (SO) select alternate character sets, fonts, underlining, or other printing modes, although escape sequences are more commonly used for these purposes.
- With the advent of display terminals, which "print" without paper and allow far more flexible placement and erasure of characters, the printing control codes were adapted to the new devices. Form feed, for example, came to mean clearing the screen, since there is no new sheet of paper to advance to. More complex escape sequences were designed to take advantage of the features of new terminals and printers. Single-character control codes are not enough to cover all the functions of such devices, and the distinction between control characters and escape sequences has become blurred [2].
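The overprinting technique described above can be illustrated with a toy model of a single printed line. This is only a sketch under simple assumptions: `print_line` is a hypothetical name, only BS and CR are handled, and each column simply accumulates every glyph struck there.

```python
def print_line(text: str) -> list[str]:
    """Simulate one printed line; each entry holds every glyph struck in that column."""
    columns: list[str] = []
    pos = 0
    for ch in text:
        if ch == "\b":        # BS: step back one column (never past the left edge)
            pos = max(0, pos - 1)
        elif ch == "\r":      # CR: return to the left edge of the same line
            pos = 0
        else:                 # any other character strikes the paper and advances
            while len(columns) <= pos:
                columns.append("")
            columns[pos] += ch
            pos += 1
    return columns


print(print_line("ab\b\b__"))   # ['a_', 'b_']
print(print_line("ab\r__"))     # ['a_', 'b_']
```

Both inputs underline "ab": the first backspaces over the text, the second returns the carriage without a line feed and prints over the same line.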
Data structuring
- The separators (file, group, record, and unit: FS, GS, RS, and US) are used to structure data, typically on magnetic tape, in order to simulate punched cards. End of medium (EM) warns that the tape (or other recording medium) is about to reach its end.
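As an illustration of structuring data with separators, the sketch below uses the record separator (code 30) and the unit separator (code 31) to pack a small table into one string. The function names `encode` and `decode` are assumptions made for this example, and the sketch assumes the fields themselves never contain either separator.

```python
RS = "\x1e"  # record separator, code 30
US = "\x1f"  # unit separator, code 31


def encode(records: list[list[str]]) -> str:
    """Join fields with US and records with RS."""
    return RS.join(US.join(fields) for fields in records)


def decode(data: str) -> list[list[str]]:
    """Split on RS to get records, then on US to get fields."""
    return [record.split(US) for record in data.split(RS)]


rows = [["Ada", "London"], ["Grace", "New York, NY"]]
data = encode(rows)
print(decode(data) == rows)   # True
```

Since the separators are not printable characters, ordinary punctuation such as the comma in the second record needs no quoting.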
Transmission control
- The transmission control characters are designed to structure a data stream into packets and to control when data is retransmitted after a transmission error.
- Start of heading (SOH) marks the non-data part of a packet, that is, the part containing addresses and other housekeeping data. Start of text (STX) marks the end of the header and the beginning of the body. End of text (ETX) marks the end of the message data. A common convention is to place a checksum or CRC of the message in the two characters immediately before ETX.
- The escape character (ESC) is placed in a message in front of a value that would normally be interpreted as a control character, to prevent that value from being interpreted as one. For example, a literal value 27 is sent correctly as ESC ESC.
- The substitute character (SUB) requests that the next printable character be translated into another value, usually by setting bit 5 to zero. This is convenient because some media, such as typewritten sheets of paper, can carry only printable characters.
- The cancel character (CAN) aborts the transmission of a packet. A negative acknowledgment (NAK) requests retransmission of a packet. An acknowledgment (ACK) indicates that the transmission was received correctly.
- When the transmission medium uses half-duplex (referring to transmission in one direction at a time), there is usually a master station that can transmit data at any time and one or more slave stations that can transmit after obtaining permission. The master uses the enquiry character (ENQ) to request the slave to send its next message. The slave indicates that it has completed the transmission by issuing an end-of-transmission character (EOT).
- The device control codes (DC1 to DC4) were originally generic, to be defined as needed by each device. However, a common need in data transmission is for the receiver to ask the sender to pause immediately when it cannot accept more data. Digital Equipment Corporation devised a convention that uses code 19 (DC3, Ctrl-S, or XOFF) to stop transmission and code 17 (DC1, Ctrl-Q, or XON) to resume it. This removes the need for a dedicated flow-control line in the data cable, which reduces cost and improves reliability by reducing the number of connections in the cable.
- Data link escape (DLE) signals to the other end of the data link that the character following it is a control character, such as STX or ETX, rather than data.
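The framing and acknowledgment conventions in this section can be sketched as follows. This is an illustrative toy, not a real protocol: the function names are hypothetical, the checksum scheme (byte sum modulo 256 written as two hexadecimal characters) is an assumption chosen only because it fits in the two characters before end of text, and the header and body are assumed to contain no control characters.

```python
SOH, STX, ETX = b"\x01", b"\x02", b"\x03"   # start of heading, start of text, end of text
ACK, NAK = b"\x06", b"\x15"                 # acknowledgment, negative acknowledgment


def checksum(data: bytes) -> bytes:
    """Two printable characters: the byte sum modulo 256 in hex (assumed scheme)."""
    return f"{sum(data) % 256:02X}".encode("ascii")


def build_frame(header: bytes, body: bytes) -> bytes:
    """SOH, header, STX, body, two checksum characters, ETX."""
    return SOH + header + STX + body + checksum(header + body) + ETX


def receive(frame: bytes) -> bytes:
    """Return ACK if the frame is well formed and its checksum matches, else NAK."""
    if not (frame.startswith(SOH) and frame.endswith(ETX)):
        return NAK
    header, stx, rest = frame[1:-1].partition(STX)
    if not stx or len(rest) < 2:
        return NAK
    body, received = rest[:-2], rest[-2:]
    return ACK if checksum(header + body) == received else NAK


frame = build_frame(b"A1", b"hello")
print(receive(frame) == ACK)                              # True
print(receive(frame.replace(b"hello", b"hellp")) == NAK)  # True
```

A sender following this sketch would retransmit the frame whenever the reply is NAK.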
Miscellaneous control characters
- Many of the ASCII control characters were designed for devices of the time that are rarely seen today. For example, code 22, synchronous idle (SYN), was originally sent by synchronous modems (which must send data continuously) when there was no actual data to transmit. (Modern systems typically use a start bit to announce the beginning of a transmitted word.)
- Code 0, the null character (NUL), is a special case. On paper tape it corresponds to a position with no holes punched, so it is convenient to treat it as a filler that carries no character.
- Code 127 (DEL) is also a special case. All seven of its bits are 1, so any character on paper tape, a common storage medium at the time, could be erased by punching out every hole in its position, which turns it into DEL. Paper tape soon fell out of use, so this feature was rarely used afterwards.
- However, because its code lies in the range occupied by printable characters, many computers used it as an additional printable character (often a solid black box that could be overprinted to blot out text).
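The special roles of codes 0 and 127 on paper tape can be checked with a little bit arithmetic. The sketch below is illustrative only; `strip_fill` is a hypothetical helper that treats a byte string as a tape image.

```python
NUL, DEL = 0x00, 0x7F

# Punching every hole over any 7-bit code turns it into DEL (all bits set).
print(all((code | DEL) == DEL for code in range(128)))   # True


def strip_fill(tape: bytes) -> bytes:
    """Drop NUL (unpunched) and DEL (rubbed-out) positions from a tape image."""
    return bytes(b for b in tape if b not in (NUL, DEL))


print(strip_fill(b"\x00\x00HE\x7fLLO\x00"))   # b'HELLO'
```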