What Is the Real-Time Transport Protocol?
The Real-time Transport Protocol (RTP) is a network protocol for delivering real-time data. It was published by the IETF's Audio-Video Transport Working Group in 1996 as RFC 1889.
- RTP provides end-to-end delivery services with real-time characteristics for data such as audio and video.
- The RTP standard defines two sub-protocols, RTP and RTCP.
- The data transfer protocol, RTP, carries data in real time. The information it provides includes a timestamp (for synchronization), a sequence number (for detecting packet loss and reordering), and a payload type (identifying the encoding format of the data).
- The control protocol, RTCP, provides QoS feedback and synchronization of media streams. Compared with RTP, RTCP occupies very little bandwidth, usually only about 5% of the session bandwidth.
- By convention, RTP uses an even port number to send and receive data, and the corresponding RTCP uses the next higher odd port number.
- RTP provides mechanisms for jitter compensation and for detecting out-of-order data. Because of the transmission characteristics of IP networks, out-of-order arrival is common. RTP allows data to be sent to multiple destinations through IP multicast. RTP is regarded as the basic standard for transmitting audio and video over IP networks, and it is usually used together with a profile and a payload format.
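The out-of-order detection described above can be sketched with the 16-bit sequence number alone. This is a minimal illustration, not a complete receiver: the names `seq_delta` and `classify` are hypothetical, and the half-range rule for telling "ahead" from "behind" is an assumption commonly made for wrapping counters.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide and wrap around


def seq_delta(prev_seq: int, seq: int) -> int:
    """Signed distance from prev_seq to seq, accounting for 16-bit wraparound."""
    diff = (seq - prev_seq) % SEQ_MOD
    # Distances of half the number space or more are treated as "behind".
    return diff - SEQ_MOD if diff >= SEQ_MOD // 2 else diff


def classify(prev_seq: int, seq: int) -> str:
    """Classify an arriving packet relative to the highest sequence number seen."""
    delta = seq_delta(prev_seq, seq)
    if delta == 1:
        return "in-order"
    if delta > 1:
        return f"gap of {delta - 1}"
    if delta == 0:
        return "duplicate"
    return "late or reordered"
```

A receiver would typically call `classify` with the highest sequence number seen so far and use the result to decide whether to wait in a jitter buffer, conceal a loss, or drop a late packet.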
- For real-time multimedia streaming applications, delivering information in a timely manner is the primary goal, and some packet loss can be tolerated to achieve it. For example, a lost packet in an audio application may cost a fraction of a second of audio data, which a suitable error concealment algorithm can mask so that it goes unnoticed. Because TCP favors reliability over timeliness, it is rarely used in RTP applications. Instead, most RTP implementations are built on UDP.
- An RTP session is established for each multimedia stream. A session consists of an IP address with a pair of ports for RTP and RTCP. For example, audio and video streams use separate RTP sessions, so that a receiver can choose to receive only one of the media streams. The ports that form a session are negotiated by other protocols, such as RTSP and SIP. RTP and RTCP use UDP ports in the range 1024-65535.
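As a small illustration of the port convention, the helper below derives the RTCP port from a chosen RTP port. The function name `rtp_rtcp_ports` is hypothetical, and the checks simply encode the even/odd pairing and the 1024-65535 range mentioned above. In a real deployment the ports come from signaling (for example RTSP or SIP) rather than from a local rule.

```python
from typing import Tuple


def rtp_rtcp_ports(rtp_port: int) -> Tuple[int, int]:
    """Return (rtp_port, rtcp_port) following the even/odd convention."""
    # The upper bound is 65534 so that the RTCP port (rtp_port + 1) still fits.
    if not 1024 <= rtp_port <= 65534:
        raise ValueError("RTP port must be in the range 1024-65534")
    if rtp_port % 2 != 0:
        raise ValueError("RTP port should be even; RTCP takes the next odd port")
    return rtp_port, rtp_port + 1
```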
- The RTP message consists of two parts: the header and the payload. The RTP header format is shown in the figure, where:
- V: RTP version number, 2 bits. The current version is 2.
- P: padding flag, 1 bit. If P = 1, the packet ends with one or more additional padding octets that are not part of the payload.
- X: extension flag, 1 bit. If X = 1, the fixed RTP header is followed by an extension header.
- CC: CSRC count, 4 bits, giving the number of CSRC identifiers.
- M: marker, 1 bit. Its meaning depends on the payload: for video, it marks the end of a frame; for audio, it marks the beginning of a talkspurt.
- Synchronization source (SSRC) identifier: 32 bits, identifying the synchronization source. The identifier is chosen randomly, and two synchronization sources participating in the same video conference session must not share the same SSRC.
- Contributing source (CSRC) identifiers: 32 bits each, with 0 to 15 entries. Each CSRC identifies one of the contributing sources for the payload carried in the packet.
- PT: payload type, 7 bits. It describes the type of payload carried in the RTP packet, such as GSM audio or JPEG images.
- Sequence number: 16 bits, identifying the position of the packet in the sender's stream. It increases by one for each packet sent. The receiver uses it to detect packet loss, restore packet order, and recover the data.
- Timestamp: 32 bits. It reflects the sampling instant of the first octet of the RTP packet. The receiver uses timestamps to calculate delay and delay jitter and to perform synchronization control.
- RTP header format
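The field layout can be made concrete with a short parser. This is a sketch under stated assumptions: the names `RtpHeader` and `parse_rtp_header` are hypothetical, the fields are read in their on-the-wire order (V, P, X, CC, M, PT, sequence number, timestamp, SSRC, then the CSRC list), and any extension header or padding is left unparsed.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: List[int]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header plus any CSRC identifiers."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed header")
    # Network byte order: two single bytes, one 16-bit and two 32-bit fields.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F  # CSRC count, the low 4 bits of the first byte
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = list(struct.unpack(f"!{cc}I", packet[12:end]))
    return RtpHeader(
        version=b0 >> 6,            # top 2 bits
        padding=bool(b0 & 0x20),    # P flag
        extension=bool(b0 & 0x10),  # X flag
        marker=bool(b1 & 0x80),     # M bit
        payload_type=b1 & 0x7F,     # low 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )
```

The payload starts at offset `12 + 4 * CC` when X = 0. A fuller parser would also skip the extension header when X = 1 and strip padding when P = 1.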
- The synchronization source is the source that generates the media stream. It is identified by the 32-bit SSRC identifier in the RTP header and does not depend on the network address. The receiver distinguishes sources by their SSRC identifiers in order to group RTP packets. A contributing source arises when a mixer receives RTP packets from one or more synchronization sources and combines them, through mixing, into a new RTP packet. The mixer inserts its own identifier as the SSRC of the combined packet, and the SSRCs of the original sources are passed to the receiver as CSRCs, so that the receiver knows which sources make up the combined packet.
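The mixer behaviour described above can be sketched as a header builder: the mixer's own identifier goes in the SSRC field, and the original sources are listed as CSRCs. The function name `build_mixed_header` and its parameters are hypothetical, and the sketch covers only the header (no audio mixing, padding, or extension).

```python
import struct
from typing import Iterable


def build_mixed_header(mixer_ssrc: int, source_ssrcs: Iterable[int],
                       seq: int, timestamp: int, payload_type: int) -> bytes:
    """Build an RTP header for a packet produced by a mixer."""
    csrcs = list(source_ssrcs)
    if len(csrcs) > 15:
        # CC is a 4-bit field, so at most 15 contributing sources can be listed.
        raise ValueError("at most 15 CSRC identifiers fit in the header")
    b0 = (2 << 6) | len(csrcs)  # V=2, P=0, X=0, CC=number of CSRCs
    b1 = payload_type & 0x7F    # M=0, 7-bit payload type
    fixed = struct.pack("!BBHII", b0, b1, seq & 0xFFFF,
                        timestamp & 0xFFFFFFFF, mixer_ssrc)
    # The original sources' SSRCs follow the fixed header as the CSRC list.
    return fixed + struct.pack(f"!{len(csrcs)}I", *csrcs)
```

For two sources, the resulting header is 12 + 2 × 4 = 20 bytes long, and a receiver reading the CSRC list can tell which participants are present in the mixed stream.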
- The RTP Control Protocol (RTCP) monitors the quality of service and conveys information about the participants in an ongoing session. The latter aspect of RTCP may be sufficient for "loosely controlled" sessions, that is, sessions without explicit membership control and set-up, but it is not necessarily intended to support all of an application's control communication requirements. [1]
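One quality-of-service figure a receiver can report through RTCP is interarrival jitter. The sketch below follows the running estimate given in RFC 3550 (the successor to RFC 1889), J = J + (|D| - J) / 16, where D is the change in transit time between consecutive packets. The class name `JitterEstimator` is hypothetical, arrival times are assumed to be already converted to RTP timestamp units, and 32-bit timestamp wraparound is ignored for brevity.

```python
from typing import Optional


class JitterEstimator:
    """Running interarrival jitter estimate, in RTP timestamp units."""

    def __init__(self) -> None:
        self.jitter = 0.0
        self._prev_transit: Optional[int] = None

    def update(self, arrival: int, rtp_timestamp: int) -> float:
        """Feed one packet: local arrival time and the packet's RTP timestamp."""
        transit = arrival - rtp_timestamp
        if self._prev_transit is not None:
            # D: how much the transit time changed since the previous packet.
            d = abs(transit - self._prev_transit)
            # Move the estimate 1/16 of the way toward the new sample.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter
```

The 1/16 gain smooths the estimate, so a single delayed packet raises the reported jitter only slightly, while sustained variation raises it steadily.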
- Ver. (2 bits) is the version number of the protocol. P (1 bit) indicates whether the packet carries extra padding at its end. X (1 bit) indicates whether an extension header is used in the packet. CC (4 bits) contains the number of CSRC identifiers that follow the fixed header. M (1 bit) is defined at the application level by the profile; if it is set, the data has a special meaning for the application. PT (7 bits) indicates the format of the payload and determines how the application interprets it. SSRC identifies the synchronization source. [1]