What is a Digital Signal Processor?
Digital signal processing is the theory and practice of representing and processing signals in digital form. Digital signal processing and analog signal processing are both subfields of signal processing. Digital signal processing is typically used to measure or filter continuous analog signals from the real world. The signal must therefore first be converted from the analog domain to the digital domain, usually by an analog-to-digital converter (ADC). The output of digital signal processing is often converted back to the analog domain by a digital-to-analog converter (DAC).
- Digital signal processors can be divided into two categories according to their programmability: programmable and non-programmable. A non-programmable signal processor takes the flow of the signal processing algorithm as its basic logical structure and has no control program, so it can generally perform only one main processing function. It is therefore also called a dedicated signal processor. Examples include fast Fourier transform processors and digital filters. Although this type of processor has limited functionality, it offers higher processing speed. A programmable signal processor can be reprogrammed to change the functions it performs. It is more versatile, so it is also called a general-purpose signal processor. As the price-performance ratio of general-purpose signal processors continues to improve, they are becoming increasingly popular in signal processing applications.
- Roughly three types of programmable signal processor have been developed:
- Bit-slice processors. These are built mainly from microprocessor slice chips with a basic width of 2, 4, or 8 bits, supported by program control chips, interrupt and DMA control chips, and clock chips. Using microprogram control and a grouped instruction format, a system of any required word length can be assembled. Their advantages are fast processing and high efficiency; their disadvantages are high power consumption and a large chip count.
- Single-chip (monolithic) signal processors. These integrate the arithmetic unit, multiplier, memory, program read-only memory (ROM), input/output interfaces, and even analog-to-digital and digital-to-analog converters on a single chip. They offer fast computation, high precision, low power consumption, and strong versatility. Compared with general-purpose microprocessors, their instruction sets and addressing modes are better suited to the arithmetic and data structures common in signal processing.
- VLSI array processors. These use a large number of processing units to perform the same operation on different data under the control of a single instruction sequence, thereby achieving high-speed computation. They are well suited to signal processing tasks with large data volumes, heavy computation, and highly repetitive operations, and are often combined with general-purpose computers to form powerful signal processing systems. There are roughly two types of array processor: systolic array processors and wavefront array processors. The former use a single synchronous clock and control mechanism for the entire array, giving a simple structure, good modularity, and easy expansion. The latter use independent timing and a data-driven mechanism in each unit, which simplifies programming and fault-tolerant design and increases processing speed [2].
- Digital signal processors have developed from the dedicated signal processors of the 1970s to VLSI array processors, and their applications have expanded from the processing of low-frequency signals such as voice and sonar to large-scale signal processing such as radar and image processing. The use of floating-point arithmetic and parallel processing has greatly increased processing capability. Digital signal processors will continue to develop toward higher processing speed and greater arithmetic accuracy. Architecturally, data-flow structures and artificial neural network structures may become the basic structural models of the next generation of digital signal processors.
- Algorithm format
- DSPs use several arithmetic formats. Most DSP processors use fixed-point arithmetic, in which numbers are represented as integers or as fractions between -1.0 and +1.0. Some processors use floating-point arithmetic, in which data is represented as a mantissa and an exponent: mantissa × 2^exponent.
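The two number formats described above can be sketched in a few lines of Python. This is only an illustration: it assumes the common 16-bit "Q15" fractional layout (one sign bit and 15 fractional bits), which the text does not name, and the function names are made up for this example.

```python
import math

Q15_SCALE = 1 << 15  # 32768: assumes one sign bit and 15 fractional bits


def float_to_q15(x):
    """Quantize a value in [-1.0, +1.0) to a 16-bit Q15 integer, saturating at the limits."""
    n = int(round(x * Q15_SCALE))
    return max(-Q15_SCALE, min(Q15_SCALE - 1, n))


def q15_to_float(n):
    """Interpret a 16-bit Q15 integer as a fraction between -1.0 and +1.0."""
    return n / Q15_SCALE


def decompose(x):
    """Split x into (mantissa, exponent) such that x == mantissa * 2**exponent."""
    return math.frexp(x)


print(float_to_q15(0.5))   # 16384
print(float_to_q15(1.0))   # 32767, because +1.0 itself is not representable
print(decompose(6.0))      # (0.75, 3), since 0.75 * 2**3 == 6.0
```

Note that the fixed-point value saturates at the edge of its range, whereas the floating-point form simply moves to a larger exponent.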
- Floating-point arithmetic is more complex than fixed-point arithmetic but more general. Floating-point data provides a large dynamic range (which can be expressed as the ratio of the largest to the smallest representable number). In many floating-point DSP applications, design engineers therefore do not need to worry about issues such as dynamic range and accuracy. Floating-point DSPs are easier to program than fixed-point DSPs, but they cost more and consume more power.
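To make the ratio definition of dynamic range concrete, the sketch below expresses it in decibels for two formats. The choice of a 16-bit signed integer and of IEEE 754 single precision (normal numbers only) is an assumption made for illustration; some floating-point DSPs use their own formats.

```python
import math


def dynamic_range_db(largest, smallest):
    """Dynamic range as the ratio of largest to smallest magnitude, in decibels."""
    return 20.0 * math.log10(largest / smallest)


# 16-bit signed fixed point: magnitudes from 1 LSB up to 2**15 - 1.
FIXED16_MAX = 2**15 - 1
FIXED16_MIN = 1

# Assumed IEEE 754 single precision, normal numbers only.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127
FLOAT32_MIN = 2.0**-126

print(round(dynamic_range_db(FIXED16_MAX, FIXED16_MIN), 1))  # about 90.3 dB
print(round(dynamic_range_db(FLOAT32_MAX, FLOAT32_MIN), 1))  # about 1529.2 dB
```

The gap between the two figures is why fixed-point designs need the scaling analysis that floating-point designs can often skip.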
- Because of cost and power consumption, fixed-point DSPs are generally used in high-volume products. Programmers and algorithm designers use analysis or simulation to determine the required dynamic range and accuracy. If ease of development is required, and the application needs a wide dynamic range and high accuracy, consider using a floating-point DSP.
- Floating-point calculations can also be implemented in software on a fixed-point DSP, but such routines consume a great deal of processor time and are rarely used. A more efficient alternative is "block floating point", in which a group of values that share one exponent but have different mantissas is processed as a block. Block floating-point processing is usually implemented in software.
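A minimal sketch of the block floating-point idea follows, assuming 16-bit signed mantissas and a non-empty block. The function names and the scaling rule (normalize to the largest value in the block) are illustrative choices, not a description of any particular DSP library.

```python
import math


def block_float_encode(block, mantissa_bits=16):
    """Return (mantissas, exponent) so that each value is about mantissa * 2**exponent.

    All values in the block share one exponent, chosen from the largest magnitude.
    """
    peak = max(abs(x) for x in block)
    if peak == 0.0:
        return [0] * len(block), 0
    # frexp gives peak == m * 2**e with 0.5 <= m < 1, so peak < 2**e.
    _, e = math.frexp(peak)
    exponent = e - (mantissa_bits - 1)
    limit = (1 << (mantissa_bits - 1)) - 1
    # Round each scaled value and saturate it to the signed mantissa range.
    mantissas = [max(-limit - 1, min(limit, round(x / 2.0**exponent))) for x in block]
    return mantissas, exponent


def block_float_decode(mantissas, exponent):
    """Rebuild approximate values from the shared exponent and integer mantissas."""
    return [m * 2.0**exponent for m in mantissas]


mantissas, exponent = block_float_encode([0.5, -0.25, 0.125])
print(mantissas, exponent)  # [16384, -8192, 4096] -15
```

Small values in a block with a large peak lose precision, which is the price paid for storing only one exponent per block.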
- Data width
- Floating-point DSPs typically use a 32-bit word width, while fixed-point DSPs generally use 16 bits. There are also 24-bit and 20-bit DSPs, such as Motorola's DSP563XX series and Zoran's ZR3800X series. Because word width strongly affects the package size of the DSP, the number of pins, and the amount of memory required, it directly affects the cost of the device. The wider the word, the larger the package, the more pins, the greater the memory requirement, and the higher the cost. To reduce cost, use the narrowest word width that still meets the design requirements.
- When choosing between fixed-point and floating-point, weigh word width against development complexity. For example, by combining instructions, a 16-bit DSP can also implement 32-bit double-precision arithmetic (although double-precision arithmetic is, of course, much slower than single-precision arithmetic). If single precision meets most of the computational requirements and only a small amount of code needs double precision, this approach is feasible. If most calculations require high accuracy, choose a processor with a larger word width.
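The cost of building wide arithmetic from narrow instructions can be seen in a small sketch. It models a 32-bit unsigned addition done with 16-bit words; the helper names are hypothetical, and a real DSP would use its carry flag rather than Python integers.

```python
MASK16 = 0xFFFF


def split32(x):
    """Split a 32-bit unsigned value into (high, low) 16-bit words."""
    return (x >> 16) & MASK16, x & MASK16


def join32(hi, lo):
    """Combine (high, low) 16-bit words into one 32-bit unsigned value."""
    return (hi << 16) | lo


def add32_with_16bit_words(a_hi, a_lo, b_hi, b_lo):
    """Add two 32-bit values held as 16-bit word pairs; the result wraps modulo 2**32."""
    lo_sum = a_lo + b_lo                     # first 16-bit add
    carry = lo_sum >> 16                     # carry out of the low word
    lo = lo_sum & MASK16
    hi = (a_hi + b_hi + carry) & MASK16      # second 16-bit add, plus carry
    return hi, lo


hi, lo = add32_with_16bit_words(*split32(0x0001FFFF), *split32(0x00000001))
print(hex(join32(hi, lo)))  # 0x20000
```

One wide addition becomes two narrow additions plus carry handling, which is consistent with the remark that double precision runs much slower than single precision.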
- Note that in most DSP devices the instruction word and data word have the same width, but there are exceptions. For example, the ADSP-21XX series from Analog Devices (ADI) has a 16-bit data word and a 24-bit instruction word.
- Processing speed
- Whether a processor meets the design requirements depends largely on whether it meets the speed requirements. There are many ways to measure processor speed. The most basic is the processor's instruction cycle time, which is the time the processor takes to execute its fastest instruction. Taking the reciprocal of the instruction cycle time, dividing by one million, and multiplying by the number of instructions executed per cycle gives the processor's peak speed in MIPS (million instructions per second).
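The peak-MIPS calculation described above can be written out directly. The function name and the example figures (a 25 ns instruction cycle, and a 5 ns cycle with eight instructions per cycle) are hypothetical values chosen for illustration.

```python
def peak_mips(instruction_cycle_seconds, instructions_per_cycle=1):
    """Peak speed in MIPS: (1 / cycle time) / 1e6 * instructions per cycle."""
    return (1.0 / instruction_cycle_seconds) / 1e6 * instructions_per_cycle


print(round(peak_mips(25e-9), 3))      # 40.0 for a 25 ns cycle, one instruction per cycle
print(round(peak_mips(5e-9, 8), 3))    # 1600.0 for a 5 ns cycle, eight instructions per cycle
```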
- However, instruction execution time alone does not indicate a processor's true performance. Different processors accomplish different amounts of work in a single instruction, so simply comparing instruction execution times does not give a fair comparison of performance. Some newer DSPs use a very long instruction word (VLIW) architecture, in which multiple instructions can be executed in a single cycle but each instruction does less work than an instruction on a traditional DSP. Comparing MIPS figures between VLIW and conventional DSP devices can therefore be misleading.
- Even comparing MIPS figures among traditional DSPs gives a one-sided picture. For example, some processors can shift by several bits in a single instruction, while others can shift by only one bit; some DSPs can process data unrelated to the current ALU instruction in parallel (loading operands while executing instructions), while others support parallel processing only of data related to the current ALU instruction; and some newer DSPs allow two MACs to be specified in a single instruction. Processor performance therefore cannot be judged accurately by comparing MIPS alone.
- One way around this problem is to compare processors using a basic operation (rather than an instruction) as the yardstick. The MAC operation is commonly used, but MAC time does not provide enough information to compare DSP performance. On most DSPs a MAC executes in a single instruction cycle, so the MAC time equals the instruction cycle time, and as mentioned above, some DSPs accomplish more work in a single MAC cycle than others. MAC time also does not reflect the performance of operations such as looping, which are used in all applications.
- The most common approach is to define a set of standard routines and compare their execution speed on different DSPs. Such a routine may be a "kernel" function of an algorithm, such as an FIR or IIR filter, or it may be all or part of an application (such as a speech coder). Figure 1 shows the performance of several DSP devices tested using BDTI's tools.
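As a rough illustration of what such a standard routine looks like, the sketch below implements a direct-form FIR filter and a small timing helper. It is not BDTI's benchmark code; the function names are invented here, and timing a Python function on a PC only demonstrates the method, not the speed of any DSP.

```python
import time


def fir_filter(coeffs, samples):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k], zero initial state."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]  # one multiply-accumulate (MAC)
        out.append(acc)
    return out


def time_kernel(kernel, *args, repeats=5):
    """Return the best wall-clock time in seconds over several runs of kernel(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(*args)
        best = min(best, time.perf_counter() - start)
    return best


print(fir_filter([0.5, 0.5], [1.0, 3.0, 5.0]))  # [0.5, 2.0, 4.0]
elapsed = time_kernel(fir_filter, [0.5, 0.5], [1.0] * 1000)
```

Running the same kernel with the same inputs on each candidate processor gives a comparison that reflects looping, memory access, and MAC throughput together.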
- When comparing DSP processor speeds, treat advertised MOPS (million operations per second) and MFLOPS (million floating-point operations per second) figures with caution, because manufacturers define "operation" differently and the figures therefore mean different things. For example, some processors can perform a floating-point multiplication and a floating-point addition at the same time, so their advertised MFLOPS figure is twice their MIPS figure.
- Similarly, when comparing processor clock rates, note that a DSP's input clock may run at the instruction rate or at two to four times the instruction rate, depending on the processor. In addition, many DSPs have clock multipliers or phase-locked loops that generate the high-frequency on-chip clock from a low-frequency external clock.
- Practical application
- Speech processing: speech coding, speech synthesis, speech recognition, speech enhancement, voice mail, speech storage, etc.
- Image / Graphics: 2D and 3D graphics processing, image compression and transmission, image recognition, animation, robot vision, multimedia, electronic map, image enhancement, etc.
- Military: secure communications, radar processing, sonar processing, navigation, global positioning, frequency-hopping radio, search and counter-search.
- Instrumentation: spectrum analysis, function generation, data acquisition, seismic processing, etc.
- Automatic control: control, deep space operations, autonomous driving, robot control, disk control, etc.
- Medical: Hearing aids, ultrasound equipment, diagnostic tools, patient monitoring, ECG, etc.
- Home appliances: digital audio, digital television, video phone, music synthesis, tone control, toys and games, etc.
- Examples of biomedical signal processing:
- CT: computed tomography scanner. (Hounsfield of EMI in the UK, who invented the head CT scanner, won the Nobel Prize.)
- CAT: computerized X-ray spatial reconstruction device, used for whole-body scans, three-dimensional imaging of cardiac activity, detection of brain tumors and foreign bodies, and reconstruction of images of the human torso.
- ECG analysis.
- Storage management
- A DSP's performance is affected by how well it manages its memory subsystem. As mentioned earlier, the MAC and other signal processing functions are the basic capabilities of a DSP device. Fast MAC execution requires reading one instruction word and two data words from memory in every instruction cycle. There are several ways to achieve this, including multi-ported memory (allowing multiple memory accesses in each instruction cycle), separate instruction and data memories (the "Harvard" architecture and its derivatives), and an instruction cache (allowing instructions to be read from the cache instead of from memory, freeing the memory for data reads). Figures 2 and 3 show the differences between the Harvard memory architecture and the "von Neumann" architecture used by many microcontrollers.
- Also consider the amount of memory supported. The main target market of many fixed-point DSPs is embedded systems, where memory is generally small. These DSP devices therefore have small to medium on-chip memories (about 4K to 64K words) and a narrow external data bus. In addition, the address bus of most fixed-point DSPs is 16 bits wide or narrower, so the external memory space is limited.
- Some floating-point DSPs have very little or no on-chip memory but a wide external data bus. For example, TI's TMS320C30 has only 6K words of on-chip memory, with a 24-bit external bus and a 13-bit external address bus. ADI's ADSP-21060, by contrast, has 4 Mb of on-chip memory, which can be divided between program memory and data memory in various ways.
- Select a DSP according to the memory size the specific application needs and its external bus requirements.
- Type characteristics
- DSP processors differ greatly from general-purpose processors (GPPs) such as Intel's Pentium or the PowerPC. These differences arise because the architecture and instruction set of a DSP are designed and developed specifically for signal processing. DSPs have the following characteristics.
- · Hardware multiply and accumulate operations (MACs)
- To perform multiply-accumulate operations such as those in signal filtering efficiently, the processor must multiply efficiently. GPPs were not originally designed for multiplication-heavy workloads. The first major technical improvement that distinguished DSPs from early GPPs was the addition of dedicated hardware and explicit MAC instructions capable of single-cycle multiplication.
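The operation that a hardware MAC unit performs in a single cycle can be modeled in software as follows. This sketch assumes 16-bit Q15 fractional operands and relies on Python's unbounded integers to stand in for the wide accumulator a DSP would provide; the function names and the rounding and saturation choices are illustrative.

```python
def mac_q15(acc, a, b):
    """One multiply-accumulate step: acc += a * b, with a and b as Q15 integers.

    The product of two Q15 values is in Q30 format, so the accumulator holds Q30.
    """
    return acc + a * b


def dot_q15(xs, hs):
    """Dot product of two Q15 sequences, returned as a saturated Q15 value."""
    acc = 0
    for x, h in zip(xs, hs):
        acc = mac_q15(acc, x, h)
    # Convert the Q30 accumulator back to Q15 with rounding, then saturate to 16 bits.
    result = (acc + (1 << 14)) >> 15
    return max(-32768, min(32767, result))


# 0.5 * 0.5 + 0.5 * 0.25 = 0.375, which is 12288 in Q15.
print(dot_q15([16384, 16384], [16384, 8192]))  # 12288
```

Keeping full-precision products in the accumulator and rounding only once at the end is the usual reason a MAC unit has an accumulator wider than its operands.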
- · Harvard structure
- Traditional GPPs use the von Neumann memory architecture. In this architecture, a single memory space is connected to the processor core through two buses (an address bus and a data bus). This arrangement cannot satisfy the MAC's requirement of four memory accesses in one instruction cycle. DSPs generally use the Harvard architecture, which has two memory spaces: program memory and data memory. The processor core is connected to these memory spaces through two sets of buses, allowing two simultaneous memory accesses. This arrangement doubles the processor's memory bandwidth. Sometimes a second data memory space and bus are added to the Harvard architecture to achieve even greater memory bandwidth. Modern high-performance GPPs usually have two on-chip caches, one for data and one for instructions. In theory, this dual on-chip cache and bus arrangement is equivalent to the Harvard architecture. However, GPPs use control logic to determine which data and instruction words reside in the on-chip cache, a process that is usually invisible to the programmer. In DSPs, by contrast, programmers can explicitly control which data and instructions are stored in on-chip memory or caches.
- · Zero-overhead loop control
- A common feature of DSP algorithms is that most of the processing time is spent executing a small number of instructions inside a relatively small loop. Most DSP processors therefore have dedicated hardware for zero-overhead looping. A zero-overhead loop is one in which the processor executes a set of instructions repeatedly without spending any time testing the loop counter: the hardware performs the branch back to the top of the loop and decrements the loop counter. Some DSPs also implement a fast single-instruction loop using a single-instruction cache.
- · Specialized addressing modes
- DSPs often contain dedicated address generators that produce the special addressing patterns signal processing algorithms require, such as circular addressing and bit-reversed addressing. Circular addressing is used by the FIR filter algorithm (for its sliding window of recent samples), and bit-reversed addressing by the FFT algorithm.
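The two addressing patterns named above can be sketched in software. On a DSP the address generator does this work in hardware at no extra cycle cost; the class and function names below are invented for this example.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index, as used to reorder FFT data."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class CircularDelayLine:
    """Fixed-length sample buffer whose write pointer wraps around modulo its length."""

    def __init__(self, length):
        self.buf = [0.0] * length
        self.pos = 0  # next write position

    def push(self, sample):
        """Store a new sample, overwriting the oldest one."""
        self.buf[self.pos] = sample
        self.pos = (self.pos + 1) % len(self.buf)

    def tap(self, delay):
        """Return the sample pushed `delay` steps ago (0 is the most recent)."""
        return self.buf[(self.pos - 1 - delay) % len(self.buf)]


print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]

line = CircularDelayLine(3)
for s in [1.0, 2.0, 3.0, 4.0]:
    line.push(s)
print(line.tap(0), line.tap(1), line.tap(2))  # 4.0 3.0 2.0
```

The modulo operation in `push` and `tap` is the software equivalent of circular addressing: the FIR filter never has to shift its stored samples, only its pointer.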
- · Predictability of execution time
- Most DSP applications have hard real-time requirements: in every case, all processing must be completed within a specified time. This constraint requires the programmer to determine exactly how much time each sample will take, or at least how much it will take in the worst case. DSPs execute programs in a way that is transparent to the programmer, so the execution time of each task is easy to predict. For high-performance GPPs, however, predicting execution time is complicated and difficult because of their heavy use of high-speed data and program caches and their dynamic scheduling of program execution.
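The real-time constraint can be expressed as a per-sample cycle budget. The sketch below assumes one instruction per cycle and uses hypothetical figures (a 40 MHz instruction rate and an 8 kHz sample rate); the function names are illustrative.

```python
def cycles_per_sample(instruction_rate_hz, sample_rate_hz):
    """Number of whole instruction cycles available to process each sample."""
    return instruction_rate_hz // sample_rate_hz


def meets_deadline(worst_case_cycles, instruction_rate_hz, sample_rate_hz):
    """True if the worst-case cycle count fits within the per-sample budget."""
    return worst_case_cycles <= cycles_per_sample(instruction_rate_hz, sample_rate_hz)


print(cycles_per_sample(40_000_000, 8_000))      # 5000
print(meets_deadline(4_200, 40_000_000, 8_000))  # True
print(meets_deadline(5_200, 40_000_000, 8_000))  # False
```

The check is only as good as the worst-case cycle count fed into it, which is why predictable execution time matters.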
- · Rich set of peripherals
- DSPs include peripherals such as DMA controllers, serial ports, link ports, and timers [3].
- DSP processors have a wide range of applications, but no single processor can meet the needs of all, or even most, applications. When choosing a processor, the design engineer must weigh factors such as performance, cost, integration, ease of development, and power consumption.
- DSP devices can be divided into two categories according to design requirements. The first category is low-cost, high-volume applications.
- Digital signal processing (DSP) is an emerging discipline that draws on many other disciplines and is widely used in many fields. It has developed along three lines: theory, implementation, and application. Theoretical advances have driven the development of applications, and applications in turn have driven improvements in the theory. Implementation is the bridge between theory and application. Digital signal processing builds on numerous disciplines and covers a wide range. In mathematics, for example, calculus, probability and statistics, stochastic processes, and numerical analysis are all basic tools of digital signal processing. It is also closely related to network theory, signals and systems, cybernetics, communication theory, and fault diagnosis. Some emerging disciplines, such as artificial intelligence, pattern recognition, and neural networks, are inseparable from digital signal processing. It can be said that digital signal processing takes many classical theoretical systems as its foundation and has itself become a theoretical foundation for a series of emerging disciplines.
- Signal processing has long been used to transform or generate analog and digital signals, and one of its most frequent uses is signal filtering. Digital signal processing (DSP) technology is also widely used in fields ranging from digital communications, speech, audio, and biomedical signal processing to instrumentation and robotics. Digital signal processing has developed into a mature technology and has gradually replaced traditional analog signal processing systems in many applications. The world's three major DSP chip manufacturers are Texas Instruments (TI), Analog Devices (ADI), and Motorola. These three companies have almost monopolized the general-purpose DSP chip market. There are many books on digital signal processing. Among them, "Discrete-Time Signal Processing" by Oppenheim of MIT is the classic text, and a Chinese edition is published by Xi'an Jiaotong University [2].