What Is Computer Architecture?

Computer architecture refers to the conceptual components of a computer and the basic principles by which it works, grouped according to their attributes and functions. A conceptual component is not tied to any single piece of physical hardware; the storage component, for example, encompasses registers, main memory, and hard disks.

Computer architecture is the set of attributes of a computer as seen by the programmer, that is, the logical structure and functional characteristics of the computer, including the interrelationships between its hardware and software components. To computer system designers, computer architecture means the study of the basic design ideas of the computer and the logical structure that results; to program designers, it means the functional description of the system (such as the instruction set, programming methods, and so on) [1].
Computer architecture also refers to the system structure of software and hardware, which has two meanings. The first is the system structure seen from the programmer's perspective: the conceptual structure and functional characteristics of the computer system, which bear on how software is designed. The second is the system structure seen from the hardware designer's perspective, which is in fact the organization or implementation of the computer system.
The conceptual structure and functional characteristics of a computer are the attributes of the computer system as seen by the system programmer, together with the logical structure of the computer system as seen by the machine designer. In short, they are a detailed description of the interrelationships among the parts that make up a computer.
Computer architecture has gone through four different stages of development.
Computer architecture addresses, in general and functional terms, the problems a computer system must solve. It is distinct from computer organization and computer implementation: one architecture may have several organizations, and one organization may have several physical implementations.
Mainstream computer architecture is built on Turing machine theory and follows the von Neumann architecture. In essence, both the Turing machine model and the von Neumann architecture are one-dimensional and serial, whereas multi-core processors have a distributed, discrete, parallel structure; the mismatch between the two needs to be resolved.
First, there is the problem of matching the serial Turing machine model to physically distributed multi-core processors. The Turing machine model implies a serial programming model, and it is difficult for serial programs to exploit multiple physically distributed processor cores to achieve a speedup. At the same time, parallel programming models have not been widely adopted and remain limited to a few areas such as scientific computing. Researchers should seek suitable mechanisms to match the serial Turing machine model with physically distributed multi-core processors, or at least to narrow the gap between the two, and thereby address both the difficulty of writing parallel programs and the small speedups obtained by serial programs.
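As a minimal illustration of this gap (the code and names here are ours, not from the source), the following C sketch contrasts a serial summation, which maps directly onto the serial Turing machine model but uses only one core, with a hand-parallelized version using POSIX threads, in which the programmer must partition the work and merge partial results explicitly:

#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define NTHREADS 4

static double data[N];
static double partial[NTHREADS];
static long ids[NTHREADS];

/* Each thread sums one contiguous slice of the array into its own slot. */
static void *sum_slice(void *arg) {
    long t = *(long *)arg;
    long lo = t * (N / NTHREADS);
    long hi = (t == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += data[i];
    partial[t] = s;
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++)
        data[i] = 1.0;

    /* Serial version: one core does all the work. */
    double serial = 0.0;
    for (long i = 0; i < N; i++)
        serial += data[i];

    /* Parallel version: the programmer must partition the work and
       combine the partial results explicitly. */
    pthread_t th[NTHREADS];
    double parallel = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        ids[t] = t;
        pthread_create(&th[t], NULL, sum_slice, &ids[t]);
    }
    for (long t = 0; t < NTHREADS; t++)
        pthread_join(th[t], NULL);
    for (long t = 0; t < NTHREADS; t++)
        parallel += partial[t];

    printf("serial=%.0f parallel=%.0f\n", serial, parallel);
    return 0;
}

Even in this trivial case the parallel version needs extra data structures and an explicit reduction step; for irregular programs the transformation is far harder, which is why automatic approaches remain attractive.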
In terms of supporting multi-threaded parallel applications, future multi-core processors should be considered from two directions. The first is to introduce new programming models that express parallelism more naturally. Because a new programming model lets programmers state a program's parallelism explicitly, it can greatly improve performance; the Cell processor, for example, provides different programming models to support different applications. The difficulty lies in how to promote such programming models effectively and how to solve compatibility problems. The second direction is to provide better hardware support that reduces the complexity of parallel programming. Parallel programs typically rely on locking for synchronization and mutual exclusion on critical resources, and the programmer must place locks carefully: a conservative locking strategy limits program performance, while a precise locking strategy greatly increases programming complexity. Some studies have made effective explorations in this regard. For example, the Speculative Lock Elision mechanism allows lock operations to be elided when no conflict occurs, reducing programming complexity while preserving the performance of parallel execution; such a mechanism lets programmers focus on program correctness without worrying too much about performance. More radically, the Transactional Coherence and Consistency (TCC) mechanism handles data consistency in units of multiple memory operations (transactions), further simplifying parallel programming.
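To make the locking trade-off concrete, here is a hedged C sketch (the hash-table layout and function names are illustrative assumptions, not from the source). A single conservative lock over the whole table is easy to get right but serializes every update, while per-bucket locks recover concurrency at the cost of deciding exactly which lock guards which data; mechanisms such as Speculative Lock Elision and TCC aim to give programmers the simplicity of the first style with performance closer to the second:

#include <pthread.h>

#define NBUCKETS 64

struct bucket {
    pthread_mutex_t lock;   /* fine-grained: one lock per bucket */
    long count;
};

/* Conservative strategy: one lock guards the whole table. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct bucket table[NBUCKETS];

void table_init(void) {
    for (int i = 0; i < NBUCKETS; i++)
        pthread_mutex_init(&table[i].lock, NULL);
}

/* Simple to reason about, but every update from every thread
   serializes on the same lock, limiting parallel performance. */
void add_conservative(int key) {
    pthread_mutex_lock(&table_lock);
    table[key % NBUCKETS].count++;
    pthread_mutex_unlock(&table_lock);
}

/* Precise strategy: updates to different buckets proceed in parallel,
   but the programmer must track which lock protects which data. */
void add_precise(int key) {
    struct bucket *b = &table[key % NBUCKETS];
    pthread_mutex_lock(&b->lock);
    b->count++;
    pthread_mutex_unlock(&b->lock);
}

int main(void) {
    table_init();
    for (int k = 0; k < 1000; k++) {
        add_conservative(k);
        add_precise(k);
    }
    return 0;
}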
Mainstream commercial multi-core processors are aimed mainly at parallel applications; how to use multiple cores to accelerate serial programs remains a question worthy of attention. The key technique is to use software or hardware to automatically derive, from a serial program, code or threads that can execute in parallel on a multi-core processor. There are three main approaches: parallelizing compilers, speculative multithreading, and thread-based prefetching. In traditional parallelizing compilation, the compiler must spend considerable effort ensuring that there are no data dependences between the threads it extracts. Many dependences are ambiguous at compile time, especially in languages that allow pointers (such as C), so the compiler must adopt a conservative strategy to guarantee correct execution. This greatly limits the concurrency that can be mined from serial programs and confines parallelizing compilers to a narrow range of uses. To address these problems, speculative multithreading and thread-based prefetching have been proposed. However, since these ideas were first put forward, most research in this direction has remained in academia; only a few commercial processors have applied the technology, and only in specific application domains. We believe that combining dynamic optimization with speculative multithreading (including thread-based prefetching) is a possible direction for future development.
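The following hypothetical C fragment shows why pointer ambiguity forces a parallelizing compiler to be conservative: unless the compiler can prove that dst and src never overlap, the loop may carry a dependence from one iteration to the next and cannot safely be split across cores.

#include <stddef.h>

/* The compiler must assume dst and src may alias (for example,
   dst == src + 1), which would create a loop-carried dependence,
   so it cannot safely distribute the iterations across cores. */
void scale(double *dst, const double *src, size_t n, double k) {
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* With the C99 'restrict' qualifier the programmer promises that the
   arrays do not overlap; the ambiguous dependence disappears and the
   loop becomes a legal candidate for automatic parallelization. */
void scale_restrict(double *restrict dst, const double *restrict src,
                    size_t n, double k) {
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

Speculative multithreading sidesteps this static analysis by running the iterations in parallel anyway and rolling back whenever a real dependence is detected at run time.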
Second, there is the problem of matching the one-dimensional address space of the von Neumann architecture to the multi-dimensional memory hierarchy of multi-core processors. In essence, the von Neumann architecture uses a one-dimensional address space, while uneven memory access latency and multiple copies of the same data across processor cores give rise to data consistency problems. Research in this area falls into two broad categories. The first focuses on introducing new levels into the memory hierarchy. A new level may use a one-dimensional distributed implementation; a typical example is the addition of a distributed, uniformly addressed register network. Global uniform addressing avoids data consistency concerns, and compared with access to a traditional large-capacity cache, registers provide faster access. Both TRIPS and RAW implement inter-core register networks of this kind. A new level in the hierarchy can also be private; for example, each processor core may have its own private memory space. The advantage is that data storage is better partitioned, and local private data need not be kept consistent with other copies; the Cell processor, for instance, provides a private local store for each SPE core. The second category of research develops new cache coherence protocols, where an important trend is to relax the coupling between correctness and performance. For example, speculative cache protocols execute dependent instructions speculatively before data consistency is confirmed, reducing the impact of long-latency memory operations on the pipeline; Token Coherence and TCC adopt similar ideas.
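As a small, hedged illustration (our own example, not from the source) of how data placement interacts with cache coherence, the C program below compares two threads updating counters that sit in the same cache line with two threads updating padded, effectively private counters; in the first case the coherence protocol must shuttle the line between cores on every update, in the second it does not.

#include <pthread.h>
#include <stdio.h>

#define ITERS 10000000L
#define LINE 64  /* assumed cache-line size in bytes */

/* Shared layout: both counters live in the same cache line, so updates
   from two cores force the coherence protocol to bounce the line
   between them even though each thread touches only its own counter. */
static struct { long a; long b; } shared;

/* Private layout: padding keeps each counter on its own cache line. */
struct padded_counter {
    long value;
    char pad[LINE - sizeof(long)];
};
static struct padded_counter priv[2];

static void *bump_a(void *arg) { (void)arg; for (long i = 0; i < ITERS; i++) shared.a++; return NULL; }
static void *bump_b(void *arg) { (void)arg; for (long i = 0; i < ITERS; i++) shared.b++; return NULL; }
static void *bump_priv(void *arg) {
    struct padded_counter *c = arg;
    for (long i = 0; i < ITERS; i++) c->value++;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;

    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_create(&t1, NULL, bump_priv, &priv[0]);
    pthread_create(&t2, NULL, bump_priv, &priv[1]);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("shared: %ld %ld  private: %ld %ld\n",
           shared.a, shared.b, priv[0].value, priv[1].value);
    return 0;
}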
Third, there is the problem of matching diverse applications to a single architecture. Future applications will be diverse. On the one hand, processors are no longer evaluated on performance alone; other metrics such as reliability and security also matter. On the other hand, even in terms of performance, different applications exhibit different kinds and degrees of parallelism. This diversity calls for future processors with configurable, flexible architectures. TRIPS has made fruitful explorations in this regard: its processor cores and on-chip storage system are configurable, allowing it to exploit instruction-level, data-level, and thread-level parallelism simultaneously.
The emergence of new processing structures such as multi-core and Cell is not only a landmark event in the history of processor architecture, but also a subversion of traditional computing models and computer architectures.
In 2005, a series of far-reaching computer architectures was unveiled that may lay the foundation for computer architecture over the next decade, or at least serve as a symbolic guide for processors and for computer architecture as a whole. As computing density increases, the standards and methods for measuring processor and computer performance are changing. From an application perspective, a highly satisfactory combination of mobility and performance has been found, and it may trigger a rapid expansion of handheld devices. Although handheld devices are already fairly popular, in terms of computing power, scalability, and energy consumption they have not yet fully played the role a handheld device should. On the other hand, the performance-oriented server and desktop segments have begun to consider reducing power consumption, in keeping with the trend toward a conservation-oriented society.
Cell both adapts to this change and helps create it, which is why it has emphasized a different design style from the outset. Besides being able to scale out many times over, the SPU (Synergistic Processor Unit) inside the processor is highly scalable, so it can address both general-purpose and dedicated processing and allows processing resources to be reconfigured flexibly. This means that, under proper software control, Cell can handle many types of processing tasks while also reducing design complexity.
