What Is a Platform Frame?
Website architecture is generally understood as the work of using the results of customer requirements analysis to identify the website's target audience accurately, define the site's overall structure, plan and design its columns and their content, and lay out the development process and its sequence, so that resources are allocated and managed as efficiently as possible. It takes three forms: program architecture, presentation architecture, and information architecture. The work is divided into two main steps, hard architecture and soft architecture. Network architecture is an essential foundational technology for learning about and developing modern networks.
Website architecture
- Static HTML
- It is well known that purely static HTML pages are the most efficient to serve and consume the fewest resources, so we try to serve as many of a site's pages as possible as static pages. This simplest method is also the most effective one. However, for websites with a large amount of frequently updated content, we cannot produce every page by hand, so we turn to the familiar content management system (CMS). The news channels of the portal sites we visit every day, and even their other channels, are managed and published through such a system. A CMS can handle the simplest kind of information entry and generate static pages automatically, and it can also provide channel management, permission management, and automatic crawling. For a large website, an efficient, manageable CMS is essential.
- Beyond portals and information-publishing sites, community sites that demand high interactivity also need to make pages as static as possible to improve performance. Posts and articles in the community are rendered to static pages in real time and re-rendered when they are updated. This strategy is widely used: Mop's Hodgepodge uses it, as does the NetEase community.
- Static HTML is also used as a caching strategy. For parts of the system that query the database frequently but whose content rarely changes, consider serving them as static HTML. A forum's public settings are an example: most mainstream forums let administrators manage this information in the back end and store it in the database. The front-end program reads it constantly, but it is updated very rarely. This content can be rendered to static files whenever it is updated in the back end, which avoids a large number of database access requests.
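- The idea of generating a static page at update time, rather than on every request, can be sketched as follows. This is a minimal illustration and not how any particular CMS works; the template, the function names `render_article` and `publish_article`, and the `article-<id>.html` file naming are all assumptions made for the example.

```python
import html
from pathlib import Path

# Hypothetical page template; a real CMS would use its own template system.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
<div class="content">{body}</div>
</body>
</html>
"""


def render_article(title: str, body: str) -> str:
    """Render one article to an HTML string, escaping user-supplied text."""
    return PAGE_TEMPLATE.format(title=html.escape(title), body=html.escape(body))


def publish_article(output_dir: Path, article_id: int, title: str, body: str) -> Path:
    """Write the static page for an article; call this on create and on update.

    The page is written to a temporary file first and then renamed, so the
    web server never serves a half-written file.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"article-{article_id}.html"
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(render_article(title, body), encoding="utf-8")
    tmp.replace(target)
    return target
```

- Under this assumption, the back end calls `publish_article` whenever an article or a settings page is saved, and the web server serves the resulting files directly without touching the database.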
- Image server separation
- For a web server, whether Apache, IIS, or another container, images are the most resource-intensive content to serve, so it makes sense to separate images from pages. Practically every large website adopts this strategy: they run independent image servers, often many of them. This architecture reduces the load on the servers that handle page requests and ensures that image problems cannot bring the whole system down. The application servers and the image servers can also be tuned differently. For example, Apache on an image server can be configured to support as few content types as possible and to load as few modules (LoadModule) as possible, which lowers resource consumption and improves execution efficiency.
- Image hosting application
- You sometimes upload pictures online yourself, and some large websites need to host and deliver a very large number of them. Building an architecture for this that is cost-effective, highly available, and low-latency (fast retrieval) is a challenge.
- In an image hosting system, users upload pictures to a central server and request them through a web link or an API, just as on Flickr or Picasa. For simplicity, assume the application has only two core parts: uploading (writing) pictures and retrieving them. Uploads should be efficient, but what matters most is that delivery is fast when someone requests a picture (for example, from a web page or another application). This is very similar to what a web server or a content delivery network (CDN) edge server provides. A CDN stores content in many locations so that it is geographically or physically closer to the user, which gives faster performance.
- Database clustering and database/table hashing
- Large websites run complex applications, and these applications depend on databases. Under heavy traffic the database quickly becomes the bottleneck, and a single database can no longer serve the application fast enough, so we need a database cluster or database/table hashing. For clustering, many databases ship their own solutions: Oracle, Sybase, and others have very good ones, and the commonly used Master/Slave replication in MySQL is a similar approach. Whichever database you use, follow its corresponding solution.
- The database clusters described above are constrained by the database product in use when it comes to architecture, cost, and scalability, so we also need to improve the architecture from the application side. Database/table hashing is the most common and most effective approach. We split the database inside the application by business area, application, or functional module, so that different modules map to different databases or tables. The data of a given page or function is then hashed into smaller databases or tables according to some rule. The user table, for example, can be hashed by user ID. This improves performance at low cost and scales well. Sohu's forum uses this structure: it separates users, settings, and posts into different databases, then hashes the post and user databases and tables by section and ID. In the end only a simple change in the configuration file is needed, which lets the system add an inexpensive database at any time to increase capacity.
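- Hashing users by ID and posts by section and ID can be sketched as below. This is only an illustration under assumed names: the database names, table counts, and the functions `locate_user_table` and `locate_post_table` are hypothetical and do not describe Sohu's actual layout.

```python
import zlib
from typing import Tuple

# Hypothetical configuration; in practice this would live in a config file.
USER_DATABASES = ["user_db_0", "user_db_1"]
USER_TABLES_PER_DATABASE = 4
POST_DATABASES = ["post_db_0", "post_db_1", "post_db_2"]
POST_TABLES_PER_DATABASE = 8


def locate_user_table(user_id: int) -> Tuple[str, str]:
    """Map a user ID to (database, table) by simple modulo hashing."""
    total_slots = len(USER_DATABASES) * USER_TABLES_PER_DATABASE
    slot = user_id % total_slots
    database = USER_DATABASES[slot // USER_TABLES_PER_DATABASE]
    table = f"users_{slot % USER_TABLES_PER_DATABASE}"
    return database, table


def locate_post_table(section: str, post_id: int) -> Tuple[str, str]:
    """Pick the database from the section name and the table from the post ID.

    zlib.crc32 is used instead of the built-in hash() because hash() of a
    string can differ between Python processes.
    """
    db_index = zlib.crc32(section.encode("utf-8")) % len(POST_DATABASES)
    table_index = post_id % POST_TABLES_PER_DATABASE
    return POST_DATABASES[db_index], f"posts_{table_index}"
```

- With this layout, `locate_user_table(5)` resolves to table `users_1` in `user_db_1`. Note that with plain modulo hashing, changing the number of databases or tables remaps existing keys, so a real deployment needs a migration or lookup strategy on top of this sketch.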
- Caching
- Anyone in a technical field has come across the term cache, and caches are used in many places. Caching is also very important in website architecture and website development. This section covers the two most basic kinds of caching. Advanced and distributed caches are described later.
- Architectural caching: anyone familiar with Apache knows that it ships its own caching module, and Squid can also be added in front of it as a separate caching proxy. Both approaches effectively improve how quickly Apache responds to requests.
- Application-level caching: the MemoryCache available on Linux is a commonly used cache interface for web development. In Java, for example, you can call MemoryCache to cache some data and to share it between processes, and some large communities use this architecture. In addition, most web development languages have their own cache modules and methods: PHP has PEAR's Cache module, Java has even more options, and .NET, which I am less familiar with, surely has its own. [1]
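- The application-level pattern described above (look in the cache first, fall back to the database, then store the result) can be sketched with a small in-process cache. The class `TTLCache` and its methods are hypothetical names for this example. A shared cache service would replace the dictionary in a multi-server deployment.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """A tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader() and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: entry has not expired yet
        value = loader()  # cache miss: e.g. run the database query
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, for example right after the underlying data is updated."""
        self._entries.pop(key, None)
```

- A forum could, for instance, wrap its settings query as `cache.get_or_load("forum_settings", load_settings_from_db)`, where `load_settings_from_db` is a placeholder for the real query, and call `invalidate` from the admin back end when settings change.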
- Mirroring
- Mirroring is a technique large websites often use to improve performance and data security. It evens out the differences in access speed that users experience because of different network providers and regions. For example, the gap between ChinaNet and EduNet has led many websites to set up a mirror site inside the education network, with data synchronized on a schedule or in real time. I will not go into the technical details of mirroring here. There are many mature professional architectures and products to choose from, as well as inexpensive software approaches such as rsync on Linux.
- Load balancing
- Load balancing is the ultimate solution for large websites that must handle high load and a large number of concurrent requests.
- Load balancing technology has been developing for many years, and there are many professional vendors and products to choose from. I have worked with some of these solutions myself, and two architectures are described here for reference.
- Hardware Layer 4 Switching
- Layer 4 switching uses the header information of Layer 3 and Layer 4 packets to identify traffic flows by application port range, and forwards the traffic for each range to the appropriate application server. A Layer 4 switch acts like a virtual IP that points to physical servers. The traffic it carries follows a variety of protocols, including HTTP, FTP, NFS, Telnet, and others, and these services need sophisticated load-balancing algorithms on top of the physical servers. In the IP world, the service type is determined by the destination TCP or UDP port, and an application range in Layer 4 switching is determined jointly by the source and destination IP addresses and the TCP and UDP ports.
- Among hardware Layer 4 switches there are well-known products to choose from, such as Alteon and F5. They are expensive but worth the money, offering very good performance and flexible management. Yahoo China handled close to 2,000 servers with just three or four Alteon units.
- Software Layer 4 Switching
- Once the principle of the hardware Layer 4 switch was understood, software Layer 4 switching based on the OSI model emerged in response. It works on the same principle, with somewhat lower performance, but it can still handle a fair amount of load comfortably. Some argue that the software approach is actually more flexible, and that its processing capacity depends on how well you know your configuration.
- Software Layer 4 switching can be implemented with LVS (Linux Virtual Server), which is widely used on Linux. It provides heartbeat-based real-time failover, which improves the robustness of the system, and it offers flexible virtual IP (VIP) configuration and management that can serve several applications at once, which is essential for distributed systems.
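- The core decision a Layer 4 balancer makes, choosing a real server from the connection's addresses and ports while skipping servers whose heartbeat has failed, can be sketched as follows. This is a conceptual illustration only and not the LVS API. The class `Layer4Dispatcher`, its methods, and the hashing rule are assumptions for the example.

```python
import zlib
from typing import Dict, List


class Layer4Dispatcher:
    """Maps a connection (source IP, source port, destination port) to a backend."""

    def __init__(self, backends: List[str]) -> None:
        self._backends = list(backends)
        # Health state would normally be driven by a heartbeat check.
        self._healthy: Dict[str, bool] = {b: True for b in self._backends}

    def mark(self, backend: str, healthy: bool) -> None:
        """Record the latest heartbeat result for one backend."""
        self._healthy[backend] = healthy

    def pick(self, src_ip: str, src_port: int, dst_port: int) -> str:
        """Choose a healthy backend; the same connection always maps the same way."""
        alive = [b for b in self._backends if self._healthy[b]]
        if not alive:
            raise RuntimeError("no healthy backend available")
        key = f"{src_ip}:{src_port}->{dst_port}".encode("ascii")
        return alive[zlib.crc32(key) % len(alive)]
```

- Clients only ever see the virtual IP. The dispatcher decides which physical server receives each connection, and a failed server drops out of rotation as soon as `mark(backend, False)` is called for it.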
- A typical load-balancing setup builds a Squid cluster on top of software or hardware Layer 4 switching. Many large websites, including search engines, use this approach. The architecture is low in cost, high in performance, and highly scalable: nodes can easily be added or removed at any time. I plan to set aside time to discuss this architecture in detail.
- At the web layer, a site under relatively heavy load is now served by a web application server, and its ability to handle concurrent requests has in fact exceeded expectations. At the file server layer, as later promotion raised the site's profile, page views (PV) climbed steadily and the servers became less and less able to cope. At the database layer, the load is handled in a way that guarantees high data availability, although the cost is also very high. [1]
- A large website may use every one of the methods above at the same time. This is only a brief introduction, and many implementation details have to be learned through hands-on experience. Sometimes a single small Squid or Apache parameter has a large effect on system performance. I hope this overview prompts everyone to join the discussion and contribute better ideas.