What Is Code Refactoring?
Code refactoring means improving the internal structure of a software system without changing its external behavior.
- Chinese name: Code refactoring
- Foreign name: Refactoring
- Explanation: Does not change the software's existing functionality
- Features: Makes the code easier to understand
- Software refactoring is best done with the help of tools. A refactoring tool can modify a piece of code and, at the same time, update every place that references it. In the Extreme Programming methodology, refactoring must be backed by unit tests.
Overview of code refactoring
- Refactoring means adjusting program code to improve the quality and performance of the software, making its design patterns and architecture more sound and improving its extensibility and maintainability. Some people may ask: why not spend more time on design at the start of the project instead of spending time on refactoring later? The answer is that no design can perfectly anticipate every future change, and no design is flexible enough to accommodate every extension. System designers can usually steer an upcoming project only in broad strokes; they cannot know every detail, and the one thing that never changes is change itself. The users who supplied the requirements often start asking for changes only once the software has taken shape. The system designer is not a prophet, and changes in functionality inevitably lead to adjustments in the design. For this reason, "test first, refactor continuously" has been adopted by more and more people as a good development practice. Like the levees along the Yellow River, testing and refactoring have become the safeguard of software quality.
Why Refactor Code
- Refactoring changes how a system is implemented without changing what it does. Why do this? The effort does not go toward the requirements that customers care about; it only changes the software's implementation. Is that not a waste of the customer's investment?
- The importance of refactoring has to be understood from the software life cycle. Software differs from ordinary products: it is an intellectual product with no physical form. Software cannot suffer physical wear, and a button on the interface will never develop a bad contact from being pressed too often. So why can't a piece of software be used forever once it has been built?
- Only one factor threatens the life of software: changing requirements. Software is always built to meet a specific need, but times move on and the customer's business changes with them. Some requirements stay relatively stable, some change drastically, and others disappear or turn into different requirements. When that happens, the software has to change accordingly.
- Given constraints such as cost and time, not every requirement change has to be implemented in the software system. In general, though, software must adapt to changing requirements to stay alive.
- This produces an alarming phenomenon. A software product starts out carefully designed and well architected. As time passes and requirements change, existing functions must be modified repeatedly, new functions must be added, and defects inevitably need fixing. Implementing these changes unavoidably violates the original design framework. After a while the architecture is riddled with holes: bugs multiply, maintenance gets harder, new requirements get harder to implement, and the architecture stops supporting new requirements and becomes a constraint instead. Eventually the cost of implementing a new requirement exceeds the cost of developing a new piece of software. At that point the life of the system has come to an end.
- Refactoring can prevent this to a large extent. Once the system has reached a certain stage, refactoring reorganizes its internal structure without changing its external functionality. By continually adjusting the structure in this way, the system retains a strong ability to adapt to changing requirements.
- Refactoring can achieve the following goals:
- · Continuous correction and improvement of software design
- Refactoring and design complement each other. With refactoring you still do up-front design, but it does not have to be the optimal design; a reasonable solution is enough. Without refactoring, the design of a program gradually decays until it is like a kite with a broken string or a runaway mustang, impossible to control. Refactoring is, in effect, tidying up the code and pulling back everything that has started to drift.
- · Make the code easier for others to understand
- Martin Fowler has a classic line in "Refactoring": "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." I feel this deeply. Some programmers can always produce working code quickly, but the obscure names in that code leave readers so dizzy that they have to grip the arms of their chair. Imagine a new recruit arriving to take over such code: would he not want to desert?
- Over its life cycle, software is usually maintained by several generations of programmers, and we often forget about those who come later. To make code easy for others to understand, extra work is needed while implementing a feature, such as clear layout and concise comments; naming is an important part of this as well. One good technique is metaphorical naming: name an object vividly, or even anthropomorphically, after the function it performs. A good attitude is to name every code element as carefully as you would name a newborn child. I lean toward naming paranoia myself and would consider that label an honor.
- As for names that are confusing or even misleading, be decisive and rename them without mercy!
- · Helps find hidden code defects
- Confucius said: "Review the old and you will learn the new." Refactoring forces you to understand the code you originally wrote more deeply. I have often written a program and later been unable to follow its logic. This used to alarm me, until I discovered that it is a common cold that many programmers catch. When it happens to you, refactoring the code lets you deepen your understanding of the original design, uncover its problems and hidden dangers, and produce better code.
- · Improves programming efficiency in the long run
- When solving a problem becomes extremely complicated, the cause is often not the problem itself but the wrong approach. Poor design tends to produce bloated code.
- Improving the design, improving readability, and reducing defects all serve to keep the codebase on a firm footing. A good design is half the battle. Stopping to improve the design through refactoring may slow you down for the moment, but the later payoff should not be underestimated.
When to Refactor Code
- A new project starts with all the zeal of a newly appointed official. Working day and night, with overtime on top, a mighty army of code, driven by the programmers' passion and the clatter of keyboards, advances irresistibly, taking city after city and marching straight for "Huanglong Prefecture", the enemy capital.
- The development manager is the commander of this vast code army and holds its fate in his hands. When Duke Huan of Qi stood on a hilltop and watched the army trained by Guan Zhong march in perfect order, he exclaimed: "With an army like this, what victory need I fear?" Unfortunately, the army in your hands began as a ragtag band, recruited soldiers and bought horses along the way, and has kept growing, so it is bound to fall out of formation. When the development manager notices this, it may be time to resist the temptation of conquering the next hill and halt to put the ranks back in order.
- Kent Beck coined the term "bad smells in code", which means the same thing as the army falling out of formation described here. What are the signals? The following code symptoms are strong ones:
- · Duplicated code
- China has 118 vehicle manufacturers, almost as many as the United States, Japan, and Europe combined, yet the country's annual output is lower than that of a single large foreign automaker. Duplicated construction leads only to inefficiency and wasted resources.
- Program code should not be duplicated. If the same block of code appears more than once in one class, extract it into a separate method of that class. If the same code appears in different classes, extract it into a new class. Never repeat code.
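- The first case above (the same block repeated inside one class) can be sketched in Java roughly as follows. The class InvoicePrinter and its methods are hypothetical names invented for this illustration; the point is only that the summing loop, which both public methods once carried as their own copy, now lives in a single private method.

```java
import java.util.List;

// Hypothetical example: the class and method names are made up for illustration.
class InvoicePrinter {

    // Before the refactoring, each of the two public methods contained its own
    // copy of the loop that sums the amounts. Both now call total() instead.
    String summary(List<Long> cents) {
        return "Total: " + total(cents);
    }

    String summaryWithTax(List<Long> cents, int taxPercent) {
        long net = total(cents);
        return "Total incl. tax: " + (net + net * taxPercent / 100);
    }

    // The extracted method: the single home of the formerly duplicated block.
    private long total(List<Long> cents) {
        long sum = 0;
        for (long amount : cents) {
            sum += amount;
        }
        return sum;
    }
}
```

- If the same loop were also needed by other classes, the analogous step would be to move total() into a small class of its own, as the text suggests.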
- · Oversized classes and methods
- An oversized class is usually the result of poor class abstraction, and poor abstraction reduces code reuse. An oversized method is like a vassal state within the kingdom that has grown strong enough to threaten the central authority. Because of the complex logic it contains, an overly long method becomes sharply more error-prone, its readability plummets, and the robustness of the class is easily compromised. When you see an overly long method, find a way to split it into several small methods so that you can divide and conquer.
- · Small changes that shake the whole system
- If modifying or adding one small feature sets off an earthquake in the code, your design may be less than ideal and the code for that feature may be too scattered.
- · Too much communication between classes
- Class A calls too many methods of class B in order to get at B's internal data. In terms of their relationship, the two classes are a little too intimate. Perhaps they belong together rather than apart.
- · Overly coupled message chains
- "Computer science is the discipline that believes any problem can be solved by adding another intermediate layer," so programs often end up with too many intermediate layers. If getting one piece of information requires a method of one class to call a method of another class, which calls yet another, layer after layer like a pipeline, there are usually too many connecting layers. Check whether an intermediate layer can be removed, or whether a more direct way of calling can be provided.
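- As a minimal sketch of "providing a more direct way of calling", the Java fragment below shortens a two-step chain by letting the first object answer the question itself. Employee and Department are hypothetical classes chosen for the illustration, not taken from any particular project.

```java
// Hypothetical classes for illustration only.
class Department {
    private final String managerName;

    Department(String managerName) {
        this.managerName = managerName;
    }

    String getManagerName() {
        return managerName;
    }
}

class Employee {
    private final Department department;

    Employee(Department department) {
        this.department = department;
    }

    // Chain style: callers write employee.getDepartment().getManagerName(),
    // which ties every caller to the structure of Department.
    Department getDepartment() {
        return department;
    }

    // Direct style: callers ask the employee, and the hop stays hidden here.
    String getManagerName() {
        return department.getManagerName();
    }
}
```

- Whether the extra method is worth it depends on how many callers walk the chain; with only one caller, leaving the chain alone may be the simpler choice.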
- · Everyone staking out their own hill
- If you find two classes or two methods with different (or similar) names but similar or identical functionality, the cause is usually poor coordination within the development team. I once wrote a very handy string-processing class but did not tell the rest of the team in time, and later discovered that the project contained three string-processing classes. Revolutionary resources are precious; we should not each stake out our own hill to wage the revolution.
- · Imperfect design
- In a comparison-and-alarm project that the author recently completed, Azhu was assigned to develop the alarm module, which sends alarm messages over sockets to a designated SMS platform, voice platform, and client-side alarm plug-in. Azhu completed this task excellently. Later the users asked for real-time comparison: a third-party system sends a request to the comparison-and-alarm system in the form of a message, and the system receives and responds to it. This also requires socket-based message communication. Because the original design had not separated out a message communication module, the code Azhu had written could not be reused. I later adjusted the design and added a message transceiver module so that all of the system's external communication reuses it, which made the overall design more sound.
- Every system has design imperfections, large or small. They may go unnoticed at first and surface gradually later. When they do, having the courage to change the design is the best way forward.
- · Missing necessary comments
- Many software engineering books warn programmers against excessive commenting, but the worry seems unnecessary. Programmers are usually more interested in implementing features than in writing comments, because the former brings a sense of accomplishment. The problem with comments is therefore not that there are too many but that there are too few and that they are too terse. Human memory fades at a frightening rate, and when you go back to add comments some time later, you may well pick up the pen and find that you no longer remember what you meant to say.
- I have seen Microsoft code comments online. Their level of detail is astonishing, and it gave me a glimpse of one reason for Microsoft's success.
Difficulties of Code Refactoring
- When you learn a new technique that dramatically improves productivity, it is always hard to tell where it does not apply. You usually learn it in a specific setting, often a single project, and in that setting it is hard to see what might make the technique ineffective or even harmful. Ten years ago, object technology was in the same position. If someone had asked me then when not to use objects, I would have found it hard to answer. It is not that I considered objects perfect and free of limitations (that kind of blind faith is what I oppose most); rather, although I knew their benefits, I really did not know where their limits lay.
- The same is true of refactoring today. We know its benefits, and we know that it can change the way we work. But we have not yet accumulated enough experience to see its limitations.
- This section is shorter than I expected, at least for now. As more people learn refactoring techniques, we will learn more about it. You should try refactoring and enjoy the benefits it brings, but you should also keep an eye on the process and watch for the problems it may introduce. Please let us know about the problems you run into. As we learn more about refactoring, we will find more solutions and get a clearer idea of which problems are truly hard to solve.
- · Databases
- One area where refactoring often runs into trouble is databases. Most business applications are tightly coupled to the database schema (the table structure) behind them, which is one reason the schema is so hard to change. Another reason is data migration. Even if you carefully layer the system to minimize the dependency between the database schema and the object model, a schema change still forces you to migrate all the data, which can be long and tedious work.
- With non-object databases, one way to deal with this problem is to insert a separate layer between the object model and the database model, which isolates changes in each model from the other. When you upgrade one model you do not have to upgrade the other at the same time; you only update the separation layer. Such a layer adds complexity to the system but gives you a great deal of flexibility. If you have several databases, or if the database model is complex and hard to control, the separation layer is valuable even without refactoring.
- You do not need to insert the separation layer at the outset; you can create it when you find that the object model is becoming unstable. That way you get the most leverage from your changes.
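- One possible shape for such a separation layer is sketched below in Java. Everything here is an assumption made for illustration: the Customer class, the CustomerStore interface, and the column name "cust_name" are hypothetical, and an in-memory map stands in for a real table so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Object model: knows nothing about tables or column names.
final class Customer {
    final int id;
    final String fullName;

    Customer(int id, String fullName) {
        this.id = id;
        this.fullName = fullName;
    }
}

// The separation layer: the rest of the program talks only to this interface.
interface CustomerStore {
    void save(Customer customer);

    Optional<Customer> findById(int id);
}

// One implementation of the layer. The map of rows stands in for a database
// table; the column name is the only place where the schema is visible.
final class TableBackedCustomerStore implements CustomerStore {
    private static final String COL_NAME = "cust_name"; // assumed column name
    private final Map<Integer, Map<String, Object>> rows = new HashMap<>();

    @Override
    public void save(Customer customer) {
        Map<String, Object> row = new HashMap<>();
        row.put(COL_NAME, customer.fullName);
        rows.put(customer.id, row);
    }

    @Override
    public Optional<Customer> findById(int id) {
        Map<String, Object> row = rows.get(id);
        if (row == null) {
            return Optional.empty();
        }
        return Optional.of(new Customer(id, (String) row.get(COL_NAME)));
    }
}
```

- Under these assumptions, renaming the column touches only TableBackedCustomerStore, and renaming a field of Customer touches only the two mapping lines, which is the isolation the text describes.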
- For developers, object databases both help and hinder. Some object-oriented databases can migrate objects automatically between versions, which reduces the work of data migration but still costs some time. If migration is not automatic, you have to do it yourself, which is a lot of work. In that case you must pay closer attention to changes in the data structure of your classes. You can still move behavior between classes freely, but you must be careful when moving fields. Before the data is actually moved, use accessors to create the illusion that it has already been moved. Once you are sure where the data should live, you can migrate it in a single step. At that point the only thing that needs to change is the accessors, which also reduces the risk of errors.
- · Changing Interfaces
- One important property of objects is that they let you change the implementation of a software module separately from its interface. You can safely change the internals of an object without affecting anyone else, but the interface calls for special caution: if it changes, anything can happen.
- What has always been troubling about refactoring is that many refactorings do change an interface. A refactoring as simple as Rename Method (273) does nothing but change an interface. What does this do to the treasured notion of encapsulation?
- If every call to a function is under your control, changing the function's name causes no problem. Even for a public function, as long as you can reach and modify all of its callers, you can rename it safely. The interface change becomes a problem only when the interface is used by code that you cannot find, or that you can find but cannot change. When that is the case, I say the interface is a "published interface", which goes one step beyond a public interface. Once an interface is published, you can no longer change it safely just by editing its callers. You need a somewhat more complicated procedure.
- This idea changes the question. The question now is: what do you do about refactorings that must change a published interface?
- In short, if a refactoring changes a published interface, you must keep both the old and the new interface until all your users have had time to react to the change. Fortunately, this is not too hard. You can usually arrange things so that the old interface keeps working. Try to do this: make the old interface call the new one. When you want to rename a function, keep the old function and have it call the new one. Do not copy the function body, or you will sink into the mud of duplicated code. You should also mark the old interface as deprecated, using the deprecation facility that Java provides, so that your callers will notice.
- A good example of this process is the Java container classes. The new containers in Java 2 replaced some of the original ones. When the Java 2 containers were released, JavaSoft worked hard to give developers a smooth migration path.
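- The "old interface calls the new one" arrangement described above might look like this in Java. The Account class and both method names are hypothetical; the sketch assumes the method balance() has already been published and is being renamed to getBalanceInCents().

```java
// Hypothetical published class; the names are illustrative only.
class Account {
    private final long balanceInCents;

    Account(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    // New interface: the real implementation lives here.
    public long getBalanceInCents() {
        return balanceInCents;
    }

    /**
     * Old interface, kept so that existing callers still compile.
     *
     * @deprecated use {@link #getBalanceInCents()} instead.
     */
    @Deprecated
    public long balance() {
        // Delegate to the new method rather than copying its body.
        return getBalanceInCents();
    }
}
```

- With the @Deprecated annotation in place, the Java compiler warns callers that still use balance(), and the old method can be deleted once they have had time to migrate.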
- The "keep the old interface" approach is usually workable but annoying. For a while, at least, you have to build and maintain some extra functions, which complicate the interface and make it harder to use. Fortunately there is another option: do not publish the interface. I do not mean a total ban, because obviously you have to publish some interfaces. If you are building an API for outside use, as Sun does, you must publish interfaces. I say "try not to publish" because I often see development teams publish far too many. I once saw a team of three work like this: each person published interfaces to the other two. As a result they were constantly going back and forth maintaining interfaces, when they could simply have gone into the code base and changed the relevant part directly. Teams that put too much emphasis on code ownership often make this mistake. Publishing interfaces is useful, but it has a price, so do not publish one unless you really need to. This may mean changing your idea of code ownership so that anyone can modify other people's code to accommodate an interface change. Doing all of this with pair programming is usually a good idea.
- Do not publish interfaces prematurely. Change your code ownership policy to make refactoring smoother.
- Java has a special problem with changing interfaces: adding an exception to a throws clause. This is not a change to the signature, so you cannot cover it with delegation. But the compiler will not accept client code that has not been modified to match. This problem is hard to solve. You can choose a new name for the function, have the old function call it, and convert the checked exception into an unchecked one. You can also throw an unchecked exception directly, but then you lose the compiler's checking. If you do that, you can warn callers that the unchecked exception will become a checked exception in the future, which gives them time to add exception handling to their code. For this reason, I like to define a superclass exception for an entire package (such as SQLException for java.sql) and make sure that public functions declare only this exception in their throws clauses. I can then define subclass exceptions as I please without affecting callers, because callers only ever know about the more general superclass exception.
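- A minimal sketch of the package-level superclass exception idea follows. StorageException, RecordNotFoundException, and RecordStore are hypothetical names invented for this example; only the pattern (public methods declare the superclass, subclasses are added freely) comes from the text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical superclass exception for a whole package.
class StorageException extends Exception {
    StorageException(String message) {
        super(message);
    }
}

// A subclass added later; no throws clause has to change because of it.
class RecordNotFoundException extends StorageException {
    RecordNotFoundException(String key) {
        super("no record for key: " + key);
    }
}

class RecordStore {
    private final Map<String, String> records = new HashMap<>();

    public void put(String key, String value) {
        records.put(key, value);
    }

    // Public functions declare only the general superclass exception.
    public String load(String key) throws StorageException {
        String value = records.get(key);
        if (value == null) {
            throw new RecordNotFoundException(key);
        }
        return value;
    }
}
```

- Callers written against `throws StorageException` keep compiling when new subclasses appear, and those that care can still catch RecordNotFoundException specifically.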
- · Design changes that are difficult to accomplish through refactoring
- Can refactoring eliminate every design mistake? Are there core design decisions that refactoring cannot change? We do not yet have complete data in this area. In some cases we can refactor effectively, often to our own surprise, but some things are hard to refactor. In one project, for example, it was difficult (though still possible) to refactor a system built with no security requirements into one with good security.
- In such cases my approach is to imagine the refactoring first. When weighing candidate designs, I ask myself how hard it would be to refactor one design into another. If it looks easy, I do not worry much about picking the right one and simply choose the simplest design, even if it does not cover every potential requirement. If I cannot see a simple way to refactor ahead of time, I put more effort into the design. I find, however, that this rarely happens.
- · When shouldn't you refactor?
- Sometimes you should not refactor at all, for example when you should rewrite the code from scratch. Sometimes the existing code is such a mess that refactoring it would be harder than writing new code. This is a difficult decision to make, and I admit that I have no good guidelines for it.
- A clear signal for rewriting rather than refactoring is that the existing code does not work at all. You may discover this only when you try to test it and find that the code is so full of errors that it cannot run stably. Remember: before you refactor, the code must work correctly at least most of the time.
- A compromise is to refactor the large piece of software into well-encapsulated small components. You can then make the "refactor or rebuild" decision one component at a time. This is a promising approach, but I do not have enough data to write good guidelines for it. For an important legacy system, it is certainly a good direction to take.
- You should also avoid refactoring when the project is close to its deadline. The productivity gained from refactoring would show up only after the deadline has passed, when it is too late to matter. Ward Cunningham has a good way of looking at this. He describes unfinished refactoring as "debt". Many companies need to borrow in order to operate more efficiently, but borrowing means paying interest, and the extra cost of maintenance and extension caused by overly complex code is that interest. You can bear a certain amount of interest, but if it grows too high it will crush you. It is important to manage your debt well and to keep paying part of it off through refactoring.
- If the project is very close to its deadline, you should not let refactoring distract you, because there is no time left. However, experience from many projects shows that refactoring does improve productivity. If you end up without enough time, it is usually a sign that you need to refactor.
Code Refactoring and Design
- Refactoring has a special role: it complements design. When I first learned to program, I simply put my head down and wrote the program, muddling my way through development. I soon found, however, that up-front design spared me the high cost of rework, so I quickly adopted that style. Many people regard design as the key part of software development and programming as mere mechanical labor. To them, design is like drawing engineering blueprints and coding is like construction. But software is very different from physical machines. It is far more malleable, and it is entirely a product of thought. As Alistair Cockburn put it, "With design I can think very fast, but my thinking is full of little holes."
- One view holds that refactoring can replace up-front design: you do no design at all, just start coding with the first approach that comes to mind, get the code working, and then refactor it into shape. This approach really can work. I have seen people do exactly this and end up with well-designed software. Supporters of Extreme Programming [Beck, XP] strongly advocate this approach.
- Although refactoring alone can work, as described above, it is not the most efficient way to proceed. Even Extreme Programming enthusiasts do some up-front design. They try out different ideas with CRC cards or something similar until they have a first acceptable solution, and only then do they start coding and refactoring. The point is that refactoring changes the role of up-front design. Without refactoring, you are under great pressure to get the up-front design right, because any later change to the original design will be very expensive. So you spend more time and effort on up-front design to avoid changes later.
- With refactoring, the emphasis shifts. You still do up-front design, but you no longer have to find the one correct solution; a reasonable solution is enough. You know that as you implement this initial solution your understanding of the problem will deepen, and you may find that the best solution differs somewhat from what you first had in mind. With refactoring at hand this is not a problem, because refactoring keeps the cost of later changes low.
- An important consequence of this shift is that software design takes a big step toward simplicity. Before I used refactoring, I always looked for flexible solutions. Every requirement made me wonder how it would change over the life of the system. Because design changes were expensive, I wanted a solution flexible and robust enough to withstand every requirement change I could foresee. The trouble is that building a flexible solution comes at a cost that is hard to estimate. Flexible solutions are much more complex than simple ones, so the resulting software is generally harder to maintain; even when a change goes in the direction I anticipated, you still have to understand how to modify the design. If changes occur in only one or two places, that is no great matter, but changes can occur anywhere in the system. Building flexibility into every place where a change might occur makes the whole system much more complex and much more expensive to maintain. The biggest failure, of course, is when all that flexibility turns out to be unnecessary. You know that some of it will never be used, but you cannot predict which parts. To get the flexibility you want, you are forced to build in more flexibility than you actually need.
- With refactoring you handle the risk of change differently. You still think about potential changes, and you still consider flexible solutions. But instead of implementing those flexible solutions one by one, you ask yourself: "How hard would it be to refactor the simple solution into the flexible one?" If the answer is "quite easy", as it is most of the time, you just implement the simple solution.
- Refactoring leads to simpler designs without sacrificing flexibility, which makes the design process easier and less stressful. Once you have experienced the simplicity that refactoring brings, you do not even need to think about flexible solutions in advance; when you need one, you will have the confidence to refactor toward it. Just build the simplest system that works. As for flexible, complex designs, most of the time you will not need them.
- Nothing to Gain (Ron Jeffries)
- The payment process for Chrysler Comprehensive Compensation is too slow. Although our development is not over, this problem has begun to haunt us because it has dragged down the speed of testing.
- Kent Beck, Martin Fowler, and I decided to solve this problem. While waiting for everyone to meet, with my overall understanding of the system, I began to speculate: What is it that makes the system slow? I thought of several possibilities, and then I talked with my partners about several possible modifications. Finally, we have some really good ideas about "how to make this system run faster".
- Then, we used Kent's measurement tools to measure the system performance. None of the possibilities I initially thought were the cause of the problem. We found that the system uses half the time to create a "date" instance. What's more interesting is that all these entities have the same value.
- So we looked at the logic of the date creation and found the opportunity to optimize it. Dates were originally created from string conversions, even without external input. The reason for using the string conversion method is entirely for the convenience of keyboard input. OK, maybe we can optimize it.
- So we observe how dates are used by this program. We found that many date objects are used to generate "date interval" instances. A "date interval" is an object that consists of a start date and an end date. Tracking it down, we find that most date ranges are empty!
- When dealing with date ranges, we follow this rule: if the end date is before the start date, the date range should be empty. This is a good rule, which fully meets the needs of this class. Soon after adopting this rule, we realized that creating a date range "start date after end date" is still not clear code, so we refined this behavior into a factory method. Patterns, see "Design Patterns"), which specifically creates "empty date intervals".
- We made the above changes to make the code clearer, but we were pleasantly surprised. We create a fixed "empty date interval" object and let the adjusted factory method return the object every time instead of creating a new object each time. This modification nearly doubled the system speed, enough to make the test speed acceptable. It only took us about five minutes.
- My teammates (Kent and Martin declined to participate) seriously speculated: what might be wrong in this program that we know well? We even made some improvement designs out of thin air, but did not measure the actual situation of the system first.
- We were totally wrong. Apart from having a very interesting conversation, we accomplished nothing.
- Lesson: even if you fully understand the system, actually measure its performance; do not speculate. You will learn something, and in all likelihood it will be that your guess was wrong.
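- The date range change described above can be sketched in Python as follows. This is only an illustration: the original system was not written in Python, and the names `DateRange`, `empty`, `is_empty`, and `_EMPTY_RANGE` are assumptions made for this sketch, not names from the original code. The factory method hands back one shared, pre-built empty range rather than constructing a new object on each call.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """A start date and an end date (hypothetical class for this sketch)."""

    start: date
    end: date

    def is_empty(self) -> bool:
        # Rule from the text: a range that ends before it starts is empty.
        return self.end < self.start

    @classmethod
    def empty(cls) -> "DateRange":
        # Factory method: return the shared constant instead of building
        # a new "start after end" range on every call.
        return _EMPTY_RANGE


# Built once, directly from date values, with no string conversion.
_EMPTY_RANGE = DateRange(start=date.max, end=date.min)
```

- Because the class is immutable (`frozen=True`), sharing one instance among all callers is safe. Whether such a change pays off in a given program is still something to measure rather than assume.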
Code refactoring and performance
Code refactoring annotations
- Translator's note: In my experience, different people interpret and render the term "performance" differently: as efficiency, performance, or effectiveness. Usage also differs between regions (such as Taiwan and mainland China). In this book, wherever "performance" appears I translate it with a single consistent term, and I likewise keep "efficient" and "effective" distinct.
- A common concern about refactoring is how it affects a program's performance. To make software easier to understand, you often make changes that cause the program to run slower. This is an important issue. I do not agree with ignoring performance for the sake of design purity, or with pinning one's hopes on faster hardware. A lot of software has been rejected by users for being too slow, and faster machines have only slightly relaxed the speed constraints. On the other hand, although refactoring may well make software run slower, it also makes the software easier to tune for performance. Except in real-time systems with strict performance requirements, the secret to "writing fast software" is to write tunable software first and then tune it until it is fast enough.
- I have seen three approaches to "writing fast software". The strictest is "time budgeting", which is usually used only in real-time systems with extremely demanding performance requirements. With this approach, you budget as you decompose the design, allotting each component a fixed amount of resources in advance, including time and footprint. No component may exceed its budget, although a mechanism for trading budgeted time between components is allowed. This approach puts great weight on performance and is necessary for systems such as pacemakers, where late data is wrong data. For other kinds of systems (such as the enterprise information systems I usually work on), this degree of performance pursuit is excessive.
- The second approach is "constant attention". It asks every programmer, all the time, to do whatever can be done to keep performance high. The approach is common and feels attractive, but it usually does not help much. Changes made for the sake of performance tend to make the program harder to work with, which slows development. That cost would be worth paying if the resulting software really were faster, but usually it is not, because the performance improvements are scattered across every corner of the program and each one is made with only a narrow view of how the program behaves.
- An interesting thing about performance is that if you analyze most programs, you find that they spend most of their time in a small fraction of the code. If you optimize all the code indiscriminately, 90% of the optimization work is wasted, because much of the code you optimize is rarely executed. The time you spend making the program faster, and the time you lose because the optimized program is harder to understand, is all wasted.
- The third approach takes advantage of this "90%" statistic. You build your program in a well-factored manner without paying attention to performance until you reach a performance optimization phase, which usually comes fairly late in development. Once in that phase, you tune the program's performance by following a specific procedure.
- In the performance optimization phase, you first run the program under a measurement tool that monitors it and tells you where it is consuming time and space. This lets you find the small section of code that contains the performance hot spot. You then concentrate on these hot spots and optimize them with the same techniques you would use under the "constant attention" approach described above. Because you are focusing on the hot spots, much less work yields much better results. Even so, you must be cautious. As with refactoring, you make the changes in small steps, and after each step you compile, test, and measure again. If a change does not improve performance, you undo it. You continue this process of finding and removing hot spots until the performance satisfies your customers. McConnell [McConnell] provides more information on this technique.
- A well-factored program helps this style of optimization in two ways. First, it gives you more time for performance tuning: with well-factored code you can add features faster, which leaves more time to spend on performance (and accurate measurement ensures that you spend that time in the right place). Second, a well-factored program gives you finer granularity for performance analysis: the measurement tool leads you to smaller sections of the program, which are easier to tune. And because the code is clearer, you better understand your options and know which adjustments will matter.
- I have found that refactoring helps me write faster software. In the short term, refactoring does make the software slower, but it makes the software easier to tune during the optimization phase. In the end I come out ahead.
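- The "measure first, then optimize the hot spot" step can be tried with Python's standard `cProfile` and `pstats` modules. The sketch below is a minimal illustration under stated assumptions: `date_from_text`, `build_dates`, and `profile_report` are hypothetical names invented here, and the workload is a toy that deliberately creates dates through string conversion.

```python
import cProfile
import io
import pstats
from datetime import date


def date_from_text(text: str) -> date:
    """Hypothetical hot spot: every date is created via string conversion."""
    return date.fromisoformat(text)


def build_dates(count: int) -> list:
    """Toy workload that creates the same date many times."""
    return [date_from_text("2000-01-01") for _ in range(count)]


def profile_report(func, *args, top: int = 10) -> str:
    """Run func under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    # The report lists where time went; optimize only what it points to,
    # then run it again to confirm the change actually helped.
    print(profile_report(build_dates, 100_000))
```

- Running the report again after each small change gives the "compile, test, and measure again" loop described above; a change that does not show up as an improvement in the report is a candidate for being undone.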
Code refactoring optimization
- Optimizing a Payroll System (Rich Garzaniti)
- We had been developing the Chrysler Comprehensive Compensation system for a long time before we moved it to GemStone. Inevitably, during development we found that the program was not fast enough, so we brought in Jim Haungs, a master GemSmith, to help us optimize the system.
- Jim spent a little time with the team to learn how the system worked, then wrote a performance measurement tool based on GemStone's ProfMonitor feature and plugged it into our functional tests. The tool showed how many objects the system was creating and where they were being created.
- To our surprise, the most frequently created object was a string. The biggest workload was the repeated creation of 12,000-byte strings. This was a particular problem because the string was so large that GemStone's usual garbage collection facilities could not handle it. Because it was so large, GemStone paged it to disk every time it was created. In other words, creating the string actually involved the I/O subsystem (translator's note: paging uses I/O), and the string was being created three times for every record that was output.
- Our first solution was to cache a single 12,000-byte string, which solved most of the problem. Later we changed the code to write directly to a file stream, so the string no longer had to be generated at all.
- After the "giant string" problem was solved, Jim's measurement tool found some similar problems with slightly smaller strings: 800 bytes, 500 bytes, and so on. We switched those to file streams as well, which solved the problem.
- Using these techniques, we steadily improved the system's performance. The payroll calculation, which during development had needed more than 1,000 hours to complete, took only 40 hours by the time we were actually ready to run it. After a month we had brought that down to 18 hours, and at the official launch it took only 12 hours. After a year of operation and improvement, the full calculation takes only 9 hours.
- Our biggest improvement was to put the program on a multiprocessor machine and run it in multiple threads. The system had not originally been designed with multithreading in mind, but because the code was well factored, we needed only three days to make it run in multiple threads simultaneously. Now the payroll calculation takes only 2 hours.
- Before Jim gave us a tool that measured what the system was actually doing, we had our own guesses about the problem. But had we relied on guessing alone, it would have taken us a long time to arrive at the real solution. The real measurements pointed in a completely different direction and greatly accelerated our progress.
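- The change from "build one giant string per record" to "write directly to a stream" might look like the following in Python. This is a rough sketch only: the payroll system ran on GemStone, not Python, and the function names `render_record_as_string` and `write_record` and the one-field-per-line record layout are assumptions made for illustration.

```python
import io
from typing import Iterable, TextIO


def render_record_as_string(fields: Iterable[str]) -> str:
    """Before: the whole record is assembled in memory as one large string."""
    return "".join(f"{field}\n" for field in fields)


def write_record(fields: Iterable[str], out: TextIO) -> None:
    """After: each field goes straight to the stream; no large string is built."""
    for field in fields:
        out.write(field)
        out.write("\n")


if __name__ == "__main__":
    # An in-memory stream stands in for a real file opened with open().
    stream = io.StringIO()
    write_record(["employee-1", "salary", "tax"], stream)
    print(stream.getvalue(), end="")
```

- Both versions produce the same output; the difference lies only in whether a large intermediate object exists. How much that matters depends on the runtime, which is the point of the story: the team learned it from measurement, not from guessing.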