What Is the Purpose of Standardized Tests in Schools?

A standardized test, also called a standardized examination, is a test in which every stage, including test purpose, item writing, administration, scoring, and score interpretation, is organized under unified standards and systematic scientific procedures, so that measurement error is strictly controlled.



An examination is a measurement of a person's psychological characteristics: it infers an examinee's overall behavior from an observed sample of that behavior. Like any measurement, it contains error. This is true of physical measurement, and an examination, as a psychological measurement, is even more susceptible to interference from irrelevant factors. Measurement theory tells us that the scientific soundness and fairness of an exam can be guaranteed only by minimizing the influence of these irrelevant factors. Standardization is therefore an important way to make examinations scientific and fair and to ensure their quality.
"Standardization" refers to the process of minimizing test error, including unified content, unified instructions, unified time limits, unified scoring, establishing norms, and collecting reliability and validity data (Xie Xiaoqing, 1988). The book "Standardized Examination", edited by the former National Education Commission Examination Management Center, divides the stages of a standardized test roughly into "standardization of test preparation, standardization of test implementation, standardization of scoring, and standardization of score conversion and interpretation". Some scholars believe that more than these stages need to be standardized. In addition to "strictly controlling errors in the four stages of item writing, administration, scoring, and score interpretation", it is also necessary to "pretest the items, conduct DIF (differential item functioning) analysis and score equating, and reasonably determine the test length and passing line" (Xu Jing et al., 2004). In short, a standardized test controls every aspect of the test so that it is carried out according to fixed standards, eliminating as far as possible the error factors unrelated to the purpose of the test, so that the share of the differences in test scores attributable to true individual differences between examinees is as large as possible.
Because standardized tests are poorly understood, many people hold misconceptions about them. As examination researchers, we need to explain what standardization actually means to the public.
1. Standardized tests are not the same as multiple-choice questions. When standardized tests are mentioned, many people think of four-option multiple-choice questions with a single standard answer. Many critics believe that this "standard" form stifles examinees' creativity and fails to test their true ability, and that it is a "mechanized, formulaic, conceptualized" method of examination. In fact, standardized tests are not so named because they have standard answers, and their question types are not limited to multiple choice.
Multiple-choice questions were invented by A. S. Otis during the First World War and have been widely used in standardized tests ever since. They greatly reduce scoring costs and scoring errors, and they allow a test to cover more content, which reduces sampling error and improves validity. Many people find multiple-choice questions rigid and doubt that they can test candidates' true ability. In fact, multiple-choice questions "can measure not only general-level learning outcomes, but also high-level competencies in understanding, using, analyzing, synthesizing, and evaluating" (Zhang Minqiang, 1998). If this question type has deficiencies, they lie mainly in how the items themselves are written. Low-quality multiple-choice questions test only rote memorization, while high-quality ones can test candidates' higher-level abilities. Many high-quality standardized tests (such as the TOEFL and the SAT) continue to use multiple-choice questions, which indicates that this question type has real advantages.
Of course, the author does not deny that multiple-choice questions have disadvantages:
(1) Writing good multiple-choice questions takes time, especially writing the distractors.
(2) This question type has difficulty measuring important abilities that are divergent in nature, such as expression and creativity.
(3) It cannot measure the student's thinking process.
(4) Answers can be guessed correctly. (Zhang Minqiang, 1998)
Therefore, other standardized question types need to be added to a standardized test, along with subjective question types such as short answer and essay writing, so that the test can assess the examinee's abilities comprehensively.
2. Subjective questions can also be standardized. Generally speaking, subjective questions are more difficult to standardize than objective questions. First, from the perspective of item writing, subjective questions take longer to answer, so a test can include only a few of them. Sampling error therefore arises easily when the questions are set, which affects the validity of the test. For example, an essay question entitled "Internet Age" may disadvantage candidates who have had no exposure to computers or the Internet.
Second, from the perspective of scoring, subjective questions easily produce scoring error, because they generally have no standard answers, only scoring criteria. Different raters understand the scoring criteria differently, and the scores they give to the same answer sheet may differ widely. On the one hand, this requires examination institutions to train the core raters rigorously and, through trial scoring, to make their understanding of the criteria as consistent as possible. On the other hand, the quality of the raters' marking must be monitored and controlled. Only in this way can the error in subjective questions be reduced and the questions become truly "standardized".
3. The purpose of standardized tests is to improve efficiency. Standardized examinations, like modern education, are an inevitable result of mass production. There was once a view that school education should put all students on the same production line. This idea exaggerates what students have in common and ignores their individuality. In modern society, however, school education improves the efficiency of education, so that almost all members of society can enjoy privileges that only a few used to enjoy. For that gain, a certain sacrifice is worthwhile. Standardized tests are likewise, in a sense, a way of sacrificing some effectiveness to gain efficiency. Standardized examinations greatly improve the efficiency of evaluation and reduce its cost, and a certain sacrifice is the inevitable price. Exams may be invalid or unfair for some individuals, but for society as a whole they are effective and fair. If social resources become extremely abundant in the future, so that time and cost no longer matter, standardized examinations may be replaced by more accurate interviews or long-term observation. However, current conditions show that this is difficult to achieve even in the United States, which has relatively abundant social and natural resources. Therefore, under the current circumstances, especially in China, we should not hold excessive expectations of standardized examinations or regard them as a perfect selection system. That is to say, examinations can only be a means of improving the efficiency of talent selection, and only an auxiliary one.
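The rater monitoring described under point 2 above can be illustrated with a minimal sketch. It assumes a double-marking scheme in which two raters score the same essays and any script whose two scores differ by more than a chosen tolerance is sent for re-marking. The function names, the sample scores, and the tolerance of 3 points are all hypothetical and are not taken from any particular examination.

```python
def flag_discrepant_scripts(rater_a, rater_b, max_gap):
    """Return the indices of scripts whose two scores differ by more than max_gap."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same scripts")
    return [
        i for i, (a, b) in enumerate(zip(rater_a, rater_b))
        if abs(a - b) > max_gap
    ]


def mean_absolute_gap(rater_a, rater_b):
    """Average absolute score difference between two raters (lower means more consistent)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical essay scores from two raters on the same five scripts.
scores_a = [12, 15, 9, 18, 14]
scores_b = [13, 10, 9, 17, 20]

print(flag_discrepant_scripts(scores_a, scores_b, max_gap=3))  # [1, 4]
print(mean_absolute_gap(scores_a, scores_b))                   # 2.6
```

A real scoring operation would likely use more refined agreement statistics, but even this simple check shows how scoring error on subjective questions can be made visible and controlled.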

As we all know, the purpose of an exam is to give test users a reference for their decisions through the interpretation of test scores. Misuse of a test is, in essence, an inappropriate interpretation of its scores. Such improper interpretations distort decision-making and thus undermine the validity of the exam. We therefore dare to say that there is no such thing as a low-validity test, only a low-validity interpretation of scores.
At present, inappropriate interpretations of scores in China mainly take the following forms:
1. Lack of a norm for reference. At present, when test scores in China are given a norm-referenced interpretation, the scores are usually reported in isolation, without the relevant information about the norm. As a result, the scores cannot be interpreted appropriately, and decision-making suffers. For example, suppose an examinee scores 80 points on an exam. Strictly speaking, no judgment about the examinee's performance can be made from this score alone, because the examinee may be the best in the group or the worst. But if the parameters that describe the norm are known, such as the mean and the standard deviation, a judgment becomes possible. For example, Wechsler IQ scores have a mean of 100 and a standard deviation of 15. An IQ score of 115 is one standard deviation above the mean, which means the person's IQ is higher than that of about 84% of people. Such information is obviously more useful to decision makers.
2. Lack of the necessary description of the score. When test scores are given a criterion-referenced interpretation, reporting only the scores, without describing the level the candidate has reached, leaves the decision maker without sufficient information. In a criterion-referenced interpretation, the mean and standard deviation matter little, so test users pay them little attention. What matters is where the standard or passing line is set, together with a description of what examinees who reach that standard are able to do. With this information, the decision maker knows not only the candidate's level of ability but also what the candidate can actually do. The TOEIC test developed by ETS in the United States describes in considerable detail what the candidate "can do" when reporting results, so that decision makers have a clearer understanding of the candidate's ability.
3. Exaggerating the power of standardized tests. The function of any test is limited: a test examines a candidate's ability in one or a few respects. An examination is effective only when it is used to evaluate what it can measure. A math test can be used only to evaluate candidates' mathematical ability. Using it to evaluate candidates' Chinese ability would be not only invalid but absurd. Previously, many colleges and universities in China tied the College English Test Band 4 and Band 6 (CET-4 and CET-6) to their graduation certificates. This practice assumed that a qualified college graduate must have a good level of English. Although China is increasingly connected to the world and international exchanges are becoming more frequent, not all university graduates need to take part in them. Some graduates may rarely use English in their work. Do these people need a good level of English? Imagine a college graduate with excellent professional performance who fails to obtain a graduation certificate because of failing CET-4 or CET-6, and so misses a position in which those professional skills could be put to use. Isn't that a shame? Tying graduation certificates to CET-4 and CET-6 is therefore a misuse of these tests. For this reason, some people have criticized CET-4 and CET-6 and argued that they should be abolished. Although this view is too extreme, it does show that, in tests that can determine a person's fate, the scores of CET-4 and CET-6 need to be interpreted appropriately, so that these examinations do what they are meant to do: evaluate students' English proficiency.
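The norm-referenced interpretation described in point 1 above can be made concrete with a short sketch. It assumes that scores in the norm group are approximately normally distributed. The function name and the two norm groups used for the raw score of 80 are hypothetical. The IQ figures (mean 100, standard deviation 15) come from the text.

```python
from statistics import NormalDist


def norm_referenced_report(score, norm_mean, norm_sd):
    """Return (z-score, percentile rank) of a score against a norm group.

    Assumes the norm group's scores are approximately normally distributed.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (score - norm_mean) / norm_sd
    percentile = NormalDist().cdf(z) * 100  # share of the norm group scoring lower
    return z, percentile


# The IQ example from the text: 115 is one standard deviation above the mean.
z, pct = norm_referenced_report(115, norm_mean=100, norm_sd=15)
print(f"z = {z:.2f}, percentile = {pct:.1f}")  # z = 1.00, percentile = 84.1

# The same raw score of 80 against two hypothetical norm groups.
for mean, sd in [(60, 10), (90, 5)]:
    z, pct = norm_referenced_report(80, mean, sd)
    print(f"norm mean {mean}, sd {sd}: z = {z:+.1f}, percentile = {pct:.1f}")
# norm mean 60, sd 10: z = +2.0, percentile = 97.7
# norm mean 90, sd 5: z = -2.0, percentile = 2.3
```

The second loop shows why a raw score reported in isolation says so little: the same 80 points is near the top of one group and near the bottom of the other.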

A standardized examination, as the name implies, is an examination conducted in accordance with a standard. But which standard? In the United States, the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education have jointly developed the "Standards for Educational and Psychological Testing". Standardized tests in the United States need to conform to these "Standards" in test preparation, administration, scoring, and quality analysis. China does not have "standards" of its own. Strictly speaking, therefore, China does not yet have standardized tests of its own. In many respects, we can only learn from the experience of other countries.
China is the birthplace of examinations, but it lags far behind the West in modern examination technology. From the end of the 19th century to the beginning of the 20th century, the development of Western experimental psychology and psychological testing promoted the development of examinations. From 1909 to 1915, educational tests gradually multiplied, and research on testing entered a period of prosperity. The Stanford Achievement Test came out in 1922 and gradually became popular. At that time, not only subject tests but also diagnostic tests and practice tests were developed, which fostered a culture of using educational measurement in educational research. By comparison, examination research in China started late, and educational and psychological measurement gained attention only after the reform and opening up. In the latter half of the last century, thanks to the development of science and technology, and especially the spread of computers, examination technology in the West developed rapidly. In addition to classical test theory, item response theory and generalizability theory came into wide use. At the same time, many new testing methods appeared, such as computerized adaptive testing (CAT), the electronic rater (E-rater), and authentic assessment. Examination technology is advancing with each passing day. All of this shows that examination research in China still has a great deal of ground to cover.
In our country, thousands of exams of various types are held every year. These exams are large in scale, and their benefits are considerable. Yet few of them are truly standardized. The author believes the reasons can be boiled down to two aspects, concept and system:
First, from the perspective of concepts, the general public in China has a "worship" mentality toward examinations and generally believes that examinations are sacred and fair. If you miss an opportunity because your test results are worse than someone else's, you are expected to accept it. This mentality may stem from obedience to authority, or from being "blinded" by the form of the examination. In any case, the form of a test can at best guarantee its reliability. It cannot guarantee its validity, that is, whether the test really measures the ability that its designers intend to measure. This mentality means that people almost never question the scientific soundness and validity of an exam. The form of the examination lets people see only formal equality and fairness while ignoring the inequality and unfairness that may actually exist.
Second, from the perspective of the system, most examinations in China are organized by government departments or by organizations affiliated with them. These departments or organizations can therefore "cultivate" or "divide" markets through administrative power. In this situation, whether an examination survives or thrives depends not on its own quality but largely on administrative orders. The quality of examinations also lacks effective supervision and often rests only on the sense of responsibility of the staff who develop them. Ultimately, under the general trend of China's socialist market economy, whether the examination industry is opened to the market, so that examinations withstand the test of practice and improve through competition with one another, is an important factor in its development. These two aspects also influence each other.
Examinations organized by government departments gain "authority" from their "official" status, so the public does not doubt their scientific soundness and effectiveness. The public's unwavering conviction in turn weakens the supervision of examination quality, which leads examination organizations, without intending to, to relax their pursuit of quality. Such a vicious circle inevitably hampers the development of standardized examinations. To break it, we must first vigorously publicize the concept of standardized examinations and improve the general public's understanding of examinations, and then promote the gradual reform of the examination system. In this way, we can continuously improve the implementation of standardized examinations and truly achieve the purpose of using examinations for talent assessment.
As a means of talent assessment, standardized examinations are playing an increasingly important role in China. However, they are only one of the means of talent assessment. We should therefore neither underestimate nor overestimate their role. Neither lavishing praise on examinations nor condemning them outright is a desirable attitude. The full and genuine implementation of standardized examinations still has a long process of formation and development ahead of it. [1]
