What Is Reliability?
Reliability refers to the consistency, stability, and dependability of test results. Internal consistency is generally used to indicate a test's reliability. The higher the reliability coefficient, the more consistent, stable, and dependable the results. Systematic error has little effect on reliability, because it always affects the measured value in the same way and therefore does not cause inconsistency. Random error, by contrast, causes inconsistency and so reduces reliability.
- Name: Reliability
- Definition: Consistency and stability of test results
- Applied discipline: Psychology
Reliability definition
- Reliability is the degree to which results are consistent when the same method is used to measure the same object repeatedly. In other words, reliability describes how dependable the measured data are.
- For example, take Question 1 of Part I of the Questionnaire on Library Utilization and Satisfaction: if the same person is asked the same question twice, 3 days apart, reliability concerns how consistent the two answers are.
- [1]
Reliability formula
- If used
- If the measurement results
- Due to system error
- [1]
Reliability factor
Reliability Overview
- For random errors
Reliability definition
- Reliability factor
- [1]
Reliability disadvantage
- From the above calculation formula of reliability, since the sum of squared errors is involved, it is necessary to repeatedly measure
- [1]
Estimation of reliability
- Before the reliability of a questionnaire is estimated, the subjective or objective answer options in the questionnaire must be converted into numerical form using an appropriate scale (such as a Likert scale). The questionnaire is then scored on this basis, including individual item scores, group scores for related topics, and total scores.
- Common methods of reliability analysis include retest reliability, parallel-forms (replica) reliability, and split-half reliability, among others.
Retest reliability
- Retest reliability, also called test-retest reliability, is obtained by administering the same questionnaire to the same group of respondents at two different points in time and computing the correlation between the two sets of results. Retest reliability reflects the effect of random error.
- The source of error examined by retest reliability is the random effect of time. When evaluating retest reliability, attention must be paid to the length of the interval between the two tests. For personality tests, an interval of two weeks to six months is appropriate.
- Two further points matter when evaluating retest reliability. First, it generally reflects only changes caused by random factors, not long-term changes in the subjects' behavior. Second, different behaviors are affected by random error to different degrees.
- Disadvantages: retest reliability faces a dilemma. Shortening the interval between the two tests makes it easier for subjects to recall the test content, while lengthening it makes subjects more likely to change because of external influences. [2]
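As a minimal sketch of how a retest reliability coefficient might be computed, the example below correlates the total scores from two administrations of the same questionnaire. The score lists are invented for illustration, and the use of the Pearson product-moment correlation is an assumption, since the text does not name a specific coefficient for this step.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores of six respondents on two administrations.
first_test = [12, 15, 9, 20, 17, 11]
second_test = [13, 14, 10, 19, 18, 10]

retest_reliability = pearson_r(first_test, second_test)
print(round(retest_reliability, 3))
```

A coefficient close to 1 indicates that respondents keep roughly the same relative standing across the two administrations.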
Parallel-forms reliability
- Parallel-forms reliability, also known as the equivalence coefficient, is a type of equivalence reliability. It refers to the degree to which the results of a questionnaire vary when compared with those of another, very similar questionnaire: the same group of people is surveyed with two questionnaires that have equivalent content but different items, and the correlation between the two sets of data is then computed.
- It involves more work than retest reliability, because two equivalent forms of the same measurement tool (questionnaire, psychological scale, etc.) must be constructed, and the two forms must match in the number, type, content, and difficulty of their items. To evaluate parallel-forms reliability, both forms are administered to the same group of subjects and the correlation coefficient between the scores on the two forms is estimated. The larger the correlation coefficient, the smaller the variation caused by differences in the composition of the two forms. This differs from retest reliability, which reflects variation over time: here the correlation coefficient reflects the degree of equivalence of the measurement scores, which is why parallel-forms reliability is also called equivalence reliability.
- The main advantages of parallel-forms reliability are that it avoids some problems of retest reliability, such as memory and practice effects; it is suitable for long-term follow-up studies or for investigating the effect of certain interfering variables on test results; and it reduces the possibility of coaching or cheating. Its limitations are that, if the measured behavior is susceptible to practice, parallel forms can only reduce this effect, not eliminate it; the nature of some tests may change with repetition; and for some tests it is difficult to construct a suitable parallel form. [2]
Internal consistency reliability
- Internal consistency reliability mainly reflects the relationships among the items within a test.
- 1. Split-half reliability refers to the degree of variation between the results of the two halves of a questionnaire in a single survey. The reliability coefficient is obtained by dividing the test into two halves and calculating the correlation between the scores on the two halves. Because a longer test yields a higher reliability coefficient, the half-test correlation must be corrected to estimate the reliability of the full test. The correction is the Spearman-Brown formula, an empirical formula for correcting split-half reliability: $r_{tt} = 2r_{hh} / (1 + r_{hh})$, where $r_{hh}$ is the correlation between the two halves and $r_{tt}$ is the estimated reliability of the full-length test. When the test is divided into two halves, each half is 0.5 times the length of the full test.
- The formula assumes that the scores on the two halves have equal variances. When this assumption does not hold, either the Flanagan formula or the Rulon formula can be used to obtain the reliability coefficient of the test directly.
- 2. Homogeneity reliability refers to the extent to which all items in a test measure the same content. When homogeneity reliability is low, the test is actually heterogeneous even if every item appears to measure the same trait; that is, the test measures more than one trait. Homogeneity analysis is similar to the internal consistency analysis performed in item analysis. Formulas for calculating homogeneity reliability include (1) the Kuder-Richardson formula and (2) Cronbach's alpha coefficient. For complex, heterogeneous psychological variables, a single homogeneous test is not feasible, so several relatively heterogeneous subtests are often used instead.
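The internal consistency estimates named above can be illustrated with a short Python sketch. It computes the half-test correlation on an odd-even split, applies the Spearman-Brown correction, and also evaluates the Flanagan formula and Cronbach's alpha in their standard textbook forms. The response matrix, the odd-even split, and the choice of population variances are assumptions made for this example, not details taken from the text.

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def spearman_brown(r_hh):
    """Step a half-test correlation r_hh up to full-test reliability r_tt."""
    return 2 * r_hh / (1 + r_hh)


def flanagan(half_a, half_b):
    """Flanagan split-half coefficient; does not assume equal half variances."""
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 2 * (1 - (pvariance(half_a) + pvariance(half_b)) / pvariance(totals))


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(rows[0])
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 6 respondents (rows) by 6 items (columns), each 1-5.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 3, 2, 2],
    [5, 5, 4, 5, 5, 5],
    [3, 2, 3, 3, 2, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 3, 4, 4, 4, 3],
]

odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

r_hh = pearson_r(odd_half, even_half)
print("half-test correlation:", round(r_hh, 3))
print("Spearman-Brown r_tt:  ", round(spearman_brown(r_hh), 3))
print("Flanagan coefficient: ", round(flanagan(odd_half, even_half), 3))
print("Cronbach's alpha:     ", round(cronbach_alpha(responses), 3))
```

For items scored 0 or 1, Cronbach's alpha computed with population variances coincides with the Kuder-Richardson formula 20, so the same function covers that case.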
Scorer reliability
- Scorer reliability refers to the consistency of the ratings that different raters give to the same object. The simplest way to estimate it is to select several answer sheets at random, have two independent raters score them, and then compute the correlation coefficient between the two sets of scores. This coefficient can be calculated with the product-moment correlation method or the Spearman rank correlation method. [2]
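The two correlation methods mentioned above can be sketched as follows. The rater scores are invented for illustration, and the helper names `average_ranks` and `spearman_rho` are hypothetical; the rank correlation is computed here as the Pearson correlation of the ranks, with tied scores sharing their average rank.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical scores given by two independent raters to five answer sheets.
rater_a = [85, 70, 92, 60, 78]
rater_b = [80, 75, 95, 65, 72]

print("product-moment r:", round(pearson_r(rater_a, rater_b), 3))
print("Spearman rho:    ", round(spearman_rho(rater_a, rater_b), 3))
```

The rank method is the natural choice when the raters' scores are only ordinal, since it depends on the ordering of the answer sheets rather than on the size of the score differences.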
The relationship between reliability and validity
- Reliability and validity are clearly distinct, but they are also interrelated and mutually constraining. Reliability mainly concerns the consistency, stability, and dependability of measurement results; validity mainly concerns whether the results are valid and correct.
- The relationship between validity and reliability can be understood through the formula for the measured value, O = T + S + R, where O is the observed value, T the true value, S the systematic error, and R the random error. If a measurement is completely valid, that is, O = T with S = 0 and R = 0, it must also be completely reliable. If a scale is not fully reliable, it cannot be completely valid, because at best O = T + R. If a scale is completely reliable, it may or may not be completely valid, because systematic error may still be present. Thus, although a lack of reliability necessarily implies a lack of validity, the level of reliability does not indicate the level of validity: reliability is a necessary condition for validity, but not a sufficient one. From a theoretical perspective, a scale should have sufficient validity and reliability; from a practical point of view, a good scale should also be practical. Practicality refers to the economy, convenience, and interpretability of the scale.
- In general, reliability is a necessary condition for validity; that is, validity must rest on reliability. However, a measurement that lacks validity is meaningless even if its reliability is high. The relationship between reliability and validity takes the following forms:
- Reliable and valid
- The questionnaire accurately reflects the true attitudes of the people surveyed, and its questions are closely related to the survey objectives. This situation is shown in Figure 8-8 (a). The solid point at (x, y) in the figure represents the true state of the phenomenon being measured, and the remaining points represent the measurement results obtained through the survey. If the survey results truly reflect the surveyed objects and the measurement error is small, the results of the questionnaire survey are reliable and valid.
- Reliable but invalid
- The results of the questionnaire survey consistently reflect the attitudes of the people surveyed, but the questions are only weakly related to the real purpose of the survey and do not match its goals, as shown in Figure 8-8 (b). In this situation the results are reliable, but an error has been made somewhere. For example, the wording of the questionnaire may have led all respondents to misunderstand a question in the same way, producing a systematic bias.
- Unreliable and invalid
- In this case the survey results are widely scattered, and it is difficult to draw valid conclusions from the questionnaire. This is the type that should be avoided in measurement, as shown in Figure 8-8 (c).
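The three cases above can be illustrated with a small simulation built on the decomposition of an observed value into a true value, a systematic error, and a random error. This is only a sketch under stated assumptions: the normal distributions, the error sizes, the sample size, and the function name `simulate` are all chosen for illustration. Reliability is approximated by the correlation between two simulated measurements of the same subjects, and the lack of validity by the average gap between observed and true values.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def simulate(systematic, random_sd, seed=0, n=500):
    """Measure n simulated subjects twice, each time as true + systematic + random.

    Returns (reliability, bias): the correlation between the two measurements
    and the mean difference between the first measurement and the true values.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]  # assumed true values
    first = [t + systematic + rng.gauss(0, random_sd) for t in true_scores]
    second = [t + systematic + rng.gauss(0, random_sd) for t in true_scores]
    reliability = pearson_r(first, second)
    bias = sum(o - t for o, t in zip(first, true_scores)) / n
    return reliability, bias


# Illustrative error settings for the three cases (assumed values).
cases = {
    "reliable and valid": simulate(systematic=0.0, random_sd=2.0),
    "reliable but invalid": simulate(systematic=15.0, random_sd=2.0),
    "unreliable and invalid": simulate(systematic=0.0, random_sd=30.0),
}

for name, (reliability, bias) in cases.items():
    print(f"{name}: reliability={reliability:.2f}, bias={bias:.2f}")
```

In this sketch a constant systematic error shifts every observed value without lowering the correlation between repeated measurements, whereas a large random error scatters the results and lowers it.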