What Is Reliability?

Reliability refers to the consistency, stability, and dependability of test results. Internal consistency is generally used to indicate the reliability of a test. The higher the reliability coefficient, the more consistent, stable, and dependable the results. Systematic error has little effect on reliability, because it always shifts the measured value in the same way and therefore does not cause inconsistency. Random error, by contrast, causes inconsistency and so reduces reliability.

Name: Reliability
Definition: Consistency and stability of test results
Applied discipline: Psychology

Reliability definition

Reliability refers to the degree to which the same method yields consistent results when it is applied repeatedly to the same object. In other words, reliability is the dependability of the measured data.
For example, suppose the same person is asked Question 1 of Part I of the Questionnaire on Library Utilization and Satisfaction three times, 3 days apart. If the respondent chooses A the first time, C the second time, and D the third time, the reliability of the survey result for this question is low, because the answers differ widely. If the respondent chooses the same answer, or answers that differ only slightly, all three times, the reliability of the survey result is higher, provided that systematic error has been excluded.
[1]

Reliability formula

Let $T$ denote the true value, $S$ the bias, that is, the systematic error, $R$ the random error of the measurement, and $O$ the measurement result. Then:
$$O = T + S + R$$
In this decomposition, $T$ is an abstract variable, a latent value that has to be estimated. The systematic error $S$ can be avoided or reduced by suitable means, whereas the random error $R$ is unavoidable.
If the measurement result $O$ is consistent with the true value $T$, or differs from it only slightly, the measurement is said to be "reliable" or "trustworthy"; otherwise it is said to be, to some extent, "unreliable" or "untrustworthy".
Because the systematic error $S$ is difficult to separate out, and because questionnaire design is expected to avoid systematic error in the first place, reliability analysis usually considers only random error. That is, the reliability of a questionnaire is analyzed on the basis of the following formula:
$$O = T + R$$
[1]

Reliability coefficient

Overview of the reliability coefficient

The random error $R$ is generally assumed to have an expected value (mean) of 0 and to be independent of the true value $T$.
Because $T$ and $R$ are independent of each other, the following holds:
$$\sigma_O^2 = \sigma_T^2 + \sigma_R^2$$
The variance of the measurements, $\sigma_O^2$, thus equals the sum of the variance of the true value, $\sigma_T^2$, and the variance of the error, $\sigma_R^2$, and the relative size of $\sigma_T^2$ and $\sigma_R^2$ can be used to describe the credibility of the survey results.
The larger $\sigma_R^2$ is, the larger the random error of the measurement and the lower its reliability. The reliability coefficient $r_{tt}$ is used to express the size of the reliability.

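Definition of the reliability coefficient

The reliability coefficient $r_{tt}$ is defined as the proportion of the variance of the measurements, $\sigma_O^2$, that is accounted for by the variance of the true value, $\sigma_T^2$:
$$r_{tt} = \frac{\sigma_T^2}{\sigma_O^2}$$
Or, equivalently, in terms of the variance of the random error, $\sigma_R^2$:
$$r_{tt} = 1 - \frac{\sigma_R^2}{\sigma_O^2}$$
The larger the value of $r_{tt}$, the greater the credibility of the questionnaire.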
[1]
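The two forms of the reliability coefficient can be checked numerically. The sketch below is only an illustration under stated assumptions: it simulates observed scores as a true score plus an independent, zero-mean random error, with hypothetical standard deviations of 10 and 5, so the theoretical coefficient is 100 / (100 + 25) = 0.8. In real surveys the true scores are not observable, so this calculation is not available directly.

```python
import random
import statistics


def simulate_reliability(n=10_000, sd_true=10.0, sd_error=5.0, seed=0):
    """Simulate observed = true + error and return both forms of the coefficient.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, sd_true) for _ in range(n)]
    errors = [rng.gauss(0.0, sd_error) for _ in range(n)]  # zero-mean random error
    observed = [t + r for t, r in zip(true_scores, errors)]

    var_true = statistics.pvariance(true_scores)
    var_error = statistics.pvariance(errors)
    var_observed = statistics.pvariance(observed)

    ratio_form = var_true / var_observed               # true variance / observed variance
    complement_form = 1.0 - var_error / var_observed   # 1 - error variance / observed variance
    return ratio_form, complement_form


ratio_form, complement_form = simulate_reliability()
print(f"true/observed variance ratio: {ratio_form:.3f}")
print(f"1 - error/observed variance:  {complement_form:.3f}")
```

In a finite sample the two values differ slightly, because the sample covariance between true scores and errors is not exactly zero; both should lie close to the theoretical value.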

Limitations of the reliability coefficient

As the formula for the reliability coefficient shows, the calculation involves the error variance, so an estimate of reliability can only be obtained from repeated measurements of the observed value. However, repeated retesting brings memory effects and practice effects and can irritate respondents, so it is difficult to carry out in survey research.
[1]

Estimation of reliability

Before the reliability of a questionnaire is estimated, the subjective or objective answer options in the questionnaire need to be converted into numerical form using an appropriate scale (such as a Likert scale), and the questionnaire is then scored on that basis (including scores for individual items, group scores for related items, and total scores).
There are four common methods of reliability analysis: retest reliability, parallel-forms reliability, split-half reliability, and the α reliability coefficient (the latter two can be classified as internal consistency reliability).

Retest reliability

Retest reliability, also called test-retest reliability, is obtained by administering the same questionnaire to the same group of respondents at two different points in time and computing the correlation between the two sets of results. Retest reliability reflects the effect of random errors.
The source of error examined by retest reliability is random variation over time, so the length of the retest interval needs attention. For personality tests, an interval of between two weeks and six months is appropriate.
Two further points deserve attention when evaluating retest reliability. First, retest reliability generally reflects only changes caused by random factors, not long-term changes in the respondents' behavior. Second, different behaviors are affected by random error to different degrees.
Disadvantages: Retest reliability faces a dilemma. Shortening the interval between the two tests makes it easier for respondents to recall the test content, while lengthening it makes respondents more likely to change under outside influences. [2]
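As a minimal sketch of how a retest coefficient might be computed, the example below correlates the total scores of the same respondents on two administrations. The scores are hypothetical, and the Pearson product-moment correlation is used here as one common choice of correlation coefficient.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scores of 8 respondents on two administrations.
first_test = [12, 15, 9, 20, 17, 11, 14, 18]
second_test = [13, 14, 10, 19, 18, 10, 15, 17]

retest_r = pearson(first_test, second_test)
print(f"retest reliability: {retest_r:.3f}")
```

A coefficient close to 1 indicates that respondents kept nearly the same relative standing on both occasions.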

Parallel-forms reliability

Parallel-forms reliability, also known as the equivalence coefficient, is a type of equivalence reliability. It refers to the degree to which questionnaire results vary between two very similar forms: the same group of respondents is surveyed with two questionnaires that are equivalent in content but use different items, and the degree of correlation between the two sets of data is then compared.
It involves more work than retest reliability, because two equivalent forms of the same measurement tool (questionnaire, psychological scale, etc.) have to be constructed, and the two forms must contain items that match in number, type, content, and difficulty. To evaluate parallel-forms reliability, both forms are administered to the same group of subjects, and the correlation coefficient between the scores on the two forms is estimated. The larger the correlation coefficient, the smaller the variation caused by differences in the composition of the two forms. This differs from retest reliability, which examines variation over time: here the correlation coefficient reflects the degree of equivalence of the measurement scores, which is why parallel-forms reliability is also called equivalence reliability.
The main advantages of parallel-forms reliability are that it avoids some problems of retest reliability, such as memory and practice effects; that it is suitable for long-term follow-up research or for investigating the impact of certain interfering variables on test results; and that it reduces the possibility of coaching or cheating. Its limitations are that, if the measured behavior is susceptible to practice, parallel forms can only reduce this effect, not eliminate it; that the nature of some tests may change with repetition; and that for some tests it is difficult to construct a suitable parallel form. [2]

Internal consistency reliability

Internal consistency reliability mainly reflects the relationships among the items within a test, examining whether every item measures the same content or trait. It is divided into split-half reliability and homogeneity reliability.
1. Split-half reliability refers to the degree of variation between the results of the two halves of a questionnaire within a single survey. The reliability coefficient is obtained by dividing the test into two halves and calculating the correlation between the scores on the two halves. Because a longer test generally has a higher reliability coefficient, the half-test correlation underestimates the reliability of the full test and has to be corrected. The correction formula is the Spearman-Brown formula, $r_{tt} = 2r_{hh} / (1 + r_{hh})$, where $r_{hh}$ is the correlation between the two halves and $r_{tt}$ is the estimated reliability of the full test.
The Spearman-Brown formula assumes that the scores on the two halves have equal variances. When this assumption does not hold, either the Flanagan formula or the Rulon formula can be used to obtain the reliability coefficient of the test directly.
2. Homogeneity reliability refers to the extent to which all items in the test measure the same content. When homogeneity reliability is low, the test is actually heterogeneous, that is, it measures more than one trait, even if every item appears to measure the same trait. Homogeneity analysis is similar to the internal consistency analysis carried out in item analysis. Formulas for calculating homogeneity reliability include (1) the Kuder-Richardson formulas and (2) Cronbach's α coefficient. For some complex, heterogeneous psychological variables a single homogeneous test is not feasible, so several relatively heterogeneous subtests are often used instead.
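The internal consistency estimates named above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the response matrix is hypothetical, the odd-even split is just one possible way to halve a test, and the Flanagan and Cronbach formulas are written in their standard textbook forms.

```python
import math
import statistics


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_hh):
    """Correct a half-test correlation r_hh to full-test length."""
    return 2.0 * r_hh / (1.0 + r_hh)


def flanagan(half_a, half_b):
    """Flanagan formula: does not require the two halves to have equal variances."""
    total = [a + b for a, b in zip(half_a, half_b)]
    half_vars = statistics.variance(half_a) + statistics.variance(half_b)
    return 2.0 * (1.0 - half_vars / statistics.variance(total))


def cronbach_alpha(rows):
    """Cronbach's alpha for a matrix of respondents (rows) by items (columns)."""
    k = len(rows[0])
    item_vars = [statistics.variance([row[j] for row in rows]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical Likert responses: 6 respondents (rows) x 4 items (columns).
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
    [4, 4, 5, 5],
]

# Odd-even split: items 1 and 3 form one half, items 2 and 4 the other.
odd_half = [row[0] + row[2] for row in responses]
even_half = [row[1] + row[3] for row in responses]

r_hh = pearson(odd_half, even_half)
print(f"half-test correlation:    {r_hh:.3f}")
print(f"Spearman-Brown corrected: {spearman_brown(r_hh):.3f}")
print(f"Flanagan coefficient:     {flanagan(odd_half, even_half):.3f}")
print(f"Cronbach's alpha:         {cronbach_alpha(responses):.3f}")
```

When the two halves are treated as a two-item test, the Flanagan coefficient coincides with Cronbach's alpha, which is one way to see that both belong to the same family of internal consistency estimates.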

Scorer reliability

Scorer reliability refers to the consistency of the scores given when different raters rate the same object. The simplest way to estimate it is to select several answer sheets at random, have two raters score them independently, and then compute the correlation coefficient between the two sets of scores. This correlation coefficient can be calculated with the product-moment correlation method or the Spearman rank correlation method. [2]
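The two-rater procedure can be sketched as follows. The scores are hypothetical, and the Spearman coefficient is computed here as the Pearson correlation of average ranks, which is one standard way to handle tied scores.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores given by two independent raters to 5 answer sheets.
rater_a = [85, 70, 92, 60, 78]
rater_b = [80, 72, 95, 65, 70]

print(f"product-moment correlation: {pearson(rater_a, rater_b):.3f}")
print(f"Spearman rank correlation:  {spearman(rater_a, rater_b):.3f}")
```

The rank-based coefficient depends only on how the raters order the answer sheets, so it is less sensitive to one rater using a systematically wider or narrower score range.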

The relationship between reliability and validity

The reliability and validity of a questionnaire are clearly distinct, but they are also interrelated and constrain each other. Reliability mainly answers questions about the consistency, stability, and dependability of measurement results; validity mainly answers questions about their validity and correctness.
The relationship between validity and reliability can be understood through the formula for the measured value, $O = T + S + R$, where $O$ is the observed value, $T$ the true value, $S$ the systematic error, and $R$ the random error. If the measurement is completely valid, that is, $O = T$, $S = 0$, and $R = 0$, then it must also be completely reliable. If the reliability of the scale is insufficient, the scale cannot be completely valid, because at a minimum $O = T + R$. If the scale is completely reliable, it may or may not be completely valid, because systematic error may still be present. Although a lack of reliability necessarily implies a lack of validity, the level of reliability does not indicate the level of validity. Reliability is a necessary condition for validity, but not a sufficient one. From a theoretical perspective, a scale should have sufficient validity and reliability; from a practical point of view, a good scale should also be practical. Practicality refers to the economy, convenience, and interpretability of the scale.
In general, reliability is a necessary condition for validity, that is, validity must be built on reliability; but a measurement that lacks validity is meaningless even if its reliability is high. The relationship between reliability and validity takes the following forms:
Reliable and valid
The questionnaire accurately reflects the true attitudes of the people surveyed, and its questions are closely related to the survey objectives. This situation is shown in Figure 8-8 (a). The solid point at (x, y) in the figure represents the real state of the phenomenon to be measured, and the remaining points represent the measurement results obtained through the survey. If the survey results truly reflect the surveyed objects and the measurement error is small, the results of the questionnaire survey are reliable and valid.
Reliable but invalid
Although the results of the questionnaire survey accurately reflect the true attitudes of the people surveyed, the questions are only weakly related to the real purpose of the survey and do not match its goals, as shown in Figure 8-8 (b). In this situation the results obtained are reliable, but an error may have been made somewhere. For example, the wording of a question may have led all respondents to misunderstand it in the same way, producing a systematic bias.
Unreliable and invalid
In this case the survey results are widely scattered, and it is difficult to draw valid conclusions from the questionnaire, as shown in Figure 8-8 (c). This is the type that should be avoided in measurement.
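The three cases can be imitated with a small simulation of repeated measurements of a single object, written as observed value = true value + systematic error + random error. This is only a sketch under assumed numbers: the true value of 50, the biases, and the error spreads are all hypothetical. The mean deviation from the true value stands in for (in)validity, and the spread of the repeated measurements stands in for (un)reliability.

```python
import random
import statistics

TRUE_VALUE = 50.0  # hypothetical true value of the measured phenomenon


def measure(systematic_error, random_sd, n=2000, seed=0):
    """Return (bias, spread) of n simulated measurements of one object."""
    rng = random.Random(seed)
    observed = [
        TRUE_VALUE + systematic_error + rng.gauss(0.0, random_sd)
        for _ in range(n)
    ]
    bias = sum(observed) / n - TRUE_VALUE   # how far results sit from the truth
    spread = statistics.stdev(observed)     # how scattered the results are
    return bias, spread


cases = {
    "reliable and valid": measure(systematic_error=0.0, random_sd=1.0, seed=1),
    "reliable but invalid": measure(systematic_error=10.0, random_sd=1.0, seed=2),
    "unreliable and invalid": measure(systematic_error=5.0, random_sd=8.0, seed=3),
}

for name, (bias, spread) in cases.items():
    print(f"{name:24s} bias = {bias:6.2f}   spread = {spread:5.2f}")
```

In the second case the results cluster tightly but around the wrong value, which is why a high reliability coefficient alone says nothing about validity.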
