What Is Normal Probability Distribution?
The normal distribution, also known as the Gaussian distribution, was first obtained by Abraham de Moivre as an asymptotic formula for the binomial distribution. C. F. Gauss derived it from another angle when studying measurement errors, and P. S. Laplace and Gauss studied its properties. It is a probability distribution of great importance in mathematics, physics, and engineering, and it has a significant influence on many areas of statistics.
The concept of the normal distribution was first proposed by the French-born mathematician Abraham de Moivre in 1733, but because the German mathematician Gauss was the first to apply it to astronomical research, it is also called the Gaussian distribution. Gauss's work had such influence that the distribution came to carry his name, and it is also for this work that later generations credited him with the method of least squares. The former German 10-mark banknote, which bore Gauss's portrait, was also printed with the normal density curve.
Due to the general normal population, its image is not necessarily about y
Some properties of the normal distribution: [2]
Concept and characteristics:
First, the concept of normal distribution
A histogram drawn from the frequency table of such data, as shown in the figure, has its peak in the middle, with the left and right sides roughly symmetrical.
Overview of Normal Distribution
1. Estimating the frequency distribution: For a variable that follows the normal distribution, the proportion of values falling in any range can be estimated from the formula once the mean and standard deviation are known. [3]
2. Formulate the reference value range
(1) The normal distribution method is applicable to indicators that follow a normal (or nearly normal) distribution and to indicators that follow a normal distribution after transformation.
(2) The percentile method is often used for indicators with a skewed distribution. The one-sided and two-sided cutoffs for both methods, listed in Table 3-1, should be mastered.
3. Quality control: To control measurement (or experimental) error, upper and lower warning values and upper and lower control values are often set. The basis for this is that, under normal circumstances, measurement (or experimental) error follows a normal distribution.
4. The normal distribution is the theoretical basis of many statistical methods. Methods such as the t test, analysis of variance, correlation, and regression analysis require that the indicators analyzed follow a normal distribution. Many other statistical methods do not require this, but their statistics are approximately normal for large samples, so large-sample inference with these methods also rests on the normal distribution.
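The quality-control use in item 3 can be sketched roughly as follows. The text does not state which multiples of the standard deviation define the limits, so this sketch assumes the common laboratory convention of mean ± 2 SD for warning limits and mean ± 3 SD for control limits. The function names and the baseline readings are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev

def control_limits(values, warn_k=2.0, control_k=3.0):
    """Compute warning and control limits from baseline measurements.

    warn_k and control_k are assumed multipliers (2 SD and 3 SD),
    not values taken from the article.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "mean": m,
        "sd": s,
        "warning": (m - warn_k * s, m + warn_k * s),
        "control": (m - control_k * s, m + control_k * s),
    }

def classify(x, limits):
    """Label a new reading relative to the warning and control limits."""
    low_c, high_c = limits["control"]
    low_w, high_w = limits["warning"]
    if x < low_c or x > high_c:
        return "out of control"
    if x < low_w or x > high_w:
        return "warning"
    return "in control"

# Made-up baseline readings of a control sample
baseline = [5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0, 5.0]
limits = control_limits(baseline)

for reading in (5.05, 5.30, 5.50):
    print(reading, classify(reading, limits))
```

If the errors really are normally distributed, about 95% of readings should fall inside the assumed warning limits and about 99.7% inside the assumed control limits, which is why a reading outside them is treated as a signal.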
Frequency distribution
Example 1.10 In 1993, a sample survey of the height (cm) of 100 18-year-old male college students in a certain place gave a mean of X = 172.70 cm and a standard deviation of s = 4.01 cm. (1) Estimate the percentage of 18-year-old male college students in that place whose height is below 168 cm. (2) Find the actual percentage of these students within the ranges X ± 1s, X ± 1.96s, and X ± 2.58s, and compare it with the theoretical percentage.
In this example, μ and σ are unknown, but the sample size n is large. According to formula (3.1), the sample mean X and standard deviation S are used in place of μ and σ, giving u = (168 - 172.70) / 4.01 = -1.17. In the attached table of areas under the standard normal curve, find -1.1 in the left column and 0.07 in the top row; the entry where they intersect is 0.1210 = 12.10%. So about 12.10% of the 18-year-old male college students in the area are shorter than 168 cm. The other results are shown in Table 3.
Table 3 Actual distribution and theoretical distribution of height of 100 18-year-old male college students
Range (X ± s) | Height range (cm) | Actual count (people) | Actual percentage (%) | Theoretical percentage (%) |
X ± 1s | 168.69 to 176.71 | 67 | 67.00 | 68.27 |
X ± 1.96s | 164.84 to 180.56 | 95 | 95.00 | 95.00 |
X ± 2.58s | 162.35 to 183.05 | 99 | 99.00 | 99.00 |
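The calculations in Example 1.10 can be checked with a short sketch using Python's standard library. It treats the sample mean and standard deviation as the parameters of a normal distribution, as the example does. The variable and function names are illustrative. Because the code uses the unrounded u value rather than the table entry for -1.17, the share below 168 cm comes out slightly under the 12.10% read from the table.

```python
from statistics import NormalDist

mean_height = 172.70  # sample mean X (cm), from the example
sd_height = 4.01      # sample standard deviation s (cm), from the example
heights = NormalDist(mu=mean_height, sigma=sd_height)

# Part (1): estimated share of students shorter than 168 cm
u = (168 - mean_height) / sd_height
below_168 = heights.cdf(168)
print(f"u = {u:.2f}, share below 168 cm = {below_168:.2%}")

def coverage(k):
    """Return the range X ± k*s and the theoretical share inside it."""
    lower = mean_height - k * sd_height
    upper = mean_height + k * sd_height
    share = heights.cdf(upper) - heights.cdf(lower)
    return lower, upper, share

# Part (2): theoretical percentages for the three ranges in Table 3
for k in (1, 1.96, 2.58):
    lower, upper, share = coverage(k)
    print(f"X ± {k}s: {lower:.2f} to {upper:.2f} cm, theoretical {share:.2%}")
```

The ranges and theoretical percentages printed here can be compared directly with the rows of Table 3.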
Research on Normal Distribution Comprehensive Quality
The statistical regularities of educational statistics hold that students' intellectual level, including learning ability and practical ability, is normally distributed, so the scores of a normal examination should basically follow the normal distribution. Examination analysis calls for drawing a histogram of the score distribution and using "high in the middle, low at both ends" to judge how closely the results conform to the normal distribution. The evaluation criteria are as follows: if the histogram of scores is basically a normal curve, the result is rated good; if it is slightly positively (or negatively) skewed, medium; if it is severely skewed or irregular, poor.
From the standpoint of probability and statistics, the statement that "test scores should basically follow the normal distribution" is correct. However, people and things differ in nature, and education can interfere with what would otherwise be "random", so judging test results by the shape of the curve or histogram alone is biased. Many education experts (such as Gu Lingyuan in Shanghai and Bloom in the United States) have demonstrated through practice that education can achieve a great deal: most students can pass, most students can earn high scores, and the resulting score curve is a skewed one. Even so, the "high in the middle, low at both ends" standard has long held sway, limiting what teachers attempt and undermining confidence that most students can learn well. This is a serious misunderstanding. A normal curve has an axis of symmetry. The score (or score band) with the most candidates corresponds to the highest point of the curve, its vertex, and the line joining that score on the horizontal axis to the vertex is the axis of symmetry of the normal curve. The score with the most candidates is the peak. In practice the score curve or histogram is rarely symmetrical, so it is more appropriate to call this line the peak line.
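The text judges skew by eye from a histogram and does not name a statistic. One common way to put a number on it is the moment-based sample skewness coefficient, sketched below as an assumption rather than as the method used in examination analysis. The score lists are made up for illustration.

```python
from statistics import fmean

def sample_skewness(scores):
    """Moment-based skewness: third central moment over the 1.5 power of the second.

    Near 0 for a symmetric distribution, negative when the long tail is on
    the low-score side, positive when it is on the high-score side.
    Assumes the scores are not all identical (otherwise the variance is 0).
    """
    m = fmean(scores)
    n = len(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5

symmetric_class = [60, 70, 80, 90, 100]                     # hypothetical scores
high_scoring_class = [55, 70, 80, 85, 88, 90, 92, 94, 95, 97]  # hypothetical scores

print(sample_skewness(symmetric_class))
print(sample_skewness(high_scoring_class))
```

A class in which most students score high, as in the second list, yields a negative coefficient: the scores pile up near the top and the tail stretches toward low scores, which is the skewed pattern the passage describes as a sign of effective teaching rather than a defect.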
Medical reference values based on the normal distribution
Some medical phenomena, such as the height of a homogeneous population, red blood cell count, hemoglobin level, and random experimental error, follow a normal or nearly normal distribution. Some other indicators (variables) follow a skewed distribution, but after a data transformation the new variable may follow a normal or nearly normal distribution and can then be handled according to the rules of the normal distribution. An indicator that follows the normal distribution after logarithmic transformation is said to follow the lognormal distribution.
The range of medical reference values, also called the range of medical normal values, is the range within which anatomical, physiological, biochemical, and similar indicators of so-called "normal people" fluctuate. To establish such a range, first select a group of "normal people" with a sufficient sample size. "Normal people" here does not mean "healthy people"; it means a population from which diseases and related factors that affect the homogeneity of the indicator under study have been excluded. Second, choose an appropriate percentage according to the purpose of the study and the intended use, such as 80%, 90%, 95%, or 99%; 95% is the most common. Whether to use one-sided or two-sided limits depends on how the indicator is used in practice. A white blood cell count is abnormal when it is either too high or too low, so two-sided limits are needed. A liver function indicator that is abnormal only when too high needs a one-sided upper limit, and lung capacity, which is abnormal only when too low, needs a one-sided lower limit. In addition, an appropriate calculation method should be chosen according to the distribution of the data. Common methods are:
(1) Normal distribution method: It is applicable to the data of normal or near normal distribution.
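Two-sided bounds: $X \pm u S$; one-sided upper bound: $X + u S$, or one-sided lower bound: $X - u S$, where $X$ is the sample mean, $S$ is the sample standard deviation, and $u$ is the standard normal value for the chosen percentage.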
(2) Log-normal distribution method: applicable to log-normal distribution data.
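Two-sided bounds: $\lg^{-1}\left[X_{\lg x} \pm u S_{\lg x}\right]$; one-sided upper bound: $\lg^{-1}\left[X_{\lg x} + u S_{\lg x}\right]$, or one-sided lower bound: $\lg^{-1}\left[X_{\lg x} - u S_{\lg x}\right]$, where $X_{\lg x}$ and $S_{\lg x}$ are the mean and standard deviation of the log-transformed values.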
Common u values can be found from Table 4 according to requirements.
(3) Percentile method: It is often used for skewed distribution data and data with no exact value at one or both ends of the data.
Two-sided bounds: P2.5 and P97.5; one-sided upper bound: P95, or one-sided lower bound: P5.
Table 4 Table of commonly used u values
Reference value range (%) | Unilateral | Bilateral |
80 | 0.842 | 1.282 |
90 | 1.282 | 1.645 |
95 | 1.645 | 1.960 |
99 | 2.326 | 2.576 |
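The u values in Table 4 and the three methods for two-sided reference ranges can be sketched with Python's standard library as follows. The function names and the sample values are hypothetical. The percentile estimate depends on the interpolation rule, and the "inclusive" rule used here is only one reasonable choice, not one taken from the text.

```python
import math
from statistics import NormalDist, mean, stdev, quantiles

def u_value(coverage, two_sided=True):
    """Standard normal quantile for a reference range covering `coverage`."""
    tail = (1 - coverage) / 2 if two_sided else (1 - coverage)
    return NormalDist().inv_cdf(1 - tail)

def normal_range(values, coverage=0.95):
    """Two-sided range X ± u*S for normal or nearly normal data."""
    u = u_value(coverage)
    m, s = mean(values), stdev(values)
    return m - u * s, m + u * s

def lognormal_range(values, coverage=0.95):
    """Two-sided range computed on log10 values, then transformed back."""
    logs = [math.log10(v) for v in values]
    low, high = normal_range(logs, coverage)
    return 10 ** low, 10 ** high

def percentile_range(values):
    """Two-sided 95% range from the 2.5th and 97.5th percentiles."""
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%
    cuts = quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]

# Reproduce Table 4: one-sided and two-sided u values
for pct in (0.80, 0.90, 0.95, 0.99):
    print(pct, round(u_value(pct, two_sided=False), 3), round(u_value(pct), 3))

# Made-up measurements, for illustration only
values = [4.2, 4.8, 5.1, 5.5, 4.9, 5.0, 4.6, 5.3, 4.7, 5.2]
print(normal_range(values))
print(lognormal_range(values))
print(percentile_range(values))
```

With only ten values the percentile method is very coarse; in practice it needs a much larger sample than the two distribution-based methods.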
The theoretical basis of statistics:
For example, the t distribution, F distribution, and χ² distribution are all derived from the normal distribution, and the u test is also based on it. In addition, the normal distribution is the limiting form of the t distribution, binomial distribution, and Poisson distribution, so under certain conditions these can be handled according to the principles of the normal distribution.
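The statement that the binomial distribution approaches the normal distribution can be illustrated with a small numerical check. The sketch below compares an exact binomial probability with its normal approximation. The continuity correction (adding 0.5) and the chosen values of n, p, and k are assumptions made for the illustration and do not come from the text.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with mean n*p and variance n*p*(1-p)."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)  # continuity correction (an added assumption)

n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The approximation works best when n is large and p is not too close to 0 or 1; for small n or extreme p the two values can differ noticeably.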
The most important distribution in probability theory
The normal distribution has an extremely broad practical background: the probability distributions of many random variables in production and scientific experiments can be described approximately by it. Examples include, under fixed production conditions, a product's strength, compressive strength, caliber, length, and similar indicators; the body length, weight, and similar indicators of organisms of the same species; the weight of seeds of the same variety; the error in repeated measurements of the same object; the deviation of a projectile's impact point along a given direction; the annual precipitation of a region; and the velocity components of ideal gas molecules. In general, if a quantity is the result of many small, independent random factors, it can be regarded as normally distributed (see the central limit theorem). In theory, the normal distribution has many good properties, and many probability distributions can be approximated by it; some commonly used distributions are also derived directly from it, such as the lognormal distribution, t distribution, and F distribution.
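The idea that a sum of many small independent random factors is approximately normal can be illustrated with a small simulation. In this sketch each "factor" is a uniform random number on [0, 1), twelve of them are added, and the sum is standardized. The number of terms, the sample size, and the seed are arbitrary choices made for the illustration.

```python
import random

def standardized_sum(rng, n_terms=12):
    """Sum n_terms independent uniform(0, 1) draws and standardize the result.

    A uniform(0, 1) variable has mean 1/2 and variance 1/12, so the sum has
    mean n_terms/2 and variance n_terms/12.
    """
    total = sum(rng.random() for _ in range(n_terms))
    mean = n_terms * 0.5
    sd = (n_terms / 12) ** 0.5
    return (total - mean) / sd

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [standardized_sum(rng) for _ in range(20000)]

# Compare empirical shares with the normal values 68.27% and 95.00%
within_1 = sum(abs(z) <= 1 for z in draws) / len(draws)
within_196 = sum(abs(z) <= 1.96 for z in draws) / len(draws)
print(f"within ±1: {within_1:.2%}, within ±1.96: {within_196:.2%}")
```

Although no single uniform draw looks anything like a bell curve, the shares of the standardized sums within ±1 and ±1.96 should land close to the normal figures.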
Main connotation
Taking the practical context of nature, society, and human thought as our setting, starting from the properties of the normal distribution, and using the normal curve and its area diagram as the defining image (these diagrams appear whenever the normal distribution and its theory are discussed), we can abstract and generalize, grasp the main philosophical content, and summarize the main connotations of normal distribution theory (normal philosophy) as follows:
Holism
The normal distribution shows that we need to look at things from a holistic perspective. "The concept of the whole, or of the system as a whole, is the essence of the system concept." The normal curve and its area diagram consist of three regions: the base region, the negative region, and the positive region, each with a different share. Looking at things as a whole lets us see them as they really are and identify their fundamental characteristics. We must not see the trees and miss the forest, nor take a part for the whole. In addition, the whole is greater than the sum of its parts: after analyzing each part and each level, we must still view things as a whole, because the whole has properties that no part has. To look at the world holistically, we need to stand on the base region while also attending to the negative and positive regions. We need to see the main aspects and also the minor ones, the positive side of things and also the negative side, the advanced and also the backward. Looking at things one-sidedly inevitably yields a skewed or distorted picture rather than the thing itself.
Focus theory
The normal curve and its area diagram clearly show where the emphasis lies: the base region accounts for 68.27%, forms the main body, and deserves our main attention. The 95% and 99% figures, in turn, show how comprehensive the normal range is. To understand and transform the world we must keep hold of the key points, because the key point is the principal contradiction of a thing and plays the leading, decisive role in its development. Only by grasping the key points can everything else fall into place. Things and phenomena are numerous and complex, and if we do not grasp the principal contradiction among them we will sink into endless trivia. Because our time and energy are limited, we should concentrate on what matters and pursue efficiency. In the normal distribution, the base region is the main body and the focus. If we bring in the 20/80 rule, we can boldly take the positive region as the focus.
Development theory
Connection and development are the basic laws governing how things change. Everything has its own history of emergence, development, and extinction. If we regard the normal distribution as the course of development of any system or thing, we see clearly that this course runs from the negative region through the base region to the positive region. Nature, society, and human thought all plainly follow this course. Accurately identifying the historical process and stage that a thing or event has reached greatly helps us grasp its characteristics and nature, and it is an important basis for analyzing problems, choosing countermeasures, and solving problems. Different stages of development have different natures and characteristics, and the methods used to analyze and solve problems must be adapted to them. This is what is meant by the concrete analysis of concrete problems, and it is the essence of emancipating the mind, seeking truth from facts, and advancing with the times. The characteristics of normal development also teach us that most things are gradual and cumulative, and that gradual development is the normal state. For example, heredity is the normal state and mutation is the abnormal one.
In short, normal distribution theory is a scientific worldview and a scientific methodology. It is one of the most important and fundamental tools for understanding and transforming the world, and it has important guiding significance for both theory and practice. Understanding the world through normal philosophy helps us grasp its nature and laws more fully, and transforming the world through normal philosophy helps us respect and use objective laws so as to transform it more effectively.
Francis Galton (February 16, 1822 to January 17, 1911) was a British explorer, eugenicist, and psychologist, regarded as the father of differential psychology and the founder of the biometric approach in psychometrics.
Galton's contributions to psychology can be summarized under three headings: differential psychology, the quantification of psychological measurement, and experimental psychology:
The quantification of psychological research began with Galton. He invented many sensory and motor tests and expressed the measured differences in psychological traits as numbers. He believed that all human traits, whether physical or mental, can ultimately be described quantitatively, and that this is a necessary condition for a science of human beings. He was therefore the first to apply statistical methods to psychological research data. He collected a large body of data to show that human psychological characteristics are distributed across the population in the same normal, bell-shaped pattern as height and weight. In discussing the influence of heredity on individual differences, he laid the groundwork for the concept of the correlation coefficient. For example, when he studied the relationship between "mid-parent" height and the height of adult children, he found a positive correlation: taller parents tend to have taller children, and shorter parents tend to have shorter children. He also found that children's heights usually differ somewhat from their parents' and show a tendency to "regress toward the mean", that is, to move away from the parents' height and back toward the population average.
Intelligence, ability
Richard J. Herrnstein (May 20, 1930 to September 13, 1994), an American comparative psychologist, and Charles Murray are known for co-authoring the book "The Bell Curve", in which they argue that human intelligence is normally distributed. The authors claim that intelligence is mainly inherited and varies by race, with Jews and East Asians scoring highest on IQ tests, followed by whites, and with blacks and Hispanics scoring lowest. They reviewed decades of research in psychometrics and policy and argued that American society had overlooked the growing influence of IQ. They tried to show that U.S. social policies favoring low-income groups, which they described as mainly African American and Latino, such as vocational training and university education, were a waste of resources. Using test results from military recruits, they argued that young black people scored lower than whites and Asians, that these scores were already fixed, and that training had little effect. They therefore proposed that the government stop funding such programs and spend the money instead on early childhood education for children of all races, on the grounds that children's intelligence is not yet fixed and has great potential for development. Because of its claims about race and intelligence, the book was attacked from all sides when it was published.