How was the Human Genome Sequenced?

The human genome is the genome of Homo sapiens. It consists of 23 pairs of chromosomes: 22 pairs of autosomes and 1 pair of sex chromosomes. The human genome contains approximately 3.16 billion DNA base pairs. A base pair is two nitrogen-containing bases joined by hydrogen bonds. The four bases, thymine (T), adenine (A), cytosine (C), and guanine (G), are arranged into a base sequence. A and T are connected by two hydrogen bonds, and G and C by three; the only pairings found in DNA are A with T and G with C. Some of these base pairs make up about 20,000 to 25,000 genes.
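
The pairing rules above can be illustrated with a short Python sketch. This is only an illustration: the function names and the example sequence are made up for this purpose, and the code simply applies the A-T (two hydrogen bonds) and G-C (three hydrogen bonds) rules stated in the text.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the base-by-base partner of each base, in the same order."""
    return "".join(PAIR[base] for base in strand.upper())


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


if __name__ == "__main__":
    seq = "ATGCGC"  # hypothetical example sequence
    print(complement(seq))      # TACGCG
    print(hydrogen_bonds(seq))  # 2 + 2 + 3 + 3 + 3 + 3 = 16
```

Because G-C pairs have one more hydrogen bond than A-T pairs, a stretch of DNA rich in G and C is held together by more bonds than an A-T-rich stretch of the same length.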

When one or more genes behave abnormally, they may cause some symptoms of a corresponding phenotype. Causes of genetic abnormalities include
Comparative genomics research on mammalian genomes shows that, compared with species that diverged about 200 million years ago, about 5% of the human genome has been conserved, including many genes and regulatory sequences. Humans also share most of their genes with most known vertebrates.
The chimpanzee genome is 98.77% similar to the human genome. On average, a typical human protein-coding gene differs from its chimpanzee counterpart by only two amino acid substitutions.
Human Genome Project
The Human Genome Project (HGP) was first proposed by American scientists in 1985 and officially launched in 1990. United States, United Kingdom, French Republic,
Number of patents on 24 chromosomes

Chromosome    Number of genes    Number of patents
1             2769               504
2             1776               330
3             1445               307
4             1023               215
5             1261               254
6             1401               225
7             1410               232
8             952                208
9             1086               233
10            1042               170
11            1626               312
12            1347               252
13            477                97
14            821                155
15            915                141
16            1139               192
17            1471               313
18            408                74
19            1715               270
20            762                178
21            357                66
22            106                657
X             1090               200
Y             144                14
China's sequencing work, on chromosome 3 (1,445 genes, 307 patents), started in September 1999 and was completed in less than a year. The genes involved include ones related to lung cancer, ovarian cancer, and nasopharyngeal cancer.
From 1981 to 1995, 1,175 patents on DNA sequences were granted worldwide. Early applications were mainly for genes with known functions. Later, Craig Venter, then at the National Institutes of Health, applied for patents on 2,716 genes whose functions were not yet understood.
Professor Ruan Yijun of Huazhong Agricultural University, an expert in the "Thousand Talents Plan", recently revealed how genes interact with and influence one another, including the mechanism of long-range interaction. This will help scientists understand how human genes work and explore the genetic mechanisms of related diseases. The results were published in the journal Cell.
The discovery shows that although related genes in the human genome may lie far apart, they can be organized in an orderly manner through long-range chromosome interactions and a highly ordered chromosomal framework. This suggests that human cells have a topological mechanism, similar to cell manipulation systems, that assists transcriptional regulation; this topological regulation mechanism also helps in analyzing genetic elements in human genes.
Dr. Ruan's research addresses the fundamental question of how genes communicate with one another and with the genome elements that turn them on or off. The team used a DNA mapping technology called ChIA-PET to reveal in three dimensions how genes in the human genome interact with each other and are activated at the right time. The result is expected to move quickly from the research literature into textbooks, helping students better understand the principles of the human genome. ChIA-PET, as a 'telescope' for exploring the human genome, is also expected to become an innovative and important molecular analysis tool.
Human genome sequencing
Between 1990 and 1998, about 330 Mb of human genome sequence was completed or was being sequenced, accounting for about 11% of the human genome, and about 200 genes related to human diseases were identified. In addition, the entire genomes of 17 organisms, including bacteria, archaea, mycoplasma, and yeast, were sequenced.
It is worth mentioning that cooperation between companies and research institutions will greatly speed the completion of sequencing. The Institute for Genomic Research (TIGR) in the United States and PE (Perkin-Elmer) co-founded a new company that is investing $200 million over three years and expects to complete the full sequence in 2002. That would be three years ahead of the target set by the US government-sponsored HGP.
Formation of Life Science Industry
Genomic research is closely related to industrial sectors such as pharmaceuticals, biotechnology, agriculture, food, chemistry, cosmetics, environment, energy, and computers. More importantly, it can be turned into enormous productive capacity. A number of large international pharmaceutical companies and large chemical companies have therefore invested heavily in genome research, forming a new industry sector: the life sciences industry.
Some of the world's largest pharmaceutical groups have invested in establishing genomic research institutes. Ciba-Geigy and Sandoz merged to form Novartis, which spent $250 million to establish an institute for genomic research. Smith Kline spent $125 million to speed up sequencing and based 25% of its drug development projects on genomics. Glaxo-Wellcome invested $47 million in genomic research, doubling its number of researchers.
Large chemical companies are also moving into the life sciences industry.
The "Human Genome Project" was proposed by the American scientist and Nobel Prize winner Renato Dulbecco. Its goal is to produce the genetic map, physical map, and DNA sequence of the 23 human chromosome pairs; in other words, to determine the entire sequence of 3 billion bases (or nucleotides) in the 23 pairs of chromosomes in human cells, to locate all of the genes on the chromosomes (estimated at the time at about 100,000), and so to decipher all human genetic information. In 1990, the US Congress approved the "Human Genome Project", and the federal government allocated $3 billion to launch it. Subsequently, the United Kingdom, Japan, France, Germany, and China joined in succession. The significance of the project has been compared to the conquest of space, and it is called the "moon landing project" of the life sciences.
Human cells contain 46 chromosomes in 23 pairs.
More than seven years have passed since the HGP was officially launched in October 1990. The achievements of those years mean that people are no longer skeptical about the feasibility of the HGP, as they were in the late 1980s. As Francis Collins, head of the HGP in the United States, has said, the most important lesson learned from the Human Genome Project is that it is perfectly possible. Since the HGP began, the original schedule has been met ahead of time even though funding has not reached the level originally planned. The HGP comprises four main tasks: constructing a genetic map, constructing a physical map, determining the DNA sequence, and identifying genes. Progress over the past few years in each of these four areas is as follows:

Human genome genetic map

A genetic map gives the relative distances between linked genetic markers, determined from the frequency of recombination between them. By the end of 1994, through the joint efforts of French and American scientists, a genetic map based on RFLP markers and on microsatellite DNA markers that can be analyzed in batches by PCR had been completed; it contained 5,826 loci, covered 4,000 cM, and had a resolution of 0.7 cM. In March 1996, French scientists reported a genetic linkage map constructed entirely from microsatellite markers, containing 2,335 loci at a resolution of 1.6 cM. This work fulfilled, ahead of time, the goal of a map with a resolution of 2 to 5 cM that was originally scheduled for 1998. The map not only provides an important basis for constructing physical maps, but can also be used in linkage analysis of polygenic diseases with complex traits (such as hypertension, diabetes, and coronary heart disease) to map the susceptibility genes involved in these diseases.
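
As a rough illustration of how recombination frequency translates into map distance, the Python sketch below converts a recombination fraction into centimorgans. It is a simplified example, not the procedure used by the mapping groups: the function names and offspring counts are hypothetical, the first conversion uses the common approximation that 1% recombination corresponds to 1 cM, and the second uses Haldane's mapping function, which assumes crossovers occur independently.

```python
import math


def recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of offspring in which two markers were separated by recombination."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def simple_distance_cm(r: float) -> float:
    """Approximation for closely linked markers: 1% recombination = 1 cM."""
    return 100.0 * r


def haldane_distance_cm(r: float) -> float:
    """Haldane's mapping function, correcting for unseen double crossovers."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


if __name__ == "__main__":
    # Hypothetical counts: 7 recombinant offspring out of 1,000 scored.
    r = recombination_fraction(7, 1000)
    print(f"simple:  {simple_distance_cm(r):.3f} cM")
    print(f"Haldane: {haldane_distance_cm(r):.3f} cM")
```

For markers this close together the two estimates nearly agree; they diverge as the recombination fraction grows, because multiple crossovers between distant markers go undetected.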

Physical map of human genome

A physical map gives the physical distance between genetic markers. It is produced mainly by ordering markers and measuring the distances between them using techniques for manipulating large DNA fragments, and it lays the foundation for gene isolation, gene identification, and genomic DNA sequencing. The construction of physical maps has also made great progress in recent years: a physical map with a resolution of 199 kb, based on 15,086 sequence-tagged sites (STSs), has been established, and a physical map of 225 YAC contigs covering 75% of the entire human genome has been assembled. In addition, physical maps are also being produced using radiation hybrid mapping.
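
The 199 kb figure can be checked with simple arithmetic, sketched below in Python. This assumes that "resolution" here means the average spacing between markers and that the genome is taken as a round 3 billion base pairs, the figure the article gives elsewhere; the function name is illustrative only.

```python
def average_marker_spacing_kb(genome_size_bp: int, marker_count: int) -> float:
    """Average distance between evenly spread markers, in kilobases."""
    if marker_count <= 0:
        raise ValueError("marker_count must be positive")
    return genome_size_bp / marker_count / 1000.0


if __name__ == "__main__":
    GENOME_SIZE_BP = 3_000_000_000  # assumption: round figure of 3 billion bp
    STS_MARKERS = 15_086            # sequence-tagged sites cited in the text
    spacing = average_marker_spacing_kb(GENOME_SIZE_BP, STS_MARKERS)
    print(f"{spacing:.1f} kb")  # about 198.9 kb, which rounds to 199 kb
```

Under these assumptions the marker count and the stated resolution are consistent with each other.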

Human genome DNA sequencing

Determining the complete DNA sequence of the human genome is the core of the HGP, and this work has developed extraordinarily rapidly in the past few years. Now that genetic and physical mapping are complete or nearly complete, sequencing has become the top priority for the next 10 years. When the genome project was launched, the longest completed DNA sequence was the 250 kb cytomegalovirus sequence, which had taken years. Today, a large sequencing center can sequence a bacterial genome (more than 1 Mb) in a month. So far, three research groups including L. Hood, B. Booe, and Sanger Center have completed. Sequencing technology plays a decisive role in determining the complete nucleotide sequence of the human genome, and current methods need to be further improved or even revolutionized. All human genome sequencing work is expected to be completed by 2005.

Identification of human genome genes

One important part of the HGP is to identify all of the transcribed and expressed functional units in the human genome, that is, the genes, and to study their structure. Two strategies are currently used: (1) identifying the transcribed and expressed sequences (genes) within genomic DNA sequence; (2) randomly selecting clones from a cDNA library and partially sequencing them. These randomly obtained partial cDNA sequences are called expressed sequence tags (ESTs). A map drawn according to the positions of transcribed sequences and the distances between them is a transcript map. In the past few years, the genes responsible for many important diseases (such as fragile X syndrome, Huntington's disease, Wilson's disease, and polycystic kidney disease) have been cloned by positional cloning. As accuracy improves, positional cloning will gradually be replaced by the positional candidate cloning approach.
