What is Gene Expression Profiling?
Serial analysis of gene expression (SAGE) rapidly and thoroughly analyzes thousands of expressed sequence tags (ESTs) to find SAGE tag sequences that differ in expression abundance.
Serial analysis of gene expression
- In 1995, Velculescu et al. proposed the Serial Analysis of Gene Expression (SAGE) technology, which can study thousands of transcripts simultaneously.
- SAGE is a technology for rapid analysis of gene expression information. It quickly and thoroughly analyzes thousands of expressed sequence tags (ESTs) to find SAGE tag sequences with different expression abundances, thereby obtaining nearly complete genomic expression information. SAGE and gene chips are the two most common methods of gene expression profiling. As sequencing technology has advanced, a further approach has become increasingly important in gene expression profiling: constructing a cDNA library and sequencing it using the high throughput of second-generation sequencing.
- In this method, very short cDNA tags (10-14 bp) are generated by restriction enzyme digestion, amplified by PCR, and ligated into concatemers, which are then sequenced. SAGE greatly simplifies and speeds up the collection and sequencing of 3' expressed sequence tags. Like differential display (DD), SAGE is an "open" system that can discover previously unknown sequences. SAGE requires more steps to generate and process cDNA before specimens can be compared. Because SAGE relies on DNA sequencing, it measures gene expression more quantitatively than DD. Because of the large number of sequencing reactions required, cost is the main barrier to its widespread use at most research institutions.
- First, a short nucleotide sequence tag of 9 to 10 bases contains enough information to uniquely identify a transcript. For example, a 9-base sequence can distinguish 4^9 = 262,144 different transcripts, whereas the human genome is estimated to encode only about 80,000 transcripts, so in theory each 9-base tag can represent the characteristic sequence of one transcript.
- Second, if the 9-base tags are concatenated into a single clone and sequenced, and the resulting string of short sequences is entered into a computer and processed as continuous data, thousands of mRNA transcripts can be analyzed.
- (1) Biotinylated oligo(dT) is used as the primer for reverse transcription to synthesize cDNA.
- SAGE is a fast and effective technology for gene expression research. Any laboratory with PCR and manual sequencing equipment can use it. Combined with automated sequencing, it can analyze 1,000 transcripts in 3 hours. In addition, the choice of different anchoring enzymes (class II endonucleases recognizing 5-20 bases) makes the technology more flexible.
- First, SAGE can be applied to human genome research. In 1995, Velculescu et al. used BsmFI as the tagging enzyme and NlaIII as the anchoring enzyme, analyzed the 9-base tag data by computer, and searched GenBank. Of the 1,000 tags analyzed, more than 95% represented unique transcripts. Transcripts were divided into four classes according to tag frequency: more than three times, 380 tags (45.2%); three times, 45 tags (5.4%); twice, 64 tags (7.6%); and once, 351 tags (41.8%), for a total of 840 tags. SAGE can therefore extract an organism's gene expression information quickly and comprehensively and quantify the expression of known genes. SAGE can also be used to find new genes. Although a SAGE tag contains only 9 bases, adding the 4-base anchoring enzyme site sequence gives 13 confirmed bases in total. If a database search finds no known sequence homologous to a tag, the 13-base fragment can be used as a probe to screen a cDNA library and obtain the corresponding cDNA clone.
- Second, SAGE can be used to quantitatively compare gene expression in tissues or cells in different states. Stephen L et al. (1997) used SAGE to compare gene expression in mouse embryo sac fibroblasts. These fibroblasts produce a temperature-sensitive P53 tumor suppressor protein, so SAGE analysis can compare gene expression at two different temperatures. Of about 15,000 genes analyzed, the expression of 14 genes was found to depend on the P53 protein, and the expression of 3 genes was significantly associated with inactivation of the P53 protein. Zhang et al. (1997) compared 300,000 transcripts from normal cells and tumor cells and found that, among the 4,500 transcripts analyzed, at least 500 differed significantly in expression between the two.
- Third, because SAGE can collect the largest possible amount of gene expression information for a genome at once, transcript analysis data can be used to construct a chromosomal expression map. Victor et al. analyzed gene expression across the yeast genome and found 4,655 genes from 60,633 transcripts (expression levels ranging from 0.3 to 2.0 per cell), of which 1981 genes have a confirmed function and 2684 have not been reported. The chromosome expression map, drawn by combining gene expression information with the genomic map, links gene expression to physical structure, which makes it easier to study gene expression patterns (Velculescu, 1997). SAGE is an effective tool for the qualitative and quantitative study of gene expression and is well suited to comparing gene expression across different developmental or disease states.
- In addition, SAGE can obtain nearly complete genome expression information and can directly read out the gene expression information of any type of cell or tissue. The application of SAGE will greatly accelerate genomic research, but it must be combined with and complemented by other technologies to study the genes of a genome as comprehensively as possible.
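
The two ideas that SAGE analysis rests on, that a 9-base tag has enough capacity to identify a transcript and that tag counts reflect expression abundance, can be illustrated with a small sketch. This is not the original SAGE software. The function names (`tag_capacity`, `relative_abundance`, `compare_libraries`) and the tag sequences are invented for illustration, and the sketch assumes the tags have already been extracted from the sequenced clones.

```python
from collections import Counter


def tag_capacity(tag_length):
    """Number of distinct tags of this length over the alphabet A, C, G, T."""
    return 4 ** tag_length


def relative_abundance(tags):
    """Return each tag's share of all tags sequenced in one library."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


def compare_libraries(tags_a, tags_b):
    """For every tag seen in either library, report (share in A, share in B)."""
    freq_a = relative_abundance(tags_a)
    freq_b = relative_abundance(tags_b)
    all_tags = sorted(set(freq_a) | set(freq_b))
    return {tag: (freq_a.get(tag, 0.0), freq_b.get(tag, 0.0)) for tag in all_tags}


# Invented 9-base tags for two hypothetical samples (not real data).
normal = ["ACGTACGTA"] * 6 + ["TTGCATGCA"] * 3 + ["GGATCCATC"] * 1
tumor = ["ACGTACGTA"] * 2 + ["TTGCATGCA"] * 3 + ["CCATGGTTA"] * 5

print(tag_capacity(9))  # 4^9 = 262144
for tag, (share_a, share_b) in compare_libraries(normal, tumor).items():
    print(tag, f"{share_a:.2f}", f"{share_b:.2f}")
```

Dividing by the library total matters because two libraries are rarely sequenced to the same depth. A real comparison would also need a statistical test before calling a difference significant, which this sketch leaves out.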