Plant Biotech 2 (2017)
There are two types of genes: protein-coding genes and genes that code for functional RNA molecules.
The latter are involved in gene expression.
A gene is a fragment of DNA that starts at a promoter and ends at the 3’ untranslated region (UTR). It is read in the 5’ to 3’ direction.
The promoter is the region where transcription is initiated. The pre-mRNA contains both the 5’ UTR and the 3’ UTR. During splicing the introns are cut out and the exons are joined. The mature mRNA is then translated into amino acids. The 3’ UTR carries the polyadenylation signal; the poly-A tail itself is added after transcription and is not encoded in the DNA. The coding sequence (CDS, or ORF) starts at the ATG that follows the 5’ UTR and ends at a stop codon.
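The layout described above (5’ UTR, then a coding sequence that starts at ATG, then 3’ UTR) can be illustrated with a minimal sketch. The function name `find_cds` and the toy transcript are hypothetical and are not part of the course material; the sketch simply takes the first ATG and reads codons until the first in-frame stop codon.

```python
# Standard stop codons, written in the DNA alphabet of the coding strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_cds(mrna: str) -> str:
    """Return the sequence from the first ATG to the first in-frame stop codon (inclusive).

    Returns an empty string when there is no ATG or no in-frame stop codon.
    """
    seq = mrna.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    # Walk codon by codon from the start codon.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return ""

# Toy transcript (made up for illustration): 5' UTR + CDS + 3' UTR.
toy_mrna = "GGCACC" + "ATGGCTTGGTAA" + "CCGAATAAAGC"
print(find_cds(toy_mrna))  # ATGGCTTGGTAA
```

Real gene prediction is considerably more involved (for example, the first ATG is not always the one used), so this is only meant to make the terms concrete.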
Types of RNA molecules
• mRNA: codes for proteins.
• tRNA: involved in protein synthesis; tRNAs carry amino acids to the peptide synthesis machinery.
• rRNA: present in the ribosome. The small subunit contains the 18S rRNA; the large subunit contains the 28S, 5.8S and 5S rRNAs.
• snRNA (small nuclear): involved in the modification of other RNAs; some are part of the spliceosome.
• snoRNA (small nucleolar): involved in telomere synthesis, controls the maturation of rRNA and chemically modifies some nucleotides of other RNAs.
• MicroRNA (miRNA): regulates gene expression; involved in RNA interference (RNAi) pathways.
Plant Biotech Lesson 2 Héctor Escribano Gene transcription
Size distribution of total RNA
In plants, mRNA makes up only about 2% of total RNA. RNA extracted from a plant is very unstable. If we ran an electrophoresis of that total RNA, we would essentially see only the rRNA bands, separated by size, because rRNA is present in very large quantities.
In addition, plants have chloroplasts, which have their own genome and encode their own rRNAs.
So in green tissue, an electrophoresis shows additional bands corresponding to the chloroplast rRNAs. Nuclear-encoded rRNAs differ in size from chloroplast rRNAs. In green tissue, chloroplast rRNAs are also very abundant.
DNA
The human genome is about 3000 Mb long and has about 30,000 genes, so only about 3% of it is coding sequence. The remaining regions include sequences involved in the regulation of gene expression and large sequences of unknown function. These used to be called junk DNA, but it is now known that part of this DNA is functionally important.
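As a quick check of the arithmetic, the figures quoted in the notes (3000 Mb, about 30,000 genes, about 3% coding) imply an average of roughly 3000 coding bases per gene. The sketch below only restates those approximate numbers; the function names are made up for illustration.

```python
# Figures as quoted in the notes; all three are approximations.
GENOME_BP = 3000 * 10**6   # 3000 Mb
GENE_COUNT = 30_000
CODING_PERCENT = 3

def coding_bp(genome_bp: int, coding_percent: int) -> int:
    """Number of bases that fall in coding regions."""
    return genome_bp * coding_percent // 100

def mean_coding_bp_per_gene(genome_bp: int, coding_percent: int, gene_count: int) -> int:
    """Average coding length per gene implied by the three figures."""
    return coding_bp(genome_bp, coding_percent) // gene_count

print(coding_bp(GENOME_BP, CODING_PERCENT))                            # 90000000
print(mean_coding_bp_per_gene(GENOME_BP, CODING_PERCENT, GENE_COUNT))  # 3000
```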
Repetitive DNA sequences
• Microsatellites: tandem repetitions of a motif 1 to 5 bases long.
• Minisatellites: tandem repetitions of a motif of about 40 bases.
• Tandemly repeated sequences: have specific motifs and are characterized by their position in the chromosome.
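The definition of a microsatellite given above (a motif of 1 to 5 bases repeated in tandem) translates directly into a regular expression. This is a minimal sketch: the function name `find_microsatellites` and the threshold of four repeats are assumptions chosen for the example, not values from the course.

```python
import re

def find_microsatellites(seq: str, min_repeats: int = 4):
    """Find motifs of 1-5 bases repeated in tandem at least `min_repeats` times.

    Returns a list of (start index, motif, number of repeats).
    """
    # The lazy quantifier makes the regex prefer the shortest motif,
    # e.g. "AT" rather than "ATAT" for a run of ATATATAT.
    pattern = re.compile(r"([ACGT]{1,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence (made up): five AT repeats flanked by unrelated bases.
print(find_microsatellites("GGC" + "AT" * 5 + "CC"))  # [(3, 'AT', 5)]
```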
Only some of them have a defined use:
• Telomeric sequences: protect the DNA at the ends of the chromosomes. This is needed because the ends of the DNA get shorter in every round of replication: the replication fork cannot copy all the way to the end, so a few bases are left uncopied. The telomeric repeats act as a buffer that protects the actual coding DNA. In each replication, telomerase adds some repeats to compensate for the shortening. The repeated motif is fixed; in most plants it is TTTAGGG. However, telomerase is only active in some tissues, those with mitotic activity.
• Precursors of ribosomal genes: responsible for the transcription of the different types of rRNA.
• Mobile sequences: transposons and retroelements. These sequences can change their position within a chromosome or even move between chromosomes.
Plants have a lot of genes
One explanation for why plants have more genes than humans is that they synthesize many more molecules than we do: they have a much more developed secondary metabolism. Since they cannot escape from predators and other dangers, they need tools to fight those threats.
The second explanation is genome duplication. Over the course of evolution their genomes have been duplicated, so they carry more genes. After a duplication some of the duplicated parts are eliminated, but others are kept.
Nowadays, genes are identified and counted with software, because doing it by hand would be very hard work and would take far too much time and resources.
ESTs (Expressed Sequence Tags)
ESTs are obtained by sequencing randomly picked clones from cDNA libraries. They are fragments that arise when cDNA is made from mRNA: the polymerase (reverse transcriptase) might not finish synthesizing the complementary strand, so the cDNA is shorter than the whole mRNA. That short strand is an EST.
It is an "expressed sequence" because it comes from mRNA, and a "tag" because it leads us to the original mRNA.
Some ESTs do not correspond to any known gene, so they can be used to discover new genes.
Depending on which end (3’ or 5’) the sequencing primer anneals to, we get 3’-ESTs or 5’-ESTs. If the primer anneals at the 3’ side of the cDNA (a reverse primer), the read starts there; the new strand is still synthesized 5’ to 3’, as always. Because the resulting read covers the 3’ side of the transcript, it is a 3’-EST.
On average, Sanger sequencing reads fragments of 700-800 bp, which usually does not cover the whole mRNA.
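The difference between a 5’-EST and a 3’-EST can be sketched with plain string operations. In this hypothetical model (the names `simulate_ests` and `read_len` are made up for illustration), the cDNA is given as the sense strand, the 5’-EST is the first stretch of it, and the 3’-EST is what a reverse primer would read, that is, the start of the reverse complement.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def simulate_ests(cdna: str, read_len: int = 750):
    """Return (5'-EST, 3'-EST) as single reads of at most `read_len` bases.

    750 is just a value inside the 700-800 bp range mentioned for Sanger reads.
    """
    five_prime_est = cdna[:read_len]
    # A reverse primer reads the opposite strand, starting at the 3' end.
    three_prime_est = reverse_complement(cdna)[:read_len]
    return five_prime_est, three_prime_est

# Toy cDNA (made up), with a very short read length to keep the output readable.
five, three = simulate_ests("ATGGCCAAGT", read_len=4)
print(five, three)  # ATGG ACTT
```

Reverse-complementing the 3’-EST gives back the last bases of the sense strand, which is why such a read tags the 3’ side of the transcript.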
These tags are useful to annotate and identify genes (even genes that have not been identified in other species), to discover new genes inexpensively, to identify the organism or tissue a sequence comes from, to evaluate gene expression in different conditions, to study gene structure, to find variation between paralogous and orthologous genes, and to map their positions in the genome.
They are also useful to reconstruct genes. To study gene structure, you can take several ESTs and try to rebuild a CDS by aligning them. This is only possible when the ESTs are adjacent to one another, overlap at their ends, and together cover the whole CDS.
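Rebuilding a coding sequence from overlapping ESTs can be sketched as a greedy merge: repeatedly join the two fragments with the longest suffix/prefix overlap. This is a toy illustration under strong assumptions (error-free reads, all on the same strand, exact overlaps); the function names and the minimum overlap of 3 bases are made up for the example, and real assemblers work quite differently.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def assemble(ests, min_len: int = 3):
    """Greedily merge the pair with the longest overlap until no overlaps remain."""
    contigs = list(ests)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # the remaining fragments do not overlap: the gene is not fully covered
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]
    return contigs

# Toy CDS (made up) cut into three overlapping fragments.
cds = "ATGGCTTGGCCAAGTTAA"
print(assemble([cds[0:8], cds[5:13], cds[10:18]]))  # ['ATGGCTTGGCCAAGTTAA']
```

When the fragments do not overlap, the function returns several contigs instead of one, which mirrors the condition stated above: the CDS can only be rebuilt if it is fully covered.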
Imagine a certain region of the rice genome with three genes next to each other that code for the same enzyme. Their sequences are different but very similar. These genes are paralogous.
This enzyme is present not only in rice but also in maize. Maize has only one gene that codes for that enzyme. The rice and maize genes have different sequences but the same function; they are orthologous.
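"Different but very similar" can be made concrete with a percent identity between two aligned sequences. The sketch below assumes the sequences are already aligned and of equal length; the function name and the example sequences are hypothetical, not real rice or maize genes.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length, pre-aligned sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100 * matches / len(a)

# Two made-up fragments that differ at one position out of six.
print(round(percent_identity("ATGGCT", "ATGGCA"), 1))  # 83.3
```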
The EST databases have not been updated for three years because a new technology, RNA sequencing (RNA-seq), has appeared. Even so, ESTs are still used alongside next-generation sequencing methods.
Transcription
In basal transcription, a core promoter is needed for the gene to be active. Usually, the core promoter consists of a TATA box and/or an initiator.
The TATA-box-binding protein (TBP) binds to the TATA box and forms a complex with the TBP-associated factors (TAFs). The complex of TBP and TAFs is called TF2D, usually written TFIID (Transcription Factor 2 D): the 2 refers to the polymerase involved, Polymerase 2, and the D distinguishes it from the other general factors, because there are many.
Polymerase 1 transcribes the rRNAs (28S, 18S and 5.8S). Polymerase 2 transcribes mRNA and some microRNAs. Polymerase 3 transcribes tRNA, 5S rRNA and some microRNAs. Polymerases 4 and 5, which are specific to plants, are involved in siRNA pathways.
Transcription starts at the initiator, at the transcription start site (TSS). Polymerase 2 is recruited to the TATA box through the TF2D complex and is positioned so that transcription begins at the initiator, at the TSS.
Promoter structure
The promoter is usually located upstream of the CDS (towards the 5’ end). Sometimes the promoter is not directly adjacent to the CDS but further away. All the elements in the promoter are cis-acting elements, including the TATA box and the initiator. In general, the TATA box is flanked by GC-rich boxes, which improve the efficiency of binding between the TATA box and TF2D. The TATA box is located roughly 25 to 35 bp upstream of the TSS. Most of the time, the TSS (+1) is an adenine.
The GC boxes are also called BREu and BREd: the TF2B (TFIIB) recognition elements upstream and downstream of the TATA box, respectively.
There are also trans-acting elements: proteins such as transcription factors.
Further away from the initiator there may be other cis-acting elements, depending on the species and the gene.
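The promoter model above can be turned into a small search: look for a TATA box a few tens of bases upstream of the TSS. The consensus pattern TATA(A/T)A(A/T), the search window and the function name are assumptions made for this sketch; as the notes stress, real promoters vary a lot and many have no TATA box at all.

```python
import re

# One commonly quoted TATA-box consensus; treating it as an exact pattern is a simplification.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(promoter: str, tss_index: int, window=(-40, -20)):
    """Return the positions of TATA boxes relative to the TSS.

    `tss_index` is the 0-based index of the +1 base, so the base just
    upstream of it is reported as -1. Only hits whose first base falls
    inside `window` are returned.
    """
    hits = []
    for m in TATA_BOX.finditer(promoter.upper()):
        rel = m.start() - tss_index
        if window[0] <= rel <= window[1]:
            hits.append(rel)
    return hits

# Toy promoter (made up): a TATA box 30 bp upstream of an adenine TSS.
toy_promoter = "G" * 10 + "TATAAAA" + "C" * 23 + "ATGGCC"
print(find_tata_boxes(toy_promoter, tss_index=40))  # [-30]
```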
Some promoters don’t have a TATA box. In that case, another element takes over its function. There are, of course, many variations; this is only a model. For some identified genes the promoter is still unknown.
Core promoters in plants are poor in G, and some elements are exclusive to monocots or to dicots.
There are two types of transcription initiation: focused and dispersed. Focused initiation has a single strong TSS and is usually found in regulated genes. Dispersed initiation has multiple weak TSSs and is found in housekeeping genes, which are always expressed. In both cases, the TSS is upstream of the CDS.
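The focused versus dispersed distinction can be pictured as a question about how start events are distributed over positions. The sketch below is purely illustrative: the input format (a mapping from position to number of observed start events), the function name and the 80% dominance threshold are all assumptions, not a published criterion.

```python
def classify_promoter(tss_counts: dict, dominance: float = 0.8) -> str:
    """Label a promoter 'focused' if one position accounts for most start events.

    `tss_counts` maps a genomic position to the number of transcripts
    observed to start there. The 0.8 threshold is arbitrary.
    """
    total = sum(tss_counts.values())
    if total == 0:
        raise ValueError("no transcription start events")
    top = max(tss_counts.values())
    return "focused" if top / total >= dominance else "dispersed"

# Made-up counts: one strong TSS versus several weak ones.
print(classify_promoter({100: 95, 103: 5}))                    # focused
print(classify_promoter({100: 20, 110: 25, 125: 30, 140: 25}))  # dispersed
```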
Sometimes the promoter is more than the core promoter: there is also an upstream promoter with other cis-acting elements, such as CCAAT boxes, enhancers, insulators, silencers and GC boxes. The function of the upstream promoter is to regulate the core promoter. There may also be cis-acting elements downstream of the TSS. Both the upstream and the downstream elements regulate the core promoter and are responsible for changes relative to basal transcription.
RNA processing
RNA processing is the conversion of the pre-mRNA into mRNA, that is, its maturation. It consists of capping, splicing and polyadenylation.
Capping: the addition of a modified guanosine (7-methylguanosine) to the 5’ end of the RNA as it emerges from RNAPol2. The cap protects the RNA from RNases and is also essential for recognition by the ribosome.
Splicing: the pre-mRNA is cleaved, the introns are removed and the exons are joined.
Polyadenylation: the addition of the poly-A tail to the 3’ end. It is coupled to the termination of transcription and also protects the 3’ end.
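The three maturation steps can be mimicked on a string: add a cap at the 5’ end, remove the introns and join the exons, and add a poly-A tail at the 3’ end. This is a schematic sketch only. The function name, the "m7G-" text label for the cap, the exon coordinates and the tail length are all hypothetical choices made for the example.

```python
def process_pre_mrna(pre_mrna: str, exons, poly_a_len: int = 20) -> str:
    """Return a schematic mature mRNA.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive,
    on `pre_mrna`. Everything outside these intervals is treated as intron.
    `poly_a_len` is arbitrary; real tails are much longer.
    """
    spliced = "".join(pre_mrna[start:end] for start, end in sorted(exons))
    cap = "m7G-"  # text placeholder for the 7-methylguanosine cap
    return cap + spliced + "A" * poly_a_len

# Toy pre-mRNA (made up): exon 1, an intron starting with GU and ending with AG, exon 2.
exon1, intron, exon2 = "AUGGCU", "GUAAGUUUCAG", "UGGUAA"
pre = exon1 + intron + exon2
mature = process_pre_mrna(pre, exons=[(0, 6), (17, 23)], poly_a_len=5)
print(mature)  # m7G-AUGGCUUGGUAAAAAAA
```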
Transcriptional regulation
Different transcription factors interact with RNAPol2 and affect its affinity for different promoters. Combinations of transcription factors allow a very sophisticated regulation of gene expression. This kind of regulation permits spatial and temporal control, because a given transcription factor may be synthesized only in certain tissues and not in others.
There are different types of cis-acting elements, such as silencers, enhancers and insulators.
The enhancers are located upstream of the core promoter. A transcription factor with an activating function (an activator) binds to the enhancer. Through the 3D folding of the DNA, the activator ends up close to the core promoter, and when RNAPol2 binds to the promoter, the activator strengthens the binding between RNAPol2 and the DNA.
Silencers make that binding more difficult: a repressor binds to the silencer region and disturbs the binding of RNAPol2.
There are two types of insulators. The enhancer-blocking insulator lies between an enhancer and a promoter and prevents the enhancer from acting on it, which lessens transcription.
The barrier insulator is located at the boundary between heterochromatin and euchromatin (the form that allows gene expression). It prevents the heterochromatin from spreading into the euchromatin; anything on the heterochromatin side of the barrier is not expressed.
There are also trans-acting elements, whose activity can be controlled by phosphorylation, ligand binding or spatial/temporal regulation.
Untranslated regions
These are the sections of the mRNA before the start codon (5’ UTR) and after the stop codon (3’ UTR). They are important for protecting the mRNA when it moves to the cytoplasm and can be involved in mRNA localization. They may also influence transcription and translation and their efficiency.
We usually draw the mRNA as a linear molecule, but it is not. Regions in the 3’ UTR and the 5’ UTR form secondary structures, which are binding sites for proteins. These proteins may protect the mRNA or have other functions.
Synthesis of the mRNA starts at the 5’ UTR end, so this is the end that has a regulatory function. Some mRNAs also have a secondary structure in the 5’ UTR called an IRES (internal ribosome entry site), where the ribosome can bind to start translation.
The 3’ UTR contains the polyadenylation signal and the zip code (a signal for transporting the mRNA within the cell; not all mRNAs have one). It also contains binding sites for other proteins that may affect mRNA stability or localization, and binding sites for the miRNAs used in gene silencing.
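To make the regions concrete, the sketch below splits a transcript into 5’ UTR, CDS and 3’ UTR from known CDS coordinates and then looks for a polyadenylation signal in the 3’ UTR. The names `Transcript`, `split_transcript` and `poly_a_signal_offset` are made up for the example, and AATAAA is used as the signal because it is the classic consensus; plant signals are known to be more variable, so treat it as an assumption.

```python
from typing import NamedTuple, Optional

class Transcript(NamedTuple):
    five_utr: str
    cds: str
    three_utr: str

def split_transcript(mrna: str, cds_start: int, cds_end: int) -> Transcript:
    """Split an mRNA (DNA alphabet) using 0-based, end-exclusive CDS coordinates."""
    return Transcript(mrna[:cds_start], mrna[cds_start:cds_end], mrna[cds_end:])

def poly_a_signal_offset(three_utr: str, signal: str = "AATAAA") -> Optional[int]:
    """Offset of the first polyadenylation signal within the 3' UTR, or None if absent."""
    pos = three_utr.upper().find(signal)
    return pos if pos != -1 else None

# Toy transcript (made up): 6-base 5' UTR, 12-base CDS, 11-base 3' UTR.
t = split_transcript("GGCACC" + "ATGGCTTGGTAA" + "CCGAATAAAGC", cds_start=6, cds_end=18)
print(t.five_utr, t.cds, t.three_utr)     # GGCACC ATGGCTTGGTAA CCGAATAAAGC
print(poly_a_signal_offset(t.three_utr))  # 3
```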
Fundamental notions for genetic engineering
To clone a gene and have it expressed in another organism we need an expression vector.
An expression vector always needs a promoter, the coding sequence and a termination region.
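The three mandatory parts of an expression vector can be captured in a small data structure that refuses to be built when one of them is missing. This is a sketch for illustration: the class name `ExpressionCassette`, the ATG check and the placeholder sequences are assumptions, not real promoter or terminator sequences.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str
    cds: str
    terminator: str

    def __post_init__(self):
        # All three parts are required for expression.
        for name in ("promoter", "cds", "terminator"):
            if not getattr(self, name):
                raise ValueError(f"missing {name}")
        if not self.cds.upper().startswith("ATG"):
            raise ValueError("coding sequence should start with ATG")

    def sequence(self) -> str:
        """Concatenate the parts in 5' to 3' order."""
        return self.promoter + self.cds + self.terminator

# Placeholder sequences, made up for the example.
cassette = ExpressionCassette(promoter="TTGACA", cds="ATGGCTTAA", terminator="AATAAA")
print(cassette.sequence())  # TTGACAATGGCTTAAAATAAA
```

Keeping the parts as separate fields rather than one long string makes it easy to swap the promoter (for example, a tissue-specific one) without touching the coding sequence.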