Topic 4 Genomics (2017). Notes in English
Includes the notes for topic 4, with the relevant illustrations, for the Genomics course. REPETITIVE DNA. TRANSPOSONS: NON-CODING DNA.
Natalia Mingorance García
3r Biologia – UdG
TOPIC 4: REPETITIVE DNA. TRANSPOSONS: NON-CODING DNA
Non-coding DNA (approximately 98% of the human genome).
What lies outside the genes: the DNA sequences that do not code for proteins. As you can see here, almost all our genome is non-coding DNA. This is related to the C-value paradox. The C-value is the amount of DNA that we have in our cells. In the image we can see that our cells have the largest amount of non-coding DNA.
We can classify it into two types:
- Conserved non-coding sequences (CNS): some kind of selective pressure acts on these sequences. Here we have:
  o Introns: nuclear protein-coding; nuclear and archaeal transfer RNA (splicing); group I, II and III introns
  o Cis- and trans-regulatory elements. Promoters
  o 3’ and 5’ UTRs
  o UCR (200 bp)
  o Pseudogenes
  o Telomeres
- Repeat elements (not CNSs):
  o Tandem repeats (one element repeated after the other, the same element repeated): satellite DNA, minisatellites, microsatellites
  o Interspersed repeats. Transposable elements (repeated elements, but not one after the other; a copy can be located in another chromosome): SINEs, short interspersed elements; LINEs, long …
Repeat sequences
- Variable number of tandem repeats (VNTR)
- Simple sequence repeats (SSR). Microsatellites
- Slipped strand mispairing
One kind of tandem repeat is called satellite and it is one of the main sources of variation.
VNTR (variable number of tandem repeats): variation in the number of tandem repeats. It is extremely easy to add or delete these repetitions in DNA sequences. There is usually no selection pressure, or the selection pressure is weak.
Based on the length of the repetition we have these 3 types of tandem repeats:
- Satellite DNA: usually located in the centromere. The length of the repeat unit is 171 base pairs, but the complete array of repetitions can extend up to 1 Mb.
- Minisatellites: one of the minisatellites is in the telomere. The length of each repeat unit (the motif) is 6–25 bp, and the complete length of all of them together (the tandem repeat, the complete set of repetitions) is about 20 kb.
- Microsatellites: very short sequences, also repeated in tandem. Their function is unknown and probably there is none. They mutate, but not in the DNA sequence itself: the mutation is in the number of tandem repeats. They are clearly the smallest ones and they mutate most easily, so they are among the most variable sequences in our genome. Based on these, CODIS is an agreed standard for the identification of people: a set of 13-15 microsatellites located in the human genome (the same set of microsatellites is always compared).
They can identify individual persons because there is extreme variation here and each of us has a different combination of repeat numbers at these sequences (genetic variation of microsatellites).
The variability of the microsatellites is generated mainly not in the sequence but in the number of repetitions. It comes from an error in DNA replication.
With two strands of DNA that contain repetitions, one strand can slip during replication (in the figure, the T is skipped): one of the strands gives the correct copy and the other one gives an extra repetition. This is slipped strand mispairing.
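The idea that microsatellite alleles differ in the number of repeats, not in the motif, can be illustrated with a minimal sketch. The locus sequence, the GATA motif and both function names are hypothetical, chosen only for illustration; the slippage step simply adds one copy of the motif and is not a model of the real polymerase.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


def add_repeat_unit(sequence: str, motif: str) -> str:
    """Mimic one slippage event: insert an extra copy of `motif`
    next to its first occurrence."""
    position = sequence.find(motif)
    if position == -1:
        return sequence  # motif absent: nothing to slip on
    return sequence[:position] + motif + sequence[position:]


# Hypothetical locus: flanking bases around five copies of GATA.
parent_allele = "GGC" + "GATA" * 5 + "TTC"
child_allele = add_repeat_unit(parent_allele, "GATA")

print(longest_tandem_run(parent_allele, "GATA"))  # 5
print(longest_tandem_run(child_allele, "GATA"))   # 6
```

Comparing the repeat counts at a fixed set of loci is, in spirit, what an identification profile does: the motif is the same in everybody, only the counts change.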
Transposable elements or interspersed repeats
We have two main kinds of transposons:
- Class I: transposons based on RNA. They are retrotransposons.
- Class II: DNA transposons.
Together, transposons make up almost half of our genome and they are clearly related to viruses. One of the differences between the two classes is the way they spread along the genome.
Class I transposons use a copy-and-paste mechanism called replicative transposition. Class II transposons use a cut-and-paste mechanism called conservative transposition.
Class I elements, or retrotransposons, go through an mRNA intermediate (that is why we call them RNA transposons) and the original copy is maintained. In DNA transposons, instead, the DNA sequence moves to another position and nothing is left behind: the original sequence is cut out and placed in another position.
There is a clear relation with genome size: replicative transposition increases the genome size in class I, whereas in class II the genome size is maintained.
Outside the transposon there is a sequence called the flanking direct repeat, which is not part of the transposable element, and inside the transposon there is a terminal inverted repeat (in both types of transposon).
When a transposon moves to a new position, it generates this flanking direct repeat in the adjacent sequences. The two types of transposons do the same. So when the transposon moves out of this position, the flanking direct repeats are left behind (a footprint showing that a transposon was located there before). Class II therefore increases the genome size a little, because it leaves behind these newly generated flanking repeats.
Class I increases the genome size a lot because it uses a copy-and-paste mechanism (like a virus). Class II is cut and paste and, in principle, should not increase the genome size, because the element is cut out and put in a different position.
A new transposon is placed in the genome and generates these 2 sequences (the flanking direct repeats). The transposon sequence can move again; the flanking direct repeats stay.
The difference is that when the transposon moves to another place, the original copy is deleted in class II, but in class I the original sequence is maintained (both the FDR and the transposon). So there is only a slight genome size increase with class II transposons.
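The effect of the two mechanisms on genome size can be checked with a small sketch. It treats the genome as a plain string; the 3-base flanking direct repeat, the element sequence and the function names are assumptions made only for illustration.

```python
TSD_LENGTH = 3  # assumed length of the flanking direct repeat


def insert_element(genome: str, element: str, position: int) -> str:
    """Insert `element` at `position`, duplicating the target bases on both sides."""
    target = genome[position:position + TSD_LENGTH]
    # genome[position:] already starts with `target`, so the element
    # ends up flanked by two copies of it (the flanking direct repeats).
    return genome[:position] + target + element + genome[position:]


def copy_and_paste(genome: str, element: str, position: int) -> str:
    """Class I (replicative): the original copy stays, a new copy is inserted."""
    return insert_element(genome, element, position)


def cut_and_paste(genome: str, element: str, position: int) -> str:
    """Class II (conservative): excise the element, leaving its flanking
    repeats behind as a footprint, then insert it elsewhere."""
    start = genome.index(element)
    excised = genome[:start] + genome[start + len(element):]
    return insert_element(excised, element, position)


element = "ttagggctaa"  # hypothetical transposon, lowercase so it stands out
donor = insert_element("ACGTACGGATCCA", element, 4)

class_one = copy_and_paste(donor, element, 20)
class_two = cut_and_paste(donor, element, 12)

print(len(donor), len(class_one), len(class_two))  # 26 39 29
```

Under these assumptions the class I move adds the whole element plus one new flanking repeat, while the class II move adds only the new flanking repeat, which matches the "large" versus "slight" increase described above.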
Both classes of transposons share the same structure. Then we have another classification for transposons:
- Autonomous transposons: transposons that are completely functional; they have all the enzymes for the transposition mechanism.
- Non-autonomous transposons: transposons whose coding sequence is mutated or truncated, so they do not synthesize the enzymes for transposition. They have the signals and all the other elements needed for transposition, but they cannot code for a full set of enzymes, so they change position by using the enzymes produced by autonomous transposons.
The genetic material of class II is based on DNA and its elements can be autonomous or non-autonomous. The genetic material of class I is based on RNA and its elements can also be autonomous or non-autonomous. Those carrying the complete coding sequences for all the enzymes needed for transposition are the autonomous transposons.
All of them can change position, either by themselves or because they borrow the enzymes from autonomous elements.
Retrotransposons (Class I) (transposons based on RNA, with a copy-and-paste or replicative mechanism of transposition, associated with large genome size). We have two main types:
- LTR retrotransposons: these have long terminal repeats; the LTR sequence is long, up to 5 kb. Examples: Ty1-copia-like (Pseudoviridae), Ty3-gypsy-like (Metaviridae) and Pao-BEL-like.
- Non-LTR retrotransposons: here the terminal repeats are short. There are two different kinds:
  o SINEs: non-LTR transposons that are short interspersed elements (Alu, MIR, MIR3; <500 bases)
  o LINEs: non-LTR transposons that are long interspersed elements (LINE-1, LINE-2, LINE-3; about 6,000 bases)
They both have short terminal repeat sequences because they are non-LTR transposons. So the difference between LTR and non-LTR is the size of the terminal repeat sequence.
Transposable elements – Structure
Class II: outside the transposon, on each side, we have the direct repeat element, which is produced after transposition. This class of transposons has a single open reading frame, which codes for the transposase (the enzyme that carries out transposition).
Class I: we have LTR retrotransposons and non-LTR retrotransposons. Among the latter we have SINEs and LINEs. We also have autonomous and non-autonomous elements.
Autonomous and functional retrotransposon: HERVs have long terminal repeat sequences flanking the retrotransposon. Outside the retrotransposon we have the target site duplication (TSD), which is produced after transposition and plays the same role as the direct repeat element. Inside the transposon we have only one reading frame, which codes for different enzymes. It is very similar to a virus: we have viral DNA in our genome. It encodes Gag (the group-specific antigen), a protease (Prt), a polymerase (Pol), an endonuclease, and the proteins for the envelope of the virus capsid. Some of the sequences are shared between different genes.
Non-LTR retrotransposons (LINEs). The autonomous ones are usually called LINEs. The difference here is that we have two open reading frames; they do not have long terminal repeats, but they have a 5’ UTR and a 3’ UTR, before and after the open reading frames. Open reading frame 2 encodes the endonuclease and the reverse transcriptase. A very important structure of these LINEs is the polyA tail, which is a feature of eukaryotic mRNA. They also have a TSD, which is generated after transposition and lies outside the element.
Then we have the non-autonomous transposons (SINEs): the element is extremely short, about 300 base pairs, and the most frequent one in the human genome is the Alu element. It has a polyA tail, which is an important feature for explaining how the transposon works; it is an important fact that non-LTR elements have a polyA tail. Being non-autonomous, they do not encode the full set of enzymes for transposition.
LTR transposons are very similar to a virus: they have all the structures and all the different proteins (polymerases and so on) needed to function much like a virus.
Virus-like life cycle: we have an LTR element in our genome; its mRNA, covering all the enzymes included in the transposon, is transcribed by the cell’s transcription machinery and translated by the cell’s translation machinery, and then we have all the proteins (Gag, capsid…). In the end they function like a virus.
With the formation of the capsid (VLP stands for virus-like particle), the mRNA is reverse transcribed to cDNA. Then this cDNA is integrated in a new region of the genome (the classic life cycle of a virus).
The mechanism of transposition of class II (DNA) transposons is quite different. Remember that class II transposition is cut and paste, and that the element has a single open reading frame coding for only one enzyme, the transposase. In this case there is no RNA intermediate, only DNA. The transposon, again using the cellular machinery of transcription and translation, generates the transposase. The transposase binds to the two ends of the transposon, the 5’ and the 3’ end, and cuts the transposon out by a mechanism similar to splicing. This transposon can then be inserted in a new location in the genome. When it is inserted, it generates the flanking direct repeats: the sequence is cut at one site and placed in a new location of the genome.
DNA transposons are very important in the history of genetics because they were the first transposons discovered.
Transposons of any kind, class I or class II, were first discovered in the form of DNA transposons. The discovery was made by Barbara McClintock, who won the Nobel Prize for it. She discovered transposons in the genetics of grain color in maize (Zea mays). The grains have two colors: white and black. The color of the maize is due to a DNA transposon. Here we have three different structures or sequences in the genome:
- Ac: a sequence in the genome called activator
- Ds: a sequence in the genome called dissociator (a DNA transposon)
- W: a gene that codes for a color
In the first case there is no activator (no Ac element); we have the transposon and the gene. When there is no activator, nothing happens: we have the gene that codes for the clear color, and we obtain the white phenotype.
But some cells can have Ac in their genome. If the activator is present, it activates the transposition of the Ds transposon, and Ds is inserted just in front of the W gene. When Ds is inserted there it produces a break in the chromosome and this genomic region is lost, so both the transposon and the W gene are lost and we obtain the black phenotype.
A third possibility is that the activator is present and the Ds transposon is inserted in the W gene. The W gene is no longer functional because it has the transposon inside it, and we obtain the dark phenotype.
And then there is a final possibility: if the activator is still functional it can move the Ds transposon out of the W gene. The W gene is then functional again because there is nothing inside it, and we are back to the white phenotype.
Transposons are really abundant in our genome: almost 45% of it is transposon sequence, with two main kinds, SINEs (Alu) and LINEs (LINE-1). The non-autonomous Alu elements (more than 10%) are very important in our evolution. LINEs, which are non-LTR transposons, make up more than 17%, and the least abundant are DNA transposons (3-4% of our genome).
A. Retrotransposition in cis: this involves an autonomous LINE element. If everything works correctly we have a complete retrotransposition, but there can be mistakes, and the LINE element can be inserted in a new region of the genome only partially or with some mutations. So retrotransposition in cis can produce an autonomous LINE or a non-autonomous LINE element.
B. The second is retrotransposition in trans. We have some non-autonomous elements in our genome, and to move to another position they have to use the machinery of an autonomous element: an autonomous element can move a non-autonomous element. But autonomous elements can do more than move non-autonomous elements; they can also produce a processed pseudogene. Using the machinery of an autonomous element, the messenger RNA of a gene can be inserted in a new position, producing a processed pseudogene: mRNA (gene) → reverse transcriptase → cDNA (copy DNA) → inserted in a new position (no introns).
We have an autonomous LINE element, and inside it we have mainly the reverse transcriptase. The function of this enzyme is to copy RNA into cDNA. Then we have a gene, any gene of our genome, with exons and introns. This gene codes for an mRNA with exons only. This mRNA can be copied back to cDNA by the reverse transcriptase. The structure of the cDNA is exactly like the mRNA, but it is a DNA sequence. It can be inserted in a new position in the genome, giving a processed pseudogene (which can be functional).
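The path from gene to processed pseudogene can be followed step by step in a short sketch. The gene sequence, the exon coordinates and the function names are hypothetical, and the reverse transcription is simplified to a U-to-T replacement on the same strand (no complementary strand, no polyA tail).

```python
def transcribe_and_splice(gene: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon intervals (start, end) and write the result as RNA."""
    return "".join(gene[start:end] for start, end in exons).replace("T", "U")


def reverse_transcribe(mrna: str) -> str:
    """Copy the mRNA back to DNA (simplified: same strand, U -> T)."""
    return mrna.replace("U", "T")


# Hypothetical gene: two exons (uppercase) separated by one intron (lowercase).
gene = "ATGGCC" + "gtaagtttcag" + "GATTAA"
exons = [(0, 6), (17, 23)]

mrna = transcribe_and_splice(gene, exons)
cdna = reverse_transcribe(mrna)

# Insert the cDNA at an arbitrary new position of a toy genome.
genome = "CCCC" + gene + "GGGG"
insertion_site = len(genome) - 2
genome_with_pseudogene = genome[:insertion_site] + cdna + genome[insertion_site:]

print(mrna)  # AUGGCCGAUUAA
print(cdna)  # ATGGCCGATTAA
```

The inserted copy carries the exons only, which is why a processed pseudogene can be recognised by its lack of introns when compared with the original gene.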
C. The retrotransposon can be inserted inside a gene, and this is very important because it can generate premature stop codons (*): when a retrotransposon is inserted into a gene it can change the reading frame and so generate premature stop codons, which certainly changes the functionality of the gene.
Possible functions of the transposons:
- A transposon can be moved completely (the entire sequence) or partially.
- Non-autonomous transposons can use the machinery of the autonomous transposons to move to another place.
- We can obtain processed pseudogenes moved to another place.
- A cis transposition can land inside a gene: inserted into an exon, it introduces premature stop codons (PSC). The asterisks (*) are new stop codons in the gene.
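How an insertion into an exon creates a premature stop codon can be seen with a toy coding sequence. The sequence, the 7-base inserted fragment and the function name are hypothetical; a real transposon is far longer, but the frameshift logic is the same.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(coding_sequence: str):
    """Return the index of the first in-frame stop codon, or None if there is none."""
    for start in range(0, len(coding_sequence) - 2, 3):
        if coding_sequence[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


# Hypothetical exon: ATG GCT GAA CGT AAA TTT GGG TAA
original = "ATGGCTGAACGTAAATTTGGGTAA"

# Hypothetical inserted fragment; 7 bases, so the reading frame shifts.
inserted_fragment = "GGTAGCT"
position = 6  # between the second and third codon
mutant = original[:position] + inserted_fragment + original[position:]

print(first_stop_codon(original))  # 7 (the normal stop codon)
print(first_stop_codon(mutant))    # 4 (premature stop codon)
```

In the mutant the new reading frame runs ATG GCT GGT AGC TGA, so translation would end three codons earlier than in the original sequence.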
D. Then we have the mechanism called 3’ transduction. Remember that LINEs usually have a polyA tail. LINEs are also among the most abundant transposons in the human genome, so we have a lot of LINEs with this polyA tail. But in some LINEs (not in all) this polyA is weak because it is short. When that happens, we have a weak polyadenylation signal and transcription continues through the sequences adjacent to the transposon.
The polyA tail works as a signal to stop transcription. In a normal or classical LINE, transcription stops when it reaches this long tail. But in some instances the stop signal is not strong enough because the polyA tail is short, so the signal to interrupt transcription is weak.
Then the transcript of the transposon includes some adjacent sequences located after the transposon, so the transposition includes the transposon itself and other sequences located after it. Usually they are exons of a gene. The transposon + exon can then be incorporated into a new gene, which gains a new exon. This mechanism is called exon shuffling. Genes can acquire exons through this mechanism of 3’ transduction.
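The read-through described for 3’ transduction can be sketched with strings. All sequences, the `signal_is_strong` flag and the function name are hypothetical; the sketch only shows which part of the locus ends up in the transcript, not the biochemistry of termination.

```python
LINE_ELEMENT = "ggccatttgcaataaa"  # hypothetical LINE ending in its own stop signal
FLANKING_EXON = "ATGCCGTTAGGC"      # hypothetical exon just downstream
DOWNSTREAM_SIGNAL = "aataaa"        # next stop signal after the exon


def transcribe(locus: str, line_length: int, signal_is_strong: bool) -> str:
    """Return the transcript that starts at the LINE.

    A strong signal stops transcription at the end of the LINE; a weak one
    lets it run on to the next downstream signal.
    """
    if signal_is_strong:
        return locus[:line_length]
    end = locus.find(DOWNSTREAM_SIGNAL, line_length)
    return locus if end == -1 else locus[:end + len(DOWNSTREAM_SIGNAL)]


locus = LINE_ELEMENT + FLANKING_EXON + DOWNSTREAM_SIGNAL + "cccc"

normal_transcript = transcribe(locus, len(LINE_ELEMENT), signal_is_strong=True)
read_through = transcribe(locus, len(LINE_ELEMENT), signal_is_strong=False)

# Retrotransposition of the read-through copy into a hypothetical target gene.
target_gene = "TTTTGGGG"
new_gene = target_gene[:4] + read_through + target_gene[4:]

print(FLANKING_EXON in normal_transcript)  # False
print(FLANKING_EXON in new_gene)           # True
```

Only the read-through transcript carries the flanking exon, so only that copy can deliver a new exon to the gene where it lands.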
E. Finally, there is another possibility: a transposon (usually a LINE) can change the orientation of a specific DNA sequence, producing an inversion or a gene rearrangement.
F. Lastly, one of the possible functions of LINEs: if there are gene transcription signals (LINEs have 5’ UTR sequences near the promoter), the promoter can affect the LINE and the two adjacent genes, so LINEs can promote the transcription of adjacent genes and have an impact on other genes.
(EXAM: C & D.)
Another function of the transposon is potential genomic rearrangement, changing the structure of the genome (not only the function but also the structure of the genome).
Repeat sequences: transposons are sequences repeated in our genome. In meiosis there is a very important genetic mechanism, recombination, which usually occurs between homologous sequences.
Repeat sequences, and obviously transposons, have a huge impact in producing recombination errors or non-homologous recombination.
Suppose we have a DNA sequence with two repeat sequences or two transposons, transposable element 1 and transposable element 2 (TE = transposable element), and the sequences of these two transposons are very similar because they are repeat sequences. In the homologous chromosome we also have these two transposons. What we expect in a normal recombination in meiosis is an exchange between TE1 and TE1. Because the two sequences are very similar, the recombination can instead take place between TE1 and TE2: a non-homologous recombination, or recombination error in meiosis. The result is that the exchange changes the DNA content of the recombinant chromosomes: one gains the sequence located between the two elements and the other loses it.
This is recombination between homologous chromosomes but not between homologous sequences: non-homologous recombination. It generates tandem repeats. This process is called inter-chromosomal non-homologous recombination. If it can occur between two repetitive elements located in two homologous chromosomes, it can also occur between two repetitive elements in the same chromosome. It is again a non-homologous recombination. Because the two sequences (1 and 2) are repeated sequences, they are similar. The recombination machinery of meiosis can be confused and produce a recombination between these two sequences located in the same chromosome. Then we have a process called intra-chromosomal non-homologous recombination.
It is always the same mechanism: we have repeated sequences located in the genome, these sequences can produce errors in meiotic recombination, and the outcomes can differ. In intra-chromosomal recombination between two inverted repeats, if we have several tandem repeats their number can increase, producing tandem repeat polymorphisms. The tandem repeats can increase in size.
All of them are non-homologous recombination errors of meiosis, which occur when generating germ cells.
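The difference between a normal crossover and a crossover between the wrong pair of repeats can be sketched by treating each chromosome as a list of labelled segments. The segment labels and the function name are hypothetical, and the break is placed right after the chosen segment for simplicity.

```python
def crossover(chromosome_a, chromosome_b, break_a, break_b):
    """Exchange the arms of two chromosomes after the segments at
    index `break_a` (first chromosome) and `break_b` (second chromosome)."""
    recombinant_1 = chromosome_a[:break_a + 1] + chromosome_b[break_b + 1:]
    recombinant_2 = chromosome_b[:break_b + 1] + chromosome_a[break_a + 1:]
    return recombinant_1, recombinant_2


# Two homologous chromosomes, each with two similar transposable elements.
homolog_1 = ["a", "TE1", "b", "TE2", "c"]
homolog_2 = ["a", "TE1", "b", "TE2", "c"]

# Normal recombination: TE1 pairs with TE1.
normal = crossover(homolog_1, homolog_2, 1, 1)

# Recombination error: TE1 of one homolog pairs with TE2 of the other.
unequal = crossover(homolog_1, homolog_2, 1, 3)

print(normal[0])   # ['a', 'TE1', 'b', 'TE2', 'c']
print(unequal[0])  # ['a', 'TE1', 'c']
print(unequal[1])  # ['a', 'TE1', 'b', 'TE2', 'b', 'TE2', 'c']
```

With the misaligned break, one product loses the segment between the two elements and the other carries it twice in tandem, which is the tandem repeat mentioned above.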
Transposons can disrupt genes and change our genome through genome rearrangements (functions (?) of transposons). Specific examples:
- L1 is the most abundant LINE element in our genome. In our genome there are only 100 active elements; most of our TEs are silenced. The active ones still have some capacities.
- An insertion of L1 into the factor VIII gene (blood coagulation) causes hemophilia: one transposon is inserted in that particular gene and the result is a hemophilic person.
- In some cancers we have an L1 inserted in the APC gene (a tumor suppressor gene); it is found in colon cancer cells.
- Alu is a SINE (non-autonomous), and when it is inserted in an intron of the NF1 gene it produces neurofibromatosis.
So transposons have some specific functions.
Cells have several mechanisms for silencing transposons, that is, for keeping them from being transcribed. The first one is mutation: if there are mutations in the protein-coding genes of the transposon, the transposon is silenced. There are also specific epigenetic mechanisms, like DNA methylation, and small interfering RNAs that prevent transposition (some of these siRNAs target the 5’ UTR of the transposons).
One very important function of transposons in our genome is related to the Alu element (a SINE, non-autonomous). Alu elements are found only in primate genomes: only in humans and related species, and in no other species. The first Alu appeared about 65 Myr ago; before that there was no Alu element in any species. The Alu element has a very important function in the evolution of these organisms, called “exonization”, which is the process of generating new exons.
Short Interspersed Elements (SINEs): Alu
- Non-autonomous retrotransposon
- Exonization: 62% of all new exons
- Primates (65 Myr)
The sequence of the Alu element is very similar to the sequence of the splicing signal; the Alu sequence and the human splicing signal share some common features. When the Alu sequence is placed between two existing exons, usually nothing happens: we have two exons and, in the intron, a new transposable element. But because its DNA sequence is extremely similar to the splicing signals, after a few mutations in the Alu sequence the splicing machinery gets confused and recognizes it as a new exon. This generates different alternative splicing isoforms or transcripts (isoform = transcript after the splicing mechanism).
So in these organisms Alu is very important for producing new exons, because the Alu sequence is very similar to the splicing signal. This is called exonization, and 62% of the newly generated exons are produced by this mechanism.
There is another example of a transposon function, covered in the assigned reading: the syncytin gene (Labtimes 01/2017 article).
Summary – Non-coding DNA
- Types of non-coding DNA
- Tandem and interspersed repeats
- Transposons. Classification
- Transposons. Structure
- Transposons. Mechanisms
- Transposons. Genomic rearrangements
- Transposons. Functions