Palindromic sequence
A palindromic sequence is a nucleic acid sequence in double-stranded DNA or RNA in which reading 5' (five-prime) to 3' (three-prime) on one strand gives the same sequence as reading 5' to 3' on the complementary strand with which it forms a double helix. This definition of palindrome thus depends on each strand being the reverse complement of the other.
The meaning of palindrome in the context of genetics is slightly different from the definition used for words and sentences. A double helix is formed by two paired strands of nucleotides that run in opposite directions in the 5'-to-3' sense, and the nucleotides always pair in the same way: adenine (A) with thymine (T) in DNA or with uracil (U) in RNA, and cytosine (C) with guanine (G). A single-stranded nucleotide sequence is therefore said to be a palindrome if it is equal to its reverse complement. For example, the DNA sequence ACCTAGGT is palindromic because its nucleotide-by-nucleotide complement is TGGATCCA, and reversing the order of the nucleotides in the complement gives the original sequence.
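The reverse-complement test described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names are hypothetical, and the input is taken to be an unambiguous DNA string over A, C, G and T.

```python
# Watson-Crick pairing for DNA; for RNA, A would pair with U instead of T.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the order of the bases."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """Return True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(reverse_complement("ACCTAGGT"))  # ACCTAGGT
print(is_palindromic("ACCTAGGT"))      # True
print(is_palindromic("ACCTAGGA"))      # False
```

Only sequences of even length can pass this test, because the middle base of an odd-length sequence would have to be its own complement.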
A palindromic nucleotide sequence can form a hairpin. Palindromic DNA motifs are found in most genomes, or sets of genetic instructions. The motifs arise from the order of the nucleotides that specify the proteins the cell is to produce. They have been studied in particular in bacterial chromosomes and in the so-called Bacterial Interspersed Mosaic Elements (BIMEs) scattered over them. A genome sequencing project found that many of the bases on the Y chromosome are arranged as palindromes. A palindrome structure allows the Y chromosome to repair itself by bending over at the middle if one side is damaged.
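As a rough illustration of how palindromic motifs might be located in a longer sequence, the sketch below slides a window along one strand and reports every stretch that equals its reverse complement. The function name, the default window lengths and the test sequence are assumptions chosen for the example, not a method taken from the studies mentioned above.

```python
# Translation table implementing A<->T and C<->G pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base and reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=4, max_len=8):
    """Return (start, subsequence) for each window equal to its reverse complement.

    Only even lengths are scanned (min_len is assumed to be even), since an
    odd-length sequence cannot equal its own reverse complement.
    """
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits


print(find_palindromes("TTGAATTCAA"))
# [(3, 'AATT'), (2, 'GAATTC'), (1, 'TGAATTCA')]
```

The nested hits in the output show that a long palindrome contains shorter palindromes centred on the same point.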
Palindromes also appear to be found frequently in proteins,[1][2] but their role in protein function is not clearly known. It has been suggested[3] that the existence of palindromes in peptides might be related to the prevalence of low-complexity regions in proteins, as palindromes are frequently associated with low-complexity sequences. Their prevalence might also be related to the propensity of these sequences to form alpha helices,[3] or to the formation of protein/protein complexes.[4]
Examples
Restriction enzyme sites
Palindromic sequences play an important role in molecular biology. Because DNA is double-stranded, a palindrome is determined by reading the base pairs, not just the bases on one strand. Many restriction endonucleases (restriction enzymes) recognize specific palindromic sequences and cut them. The restriction enzyme EcoRI recognizes the following palindromic sequence:
5'- G A A T T C -3'
3'- C T T A A G -5'
The top strand reads 5'-GAATTC-3', while the bottom strand reads 3'-CTTAAG-5'. If the double-stranded DNA is flipped over, the sequences are exactly the same (5'-GAATTC-3' and 3'-CTTAAG-5'). Here are more restriction enzymes and the palindromic sequences which they recognize:
Enzyme | Source | Recognition Sequence | Cut |
---|---|---|---|
EcoRI | Escherichia coli | 5'GAATTC / 3'CTTAAG | 5'---G AATTC---3' / 3'---CTTAA G---5' |
BamHI | Bacillus amyloliquefaciens | 5'GGATCC / 3'CCTAGG | 5'---G GATCC---3' / 3'---CCTAG G---5' |
TaqI | Thermus aquaticus | 5'TCGA / 3'AGCT | 5'---T CGA---3' / 3'---AGC T---5' |
AluI* | Arthrobacter luteus | 5'AGCT / 3'TCGA | 5'---AG CT---3' / 3'---TC GA---5' |

* = blunt ends
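The cutting behaviour tabulated above can be imitated with a small sketch. It assumes a linear molecule and models only the top strand; the `ENZYMES` mapping and the `digest` function are hypothetical names, and the offsets (number of bases to the left of the cut within the site) are read off the Cut column.

```python
# Recognition sequence and top-strand cut offset for the four enzymes in the table.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "TaqI": ("TCGA", 1),
    "AluI": ("AGCT", 2),  # cuts in the middle of the site, leaving blunt ends
}


def digest(seq, enzyme):
    """Split the top strand of a linear DNA sequence at every recognition site."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    fragments = []
    last_cut = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[last_cut:cut])
        last_cut = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[last_cut:])  # remainder after the final cut
    return fragments


print(digest("AAGAATTCTTGGATCCAA", "EcoRI"))  # ['AAG', 'AATTCTTGGATCCAA']
print(digest("AAGAATTCTTGGATCCAA", "BamHI"))  # ['AAGAATTCTTG', 'GATCCAA']
print(digest("TTAGCTTT", "AluI"))             # ['TTAG', 'CTTT']
```

Because each site is palindromic, the bottom strand is cut at the mirror-image position. An offset away from the centre of the site therefore leaves single-stranded overhangs (sticky ends), whereas a central cut leaves blunt ends.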
Methylation sites
Palindromic sequences may also contain methylation sites, where a methyl group can be attached to the sequence. Methylation can render a resistance gene inactive, an effect that has been described as insertional inactivation or insertional mutagenesis (terms that more usually denote disruption of a gene by inserted DNA). For example, in the plasmid pBR322, methylation at the tetracycline resistance gene leaves the host cell sensitive to tetracycline: once the gene has been methylated, a cell carrying the plasmid does not survive exposure to the antibiotic.
Palindromic nucleotides in T cell receptors
Diversity of T cell receptor (TCR) genes is generated by nucleotide insertions upon rearrangement from their germline-encoded V, D and J segments. Nucleotide insertions at V-D and D-J junctions are random, but a small subset of these insertions is exceptional, in that one to three base pairs inversely repeat the sequence of the germline DNA. These short complementary palindromic sequences are called P nucleotides.[5]
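The inverse repeat that defines P nucleotides can be illustrated with a short sketch. It assumes the insertion sits at the 3' end of a germline coding segment and equals the reverse complement of the last one to three germline bases. The function name and the example segment are hypothetical and are not taken from the cited study.

```python
# Translation table implementing A<->T and C<->G pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base and reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def add_p_nucleotides(coding_end, n):
    """Append n P nucleotides to the 3' end of a germline coding segment.

    The appended bases are the reverse complement of the last n germline
    bases, so the junction reads as a short palindrome.
    """
    if not 1 <= n <= 3:
        raise ValueError("P insertions of one to three bases are assumed here")
    germline = coding_end.upper()
    return germline + reverse_complement(germline[-n:])


extended = add_p_nucleotides("TGTGCCAGCAGT", 2)  # illustrative segment end
print(extended)       # TGTGCCAGCAGTAC
print(extended[-4:])  # GTAC
```

The last four bases of the extended segment, GTAC, equal their own reverse complement, which is the palindromic signature by which such insertions can be recognized.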
References
1. Ohno S (1990). "Intrinsic evolution of proteins. The role of peptidic palindromes". Riv. Biol. 83 (2-3): 287–91, 405–10. PMID 2128128.
2. Giel-Pietraszuk M, Hoffmann M, Dolecka S, Rychlewski J, Barciszewski J (February 2003). "Palindromes in proteins". J. Protein Chem. 22 (2): 109–13. doi:10.1023/A:1023454111924. PMID 12760415.
3. Sheari A, Kargar M, Katanforoush A, et al. (2008). "A tale of two symmetrical tails: structural and functional characteristics of palindromes in proteins". BMC Bioinformatics 9: 274. doi:10.1186/1471-2105-9-274. PMC 2474621. PMID 18547401.
4. Pinotsis N, Wilmanns M (October 2008). "Protein assemblies with palindromic structure motifs". Cell. Mol. Life Sci. 65 (19): 2953–6. doi:10.1007/s00018-008-8265-1. PMID 18791850.
5. Srivastava SK, Robins HS (2012). "Palindromic nucleotide analysis in human T cell receptor rearrangements". PLoS ONE 7 (12): e52250. PMID 23284955.