Biopolym. Cell. 2008; 24(5):368-376.
Structure and Function of Biopolymers
Comparative analysis of nucleotide composition skew in exons and introns of human genes
Duplij D. R. (1), Kalashnikov V. V. (2), Chaschyn O. I. (1), Tolstorukov M. E. (3)
  1. Institute of Molecular Biology and Genetics, NAS of Ukraine
    150, Akademika Zabolotnoho Str., Kyiv, Ukraine, 03680
  2. Kharkiv National University of Economics
    9a, Lenin Avenue, Kharkov, Ukraine, 61166
  3. Kharkiv National University
    4, Svobody Ave., Kharkiv, Ukraine, 61077


Transcription-related bias in base-pair composition was analyzed in different functional regions of human genes. Four types of bias (skew indexes) were considered: AT-skew, GC-skew, purine-skew, and keto-skew. Our results show substantial differences between the base-pair composition of exons and that of introns. On average, exons follow the rules A > T; G ≥ C; A + G > C + T; A + C ≥ G + T, whereas introns follow the rules A < T; C ≤ G; A + G ≤ C + T; A + C < G + T. The indexes reach their highest values in internal introns and are close to zero in nontranscribed regions of the genome. The bias is also markedly stronger in housekeeping genes than in the analyzed groups of tissue-specific genes. These results suggest that detailed knowledge of base-pair compositional bias may further our understanding of overall gene organization.
Keywords: exons, introns, nucleotides, skew
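
As an illustration of the four skew indexes named in the abstract, the following is a minimal sketch of how they might be computed for a single sequence. The abstract does not give the formulas, so the normalized-difference form (X - Y) / (X + Y), the sign conventions (in particular, keto-skew taken as G + T minus A + C), and the function name skew_indexes are assumptions made for illustration, not the authors' stated definitions.

```python
from collections import Counter


def skew_indexes(seq):
    """Return four compositional skew indexes for a DNA sequence.

    Assumed definitions (not taken from the abstract):
      AT-skew     = (A - T) / (A + T)
      GC-skew     = (G - C) / (G + C)
      purine-skew = ((A + G) - (C + T)) / (A + C + G + T)
      keto-skew   = ((G + T) - (A + C)) / (A + C + G + T)
    Characters other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")

    def ratio(x, y):
        # Normalized difference; defined as 0.0 when both counts are zero.
        total = x + y
        return (x - y) / total if total else 0.0

    return {
        "AT": ratio(a, t),
        "GC": ratio(g, c),
        "purine": ratio(a + g, c + t),
        "keto": ratio(g + t, a + c),
    }


if __name__ == "__main__":
    # Toy sequence, not real data.
    print(skew_indexes("AAAT"))
```

Under these assumed conventions, a positive AT-skew corresponds to A > T and a negative keto-skew to A + C > G + T, so the exon and intron rules listed in the abstract translate into the signs of the four returned values.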

