Intrastrain and interstrain genetic variation within a paralogous gene family in Chlamydia pneumoniae
© Viratyosin et al; licensee BioMed Central Ltd. 2002
Received: 27 August 2002
Accepted: 2 December 2002
Published: 2 December 2002
Chlamydia pneumoniae causes human respiratory diseases and has recently been associated with atherosclerosis. Analysis of the three recently published C. pneumoniae genomes has led to the identification of a new gene family (the Cpn 1054 family) that consists of 11 predicted genes and gene fragments. Each member encodes a polypeptide with a hydrophobic domain characteristic of proteins localized to the inclusion membrane.
Comparative analysis of this gene family within the published genome sequences showed that multiple levels of genetic variation are present within this single collection of paralogous genes. Frameshift mutations are found that result in both truncated gene products and pseudogenes that vary among isolates. Several genes in this family contain polycytosine (polyC) tracts either upstream or within the terminal 5' end of the predicted coding sequence. The length of the polyC stretch varies between paralogous genes and within single genes in the three genomes. Sequence analysis of genomic DNA from a collection of 12 C. pneumoniae clinical isolates was used to determine the extent of the variation in the Cpn 1054 gene family.
These studies demonstrate that sequence variability is present both among strains and within strains at several of the loci. In particular, changes in the length of the polyC tract associated with the different Cpn 1054 gene family members are common within each tested C. pneumoniae isolate. The variability identified within this newly described gene family may modulate either phase or antigenic variation and subsequent physiologic diversity within a C. pneumoniae population.
Chlamydia pneumoniae is an obligate intracellular bacterium that infects and causes disease in the respiratory tract [1, 2] and has recently been associated with heart disease. Approximately 10% of pneumonia cases and 5% of bronchitis and sinusitis cases in the U.S. are attributed to C. pneumoniae infection. The pathogenic mechanisms used by C. pneumoniae to replicate and disseminate within hosts remain unclear.
Little is known about strain-specific determinants of C. pneumoniae. Isolates of C. pneumoniae are virtually indistinguishable by 16S rRNA, restriction fragment length polymorphism, and amplified fragment length polymorphism analysis. Unlike C. trachomatis, only a single serotype or genotype of C. pneumoniae has been identified by any of these methods.
Recently, three genomes of C. pneumoniae have been completed and published: CWL029 (http://chlamydia-www.berkeley.edu:4231/), AR39 (http://www.tigr.org/), and J138 (http://w3.grt.kyushu-u.ac.jp/J138/). Comparative analysis suggests that overall genomic organization and gene order are highly conserved among the C. pneumoniae genomes [8, 9]. Given this conservation, the study of individual regions of sequence variation will provide insight into strain-specific virulence, genetic diversity, and adaptive responses within and among C. pneumoniae populations.
Genomic analyses have recently revealed a large gene family of 21 polymorphic outer membrane proteins (Pmps) with predicted outer membrane localization in C. pneumoniae [7, 10, 11]. The function of this gene family in chlamydial growth and development remains unknown. Several studies have examined genetic variation and strain differentiation of Pmp proteins, which may be important for genetic flexibility and adaptive response. Recently, it has been reported that interstrain and intrastrain variation in gene expression and protein production of pmpG 6 and pmpG 10 is modulated by deletion of tandem repeats in pmpG 6 [9, 11] and by variation in the length of the polyguanosine tract in pmpG 10 [12, 13]. This evidence suggests that variation may be an important requisite for the function of this gene family in the biology of Chlamydiae.
Examination of the C. pneumoniae genome sequences by Daugaard et al. demonstrated that a unique and related family of genes is found within the C. pneumoniae genome, and that variation among strains leads to differences within several members of the gene family. The products of these paralogous genes contain a unique bi-lobed hydrophobic domain, which is a predictive marker for localization to the inclusion membrane. In this study, we further characterize this family by examining sequence variation in the family members both within and among different C. pneumoniae isolates.
Bioinformatic analysis of the Cpn 1054 gene family
Percent DNA sequence similarity of selected paralogous genes in the Cpn 1054 gene family.
The conserved hydrophobic domain of the Cpn 1054 gene family
Homopolymeric cytosine (poly C) tract and the variations of the length of poly C tracts in the Cpn 1054 gene family
Two approaches were used to demonstrate that the observed variation in length of the polyC tract was not a function of PCR errors during the analysis. First, two different thermostable polymerases (Taq and Pwo polymerase) were used to generate the primary amplification products for cloning and subsequent sequence analysis. Amplifications with each enzyme resulted in clones with variation in the length of the polyC tract (Figure 6C). A second approach for examination of the possibility of PCR errors was to reamplify the polyC tract from a single plasmid template, and examine the sequence of the polyC tract directly in these amplification products. No variation of the length of the polyC tract of Cpn 043, 1054, and 1055 was identified in these PCR products (not shown). These results support the conclusion that the variability in the length of the polyC tracts is not an artifact of the amplification process, and thus the observed variability reflects differences at these loci within individual isolates.
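The clone-by-clone comparison described above amounts to measuring the polyC tract in each sequenced clone and tallying the lengths. The following is a minimal sketch of that bookkeeping, not the authors' actual procedure; the function names, the minimum tract length of 5, and the example reads are all hypothetical.

```python
import re
from collections import Counter

def longest_polyc(seq: str, min_len: int = 5) -> int:
    """Length of the longest run of C in seq; 0 if no run reaches min_len.

    min_len is an assumed cutoff for what counts as a tract.
    """
    runs = re.findall(r"C+", seq.upper())
    longest = max((len(run) for run in runs), default=0)
    return longest if longest >= min_len else 0

def tally_tract_lengths(clone_seqs):
    """Count how many clones carry each polyC tract length."""
    return Counter(longest_polyc(seq) for seq in clone_seqs)

# Hypothetical clone reads from one isolate (illustrative only, not real data).
clones = [
    "ATGAAA" + "C" * 11 + "TTGGA",
    "ATGAAA" + "C" * 12 + "TTGGA",
    "ATGAAA" + "C" * 11 + "TTGGA",
    "ATGAAA" + "C" * 13 + "TTGGA",
]
tally = tally_tract_lengths(clones)
for length, count in sorted(tally.items()):
    print(f"polyC length {length}: {count} clone(s)")
```

More than one length in the tally for a single isolate would correspond to the within-isolate variability reported here; a single length across all reamplified products from one plasmid would correspond to the control result.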
Allelic differences within Cpn 010–010.1
Relatively little is known about the molecular pathogenesis, genetic diversity and adaptive strategy of C. pneumoniae. Although the genomic organization of the sequenced strains is very similar (over 99.9% identical), there are regions of variation within each isolate. In the present study, we have identified a paralogous gene family within C. pneumoniae, designated the Cpn 1054 gene family. This family consists of eleven paralogous loci, with single repeat elements consisting of single ORFs or ORF pairs. The identity of the predicted polypeptide sequences shared among family members ranges from 20–99%. It is likely that the diversity of these genes arose through gene duplication and subsequent diversification. Certain duplications appear to be relatively recent, as at least two of the repeated loci, Cpn 010/10.1 and Cpn 1054, are nearly identical. Analysis of the three genomes also demonstrates that apparent gene conversion has occurred between 10/10.1 and 1054 in strain AR39, and that an intact 1054 ORF is found within each sequenced genome. However, its location varies between the two loci. The redundant nature of the Cpn 1054 family members is somewhat unusual given the generally reductive evolutionary strategy of the chlamydiae. There is no evidence that the Cpn 1054 gene family is found outside of C. pneumoniae, and thus the members of the family may be important in the unique biological traits of this species.
Gene duplication and subsequent genetic drift are the likely means by which variation is manifested between members of the Cpn 1054 gene family. Variation is also observed within individual family members, both between strains and, as shown in this report, within individual isolates. Several gene family members, including Cpn 008, 010, 043, 1054, and 1055, contain homopolymeric cytosine repeats either upstream or at the predicted 5' end of the coding region. In C. pneumoniae, variation in short homopolymeric nucleotide repeats was first identified in the pmp family. Comparative genomic analysis and cloning expression showed that the length of the polyG tract of pmpG 10 varies between strains and within an isolate [12, 13]. Furthermore, variation in the length of the polyG tract has been shown to play a role in the differential expression of PmpG 10. Variability in short nucleotide repeats generated via slipped strand mispairing is a key element in the generation of phenotypic diversity within many pathogenic microorganisms. Further investigation will be required to determine whether the expression of members of the Cpn 1054 gene family is affected by the observed variability in the length of the polyC tract.
Although the proteins in the Cpn 1054 gene family are classified as candidate inclusion membrane proteins, their subcellular locations and roles in infection and disease remain to be identified. It is also not yet known whether the Cpn 1054 gene family is expressed individually or coordinately, or to what extent each gene is expressed during the course of an infection. However, the variation, both within and between strains, is a potential requisite for this gene family that may contribute to the unique biology of C. pneumoniae.
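To illustrate why the length of a homopolymeric tract can matter when the tract lies inside a coding region, the sketch below counts the codons read before the first in-frame stop for toy sequences that differ only in the number of cytosines. It illustrates the general slipped strand mechanism only: the sequences are hypothetical, and this study does not establish that tract length changes expression of any Cpn 1054 family member (several tracts lie upstream of the coding sequence).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(cds: str):
    """Number of codons read from position 0 before the first in-frame stop.

    Returns None if no in-frame stop codon is found.
    """
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def toy_gene(n_c: int) -> str:
    """Hypothetical coding sequence with a polyC tract of n_c residues."""
    return "ATG" + "C" * n_c + "GCTGATAGCAAATAA"

print(codons_before_stop(toy_gene(9)))   # 8: frame reads through to the final TAA
print(codons_before_stop(toy_gene(10)))  # 5: one extra C exposes an early TGA
print(codons_before_stop(toy_gene(12)))  # 9: a multiple-of-three change keeps the frame
```

In this toy case a single-base gain truncates the reading frame, while a three-base gain preserves it, which is the kind of on/off switching usually meant by phase variation.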
The C. pneumoniae genome contains a gene family (the Cpn 1054 gene family) consisting of 18 different genes in 11 paralogous loci. Variation is observed both within and among isolates. This variation may be useful for the biotyping of C. pneumoniae clinical isolates, and may be important in phenotypic diversity within the species.
Bacterial strains, plasmids and C. pneumoniae genomes
All experiments were conducted using a collection of independent clinical isolates from a strain library (Table 1). Genomic DNA of C. pneumoniae was isolated from purified EBs using the methods of Campbell et al. Extracted DNA was stored at -20°C. Genomic analyses were conducted using sequences from the genome websites listed in the introduction. Open reading frames, nucleotide positions and contig numbering were annotated based upon the C. pneumoniae CWL029 genome.
Bioinformatics analysis of the Cpn 1054 gene family of C. pneumoniae genomes
DNA and polypeptide sequences were aligned using CLUSTALW analysis (MacVector™ 6.0; Oxford Molecular, Genetics Computer Group, Inc. Madison, WI). Each gene and each predicted gene product were also subjected to gapped BLASTX and BLASTN searches, respectively http://0-www.ncbi.nlm.nih.gov.brum.beds.ac.uk/BLAST/. The similarity between two different DNA sequences was determined using the BLAST 2 sequence program from http://0-www.ncbi.nlm.nih.gov.brum.beds.ac.uk/blast/bl2seq/bl2.html. Hydrophilicity profiles of the gene product of each Cpn 1054 family member were determined using hydropathy plot analysis (MacVector™).
Phylogenetic analyses of both DNA and amino acid sequences were performed using PAUP*. In this study Cpn 0186 (IncA) and four additional candidate inclusion membrane proteins, Cpn 0284, Cpn 0285, Cpn 0829 and Cpn 0830, were selected as members of an outgroup for analysis. Phylogenetic trees were inferred by neighbor-joining to estimate evolutionary distances. Bootstrap values were obtained from a consensus of 100 neighbor-joining trees.
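A hydropathy plot of the kind mentioned above can be sketched as a sliding-window average over a per-residue scale. The example below assumes the Kyte-Doolittle scale (which appears in the reference list) and a window of 19 residues; both the window size and the function name are assumptions, and the MacVector settings actually used are not stated here.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein: str, window: int = 19):
    """Mean hydropathy for each window of the given size along the protein.

    Returns an empty list if the protein is shorter than the window.
    Raises KeyError on residues outside the 20 standard amino acids.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        return []
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
peptide = "KKDE" + "LLIVALLIVALLIVAFLLI" + "DEKK"
profile = hydropathy_profile(peptide, window=19)
print(max(profile))
```

Windows whose mean stays strongly positive over a long stretch mark candidate hydrophobic domains, such as the bi-lobed domain discussed for this family.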
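The bootstrap step can be pictured as resampling alignment columns with replacement and recomputing pairwise distances for each replicate. The sketch below shows only that resampling and a simple p-distance; it does not build neighbor-joining trees or a consensus, and it is not a reimplementation of PAUP*. The alignment, sequence names, and random seed are hypothetical.

```python
import random

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_alignment(aln: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the same shape."""
    names = list(aln)
    length = len(aln[names[0]])
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(aln[name][c] for c in cols) for name in names}

def distance_matrix(aln: dict) -> dict:
    """Pairwise p-distances keyed by (name_a, name_b)."""
    names = list(aln)
    return {
        (a, b): p_distance(aln[a], aln[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical three-sequence alignment (illustrative only).
aln = {
    "geneA": "ATGCCCCCTA",
    "geneB": "ATGCCCCTTA",
    "geneC": "ATGACCGTTA",
}
rng = random.Random(0)  # fixed seed so the replicates are repeatable
replicates = [distance_matrix(bootstrap_alignment(aln, rng)) for _ in range(100)]
print(distance_matrix(aln))
```

Each replicate matrix would then be passed to a tree-building method, and the fraction of replicate trees that contain a given grouping gives its bootstrap value.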
DNA amplification and sequence analysis of the Cpn 1054 gene family
Description of the C. pneumoniae isolates used in this study
Examination of variation within isolates through cloning of PCR products
The variation in the length of the polyC tract within Cpn 043, 1054 and 1055 was determined through sequence analysis of purified amplification products and through sequencing of amplification products following cloning into plasmids. Both Taq polymerase (Promega, Madison, WI) and Pwo polymerase (Roche Diagnostic Corporation, Indianapolis, IN) were used in these studies. Amplification products generated with Pwo polymerase were cloned using the Zero Blunt system, while products generated with Taq were cloned into pCRII (Invitrogen, Carlsbad, CA). Both cloning systems were used according to the manufacturer's instructions. Plasmid DNA from 8–10 different positive recombinant clones was isolated and sequenced to determine the variation in the length of the polyC tract of Cpn 043, Cpn 1054 and Cpn 1055 within C. pneumoniae AR39.
Nucleotide sequence accession numbers
The nucleotide sequences of variants within Cpn 010 identified from independent clinical isolates were deposited in GenBank under the following accession numbers: AF474017 through AF474026, and AF461543 through AF461552.
Oligonucleotide primer pairs used for PCR and sequencing in this study. Columns: region amplified; 5'-3' sequence of primers.
This work was supported by U.S.P.H.S. Awards # AI42869 and AI48769, and the Oregon State University Department of Microbiology N.L. Tartar Award Program. W.V. was sponsored through the Royal Thai Scholarship program from the Thailand National Center of Genetic Engineering and Biotechnology. We acknowledge John Bannantine and members of the Rockey laboratory for critical reading of the manuscript.
- Grayston JT, Aldous MB, Easton A, Wang SP, Kuo CC, Campbell LA, Altman J: Evidence that Chlamydia pneumoniae causes pneumonia and bronchitis. J Infect Dis. 1993, 168: 1231-1235.
- Kuo CC, Jackson LA, Campbell LA, Grayston JT: Chlamydia pneumoniae (TWAR). Clin Microbiol Rev. 1995, 8: 451-461.
- Saikku P: Chlamydia pneumoniae and atherosclerosis – an update. Scand J Infect Dis. 1997, 104: 53-56.
- Pettersson B, Andersson A, Leitner T, Olsvik O, Uhlen M, Storey C, Black CM: Evolutionary relationships among members of the genus Chlamydia based on 16S ribosomal DNA analysis. J Bacteriol. 1997, 179: 4195-4205.
- Meijer A, Kwakkel GJ, de Vries A, Schouls LM, Ossewaarde JM: Species identification of Chlamydia isolates by analyzing restriction fragment length polymorphism of the 16S-23S rRNA spacer region. J Clin Microbiol. 1997, 35: 1179-1183.
- Meijer A, Morre SA, van den Brule AJ, Savelkoul PH, Ossewaarde JM: Genomic relatedness of Chlamydia isolates determined by amplified fragment length polymorphism analysis. J Bacteriol. 1999, 181: 4469-4475.
- Kalman S, Mitchell W, Marathe R, Lammel C, Fan J, Hyman RW, Olinger L, Grimwood J, Davis RW, Stephens RS: Comparative genomes of Chlamydia pneumoniae and C. trachomatis. Nat Genet. 1999, 21: 385-389. 10.1038/7716.
- Read TD, Brunham RC, Shen C, Gill SR, Heidelberg JF, White O, Hickey EK, Peterson J, Utterback T, Berry K, Bass S, Linher K, Weidman J, Khouri H, Craven B, Bowman C, Dodson R, Gwinn M, Nelson W, DeBoy R, Kolonay J, McClarty G, Salzberg SL, Eisen J, Fraser CM: Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 2000, 28: 1397-1406. 10.1093/nar/28.6.1397.
- Shirai M, Hirakawa H, Kimoto M, Tabuchi M, Kishi F, Ouchi K, Shiba T, Ishii K, Hattori M, Kuhara S, Nakazawa T: Comparison of whole genome sequences of Chlamydia pneumoniae J138 from Japan and CWL029 from USA. Nucleic Acids Res. 2000, 28: 2311-2314. 10.1093/nar/28.12.2311.
- Grimwood J, Stephens RS: Computational analysis of the polymorphic membrane protein superfamily of Chlamydia trachomatis and Chlamydia pneumoniae. Microb Comp Genomics. 1999, 4: 187-201.
- Grimwood J, Olinger L, Stephens RS: Expression of Chlamydia pneumoniae polymorphic membrane protein family genes. Infect Immun. 2001, 69: 2383-2389. 10.1128/IAI.69.4.2383-2389.2001.
- Pedersen AS, Christiansen G, Birkelund S: Differential expression of Pmp10 in cell culture infected with Chlamydia pneumoniae CWL029. FEMS Microbiol Lett. 2001, 203: 153-159. 10.1016/S0378-1097(01)00341-X.
- Stephens RS, Lammel CJ: Chlamydia outer membrane protein discovery using genomics. Curr Opin Microbiol. 2001, 4: 16-20. 10.1016/S1369-5274(00)00158-2.
- Daugaard L, Christiansen G, Birkelund S: Characterization of a hypervariable region in the genome of Chlamydophila pneumoniae. FEMS Microbiol Lett. 2001, 203: 241-248. 10.1016/S0378-1097(01)00368-8.
- Bannantine JP, Griffiths RS, Viratyosin W, Brown WJ, Rockey DD: A secondary structure motif predictive of protein localization to the chlamydial inclusion membrane. Cell Microbiol. 2000, 2: 35-47. 10.1046/j.1462-5822.2000.00029.x.
- Jordan IK, Makarova KS, Wolf YI, Koonin EV: Gene conversions in genes encoding outer-membrane proteins in H. pylori and C. pneumoniae. Trends Genet. 2001, 17: 7-10. 10.1016/S0168-9525(00)02151-X.
- Zomorodipour A, Andersson SG: Obligate intracellular parasites: Rickettsia prowazekii and Chlamydia trachomatis. FEBS Lett. 1999, 452: 11-15. 10.1016/S0014-5793(99)00563-3.
- Deitsch KW, Moxon ER, Wellems TE: Shared themes of antigenic variation and virulence in bacterial, protozoal, and fungal infections. Microbiol Mol Biol Rev. 1997, 61: 281-293.
- Campbell LA, Kuo CC, Grayston JT: Characterization of the new Chlamydia agent (TWAR) as a unique organism by restriction endonuclease analysis and DNA:DNA hybridization. J Clin Microbiol. 1987, 25: 1911-1916.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Tatusova TA, Madden TL: Blast 2 sequences – a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett. 1999, 174: 247-250. 10.1016/S0378-1097(99)00149-4.
- Kyte J, Doolittle RF: A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982, 157: 105-132. 10.1016/0022-2836(82)90515-0.
- Swofford DL: PAUP*: Phylogenetic analysis using parsimony and other methods version 4.0b6. Sinauer Associates Inc., Sunderland, MA.
- Campbell JF, Barnes RC, Kozarsky PE, Spika JS: Culture-confirmed pneumonia due to Chlamydia pneumoniae. J Infect Dis. 1991, 164: 411-413.
- Kuo CC, Chen HH, Wang SP, Grayston JT: Identification of a new group of Chlamydia psittaci strains called TWAR. J Clin Microbiol. 1986, 24: 1034-1037.
- Jackson LA, Campbell LA, Kuo CC, Rodriguez DI, Lee A, Grayston JT: Isolation of Chlamydia pneumoniae from a carotid endarterectomy specimen. J Infect Dis. 1997, 176: 292-295.
- Yamazaki T, Nakada H, Sakurai N, Kuo CC, Wang SP, Grayston JT: Transmission of Chlamydia pneumoniae in young children in a Japanese family. J Infect Dis. 1990, 162: 1390-1392.
- Ekman MR, Grayston JT, Visakorpi R, Kleemola M, Kuo C-C, Saikku P: An epidemic of infections due to Chlamydia pneumoniae in military conscripts. Clin Infect Dis. 1993, 17: 420-425.
- Chirgwin K, Roblin PM, Gelling M, Hammerschlag MR, Schachter J: Infection with Chlamydia pneumoniae in Brooklyn. J Infect Dis. 1991, 163: 757-761.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.