Cite this as: Martins PH, da Silva LP, de Orem JC, de Magalhães MIA, De-Souza MT, et al. (2020) Protein profiling as a tool for identifying environmental aerobic endospore-forming bacteria. Open J Bac 4(1): 001-007. DOI: 10.17352/ojb.000012
Aerobic Endospore-Forming Bacteria (AEFB) are taxonomically and physiologically diverse, comprising species of the genus Bacillus and related genera of industrial and medical importance. For taxonomic purposes, we applied matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to identify 64 environmental AEFB strains (SDF, for Solo do Distrito Federal) and compared the results with those obtained by 16S rRNA gene sequencing. The two methods agreed at the genus level for 93.75% of the strains. Strains were clustered into 2 genera of the family Bacillaceae: Bacillus, the most prevalent, and Lysinibacillus. Two other genera, Brevibacillus and Paenibacillus (family Paenibacillaceae), were also distinguished. Gene sequence similarity discriminated an additional genus (Rummeliibacillus). At the species level, the genotyping method performed better, identifying 93.75% of the strains. Among the 31 strains identified at the species level by protein profiling, 61.29% coincided with the gene-based identification, and both protein and gene profiling placed another 32.25% within groups of closely related Bacillus species bearing two or more species alternatives within the same affiliation cluster. These results suggest that score and sequence similarity ranges can be applied in a complementary way for the initial identification and clustering of closely related samples among these 64 SDF strains. Our assignments are useful because they clearly identify the genera and restrict the identity of a strain to one or two possible species within the genus, thus clarifying their genetic interrelationships. This study also stresses that combining phenotypic and genotypic methods into polyphasic approaches is essential for a robust assessment of the remarkable genetic and ecological diversity of AEFB.
Species of the genus Bacillus and related genera are collectively designated Aerobic Endospore-Forming Bacteria (AEFB). Within the phylum Firmicutes, these species are allocated to the class Bacilli, order Bacillales, which contains seven of the ten families harbouring endospore-formers: Alicyclobacillaceae, Bacillaceae, Paenibacillaceae, Pasteuriaceae, Planococcaceae, Sporolactobacillaceae and Thermoactinomycetaceae [1,2]. AEFB are widely distributed in nature, including extreme environments, and soil is considered their main repository. Bacillus anthracis and B. cereus are known for infecting humans. The ecological and economic importance of some AEFB strains is illustrated by a wide range of properties, including nitrogen fixation; plant growth promotion; activity against insects, nematodes, and fungi; soil phosphorus solubilisation; and production of exopolysaccharides, a high diversity of hydrolytic enzymes, antibiotics, and cytokinins, among other bioproducts [1,2].
AEFB present a high level of genetic and physiological diversity, which renders the demarcation of genus and species borders very complex [1,2,4,5]. Currently, 16S rRNA gene sequences are used to assign taxa in a phylogenetic tree and to draw the largest frontiers in the prokaryotic classification system. However, phenotype can influence the consistency and depth of a hierarchical line and is necessary to generate a useful characterization [6,7].
Among the phenotype-based methods for the identification of microorganisms, the use of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has increased dramatically [8,9]. Analyses by MALDI-TOF MS do not require lengthy biochemical reactions and are faster than other conventional phenotypic identification methods, with similar or even superior reliability. In addition, the tolerance of varying growth conditions and the high reproducibility of this technique have led to the elaboration of standard protocols [10,11]. Indeed, clinical laboratories have been successfully using MALDI-TOF MS to identify microorganisms at the species level, which has allowed most clinically relevant pathogens to be rapidly included in the spectra databases [12-14]. The efficacy of the method relies on the stability of the mass spectral patterns generated, since the cell components routinely used in the analyses are ubiquitous, highly conserved, integral, and abundant in living cells [15,16]. Mass spectra obtained from whole cells or protein extracts are compared with reference spectra available in commercial databases, which are based mainly on clinical strains. The more similar the mass spectral patterns are, the closer the phylogenetic relationship. Given the predominance of ribosomal and regulatory proteins, these biomarkers are useful not only for clinical diagnosis but also for taxonomic studies of bacteria.
Using MALDI-TOF MS, we generated spectra from 64 environmental AEFB samples isolated from Brazilian soils and designated SDF (Solo do Distrito Federal) strains. The predictive molecular relationship obtained by protein profiling of these environmental AEFB was then compared with the classification based on 16S rRNA gene sequencing, the reference method for taxonomic assignment of prokaryotes.
SDF strains. Soil sampling and isolation of the SDF strains were described by Cavalcante et al. The 64 SDF strains used in this work (Tables 1-5) were randomly selected among SDF0001 to SDF0154, deposited at the Coleção de Bactérias aeróbias formadoras de endósporos (CBafes, or AEFB Collection), hosted at the University of Brasilia, Brazil.
Ethics statement. Specific permissions required for collection of bacterial strains used in this study were endorsed by the Federal Brazilian authority (CNPq; Authorization of Access and Sample of Genetic Patrimony nº 010439/2015-3). Sampling did not involve endangered or protected species.
MALDI-TOF MS. Using a 10 µL plastic loop, cells from 4 single colonies per SDF strain, cultured on solid Luria-Bertani medium (28 °C, 24-48 h), were transferred to 4 microtubes containing 300 µL of ultra-pure water (Milli-Q™), resulting in 4 different extractions and 4 different measurements for each strain. After vortexing, 900 µL of 100% ethanol was added, and the suspension was vortexed again and centrifuged at 12,000×g for 2 min. Air-dried pellets were resuspended in 30 µL of 70% formic acid and acetonitrile at a ratio of 1:1 (v/v). The final mixture was vortexed and centrifuged at 12,000×g for 2 min, and 1 µL of the supernatant was spotted onto a 96-well stainless steel MALDI target plate. The matrix, prepared at a final concentration of 10 mg mL−1 in an organic solvent mixture of acetonitrile:water:3% trifluoroacetic acid (TFA) at a ratio of 50:40:10, was overlaid and allowed to dry. Each sample was spotted 4 times. The mass spectra of the SDF strains were acquired (MicroFlex mass spectrometer; Bruker Daltonics, Bremen, Germany) at the Embrapa Mass Spectrometry Laboratory (Brasilia, DF, Brazil). The spectra were recorded in the linear positive mode at a laser frequency of 60 Hz within a mass range from m/z 2,000 to 20,000. For each spectrum, 240 laser shots in 40-shot steps from different positions of the target spot were collected and analysed. Spectra were externally calibrated using Escherichia coli rProteins (Bruker Daltonics, Bremen, Germany). The spectra of the SDF strains were loaded into the MALDI Biotyper software (Bruker Daltonics, Bremen, Germany) and analysed using the standard pattern-matching algorithm, which compared each acquired spectrum with all entries in the manufacturer's library. The results of the pattern-matching process were expressed as log values ranging from 0 to 3.000, according to the manufacturer's instructions. Scores of <1.700 are interpreted as unreliable identification, whereas scores of 1.700-1.999 and 2.000-3.000 indicate identification at the genus and species levels, respectively.
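The score ranges above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name `interpret_biotyper_score` is hypothetical and is not part of the MALDI Biotyper software; only the cut-off values (1.700 and 2.000 on a 0 to 3.000 scale) are taken from the text.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a MALDI Biotyper log score (0 to 3.000) to an identification level.

    Hypothetical helper; the cut-offs follow the ranges stated in the text.
    """
    if not 0.0 <= score <= 3.0:
        raise ValueError(f"score {score} is outside the 0-3.000 range")
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "unreliable"


# Strain SDF0066 scored 1.915 in this study.
print(interpret_biotyper_score(1.915))
```

For the score of 1.915 reported for SDF0066, the function returns a genus-level identification, in line with the interpretation used in this work.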
Taxonomic assignments of SDF strains. DNA preparation, PCR amplification, sequencing, and sequence analyses were performed as described in Orem et al. (2019). Briefly, the nearly full-length 16S rRNA gene was amplified from total DNA using primers 27F (5’ AGA GTT TGA TCM TGG CTC AG 3’) and 1492R (5’ GGY TAC CTT GTT ACG ACT T 3’). PCR products were sequenced bidirectionally by the Sanger method, and Phred scores of ≥20 were used to assess sequence quality. Both forward and reverse chromatograms of the sequenced 16S rDNA fragments were analysed with Chromas software (Technelysium Pty Ltd) to determine the best-quality regions. Consensus sequences (550-600 nucleotides) were created using BioEdit 7.2.6 software and deposited at NCBI (see Tables 1-5 for accession numbers). Taxonomic assignments of the sequences were performed using BLAST and Classifier. Similarities of 95%-96% and ≥97% were considered the threshold values for identification at the genus and species levels, respectively.
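Both primers contain one degenerate position (M in 27F, Y in 1492R), so each denotes a mixture of two oligonucleotides. As an illustration, the Python sketch below expands the primers using the standard IUPAC nucleotide codes; the function name `expand_primer` is hypothetical and is not part of any tool used in this work.

```python
from itertools import product

# Standard IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# Primer sequences as given in the text (5' to 3').
primers = {
    "27F": "AGA GTT TGA TCM TGG CTC AG",
    "1492R": "GGY TAC CTT GTT ACG ACT T",
}

for name, sequence in primers.items():
    print(name, expand_primer(sequence))
```

Each primer expands to two sequences: M stands for A or C, and Y stands for C or T.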
Fresh cells from 4 single colonies per SDF strain were used to obtain 4 different protein extractions. Each of these biological replicates was spotted and analysed 4 times. The spectra were acquired with a MALDI-TOF mass spectrometer (Bruker Daltonics, MicroFlex model) and recorded in the linear positive mode at a laser frequency of 60 Hz within a mass range from m/z 2,000 to 20,000. For each spectrum, 240 laser shots in 40-shot steps from different positions of the target spot were collected and analysed. External calibration employed E. coli rProteins (Bruker Daltonics, Bremen, Germany). The spectra of the 64 SDF strains were analysed with the FlexAnalysis 3.3 and MALDI Biotyper 3.0 programs (Bruker Daltonics) and used to identify and classify these AEFB according to the resulting mass spectra.
This analysis revealed that for 33 (51.56%) strains the score ranged from 1.700 to 1.999, thus identifying these SDF strains at the genus level (Tables 1,2). The remaining 31 (48.43%) presented log score values >2.000, which indicates species-level identification (Tables 3-5). Overall, the genus Bacillus was predominant, comprising 60 (93.75%) strains, while the other 4 (6.25%) strains were distributed among 3 genera: Lysinibacillus (1), Brevibacillus (2), and Paenibacillus (1).
Among the 33 (51.56%) strains identified at the genus level by the MALDI Biotyper database (Tables 1,2), the 16S rRNA sequence similarity-based analysis assigned the same genus to 30 (90.90%) samples (Table 1). Considering the best match suggested by protein profiling, 15 (50.00%) of these 30 coincided with the Bacillus spp. discrimination obtained by the genotype method at the species level, and 13 (43.33%) belonged to the same closely related AEFB groups (Table 1).
For the remaining 3 (9.09%) strains, the two identification methods yielded different genera (Table 2). MALDI-TOF MS-based analysis allocated SDF0063 and SDF0133 to the genera Brevibacillus and Bacillus, respectively. However, these two strains were classified at the species level by 16S rRNA gene sequencing, with 99% similarity to Lysinibacillus xylanilyticus and Paenibacillus alvei, respectively. The third strain (SDF0066) was assigned to 2 different genera: Bacillus (score 1.915) by MALDI-TOF MS and Rummeliibacillus (96% similarity) by 16S rRNA gene sequencing.
Considering similarity of ≥97% and 95-96% as the threshold values for identification at the species and genus levels, respectively, the 16S rRNA gene sequence analysis classified 60 (93.75%) SDF strains at the species level (Tables 1-4) and 2 (3.12%) at the genus level (Tables 2,5). The prevalent genus was Bacillus, harbouring 56 (87.50%) strains, followed by 4 additional genera: Lysinibacillus and Paenibacillus with 2 (3.12%) strains each, and Brevibacillus and Rummeliibacillus with 1 (1.56%) strain each. The sequence similarity-based approach failed to identify strains SDF0108 and SDF0139 at the genus level, since the similarity in both cases was 94% (Tables 5 and 1, respectively; highlighted in grey).
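The similarity thresholds can be applied programmatically. The Python sketch below is a minimal illustration using similarity values quoted in the text for individual strains; the function name `interpret_16s_similarity` is hypothetical, and treating any value from 95% up to (but not including) 97% as genus-level is an assumption, since the text only states the 95-96% range.

```python
def interpret_16s_similarity(similarity_pct: float) -> str:
    """Map a 16S rRNA gene sequence similarity (%) to an identification level.

    Assumption: values from 95% to just below 97% count as genus level.
    """
    if not 0.0 <= similarity_pct <= 100.0:
        raise ValueError(f"similarity {similarity_pct} is not a percentage")
    if similarity_pct >= 97.0:
        return "species"
    if similarity_pct >= 95.0:
        return "genus"
    return "unassigned"


# Similarity values reported in the text for selected strains.
reported = {
    "SDF0063": 99,
    "SDF0133": 99,
    "SDF0066": 96,
    "SDF0108": 94,
    "SDF0139": 94,
}

for strain, similarity in reported.items():
    print(strain, interpret_16s_similarity(similarity))
```

With these inputs, SDF0063 and SDF0133 fall in the species range, SDF0066 in the genus range, and SDF0108 and SDF0139 remain unassigned, as described above.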
When comparing the performance of both methods for classifying SDF strains at the species level, 19 (61.29%) of the 31 SDF strains identified by MALDI-TOF MS presented 16S rRNA gene sequence similarity ≥97% and were therefore also characterized at the species level (Table 3). Among the strains identified at the species level by MALDI-TOF MS, SDF0014 and SDF0066 (6.45%) presented similarity of 96% and were consequently identified at the genus level by gene sequence similarity. In addition, another 10 of 64 (15.62%) strains identified at the species level by both methods were classified as different species in each case (Table 4).
Despite their phenotypic diversity, many species of AEFB share high genetic homogeneity. In 1991, Ash, Dorsch, & Stackebrandt sequenced the 16S rRNA gene of 51 standard strains, at that point defined as Bacillus spp., and showed that they can be segregated into several distinct phylogenetic groups. Two of these groups supported the proposal of the novel genera Paenibacillus and Brevibacillus. Along with other genera, these two taxa are now recognized as comprising a separate Bacillales family, designated Paenibacillaceae. Likewise, based on clear-cut differences in discriminative taxonomic markers and its distant placement, B. pycnus was reclassified into a separate genus. According to current 16S rRNA gene sequence-based relatedness, this species and strains from other species of this clade are presently allocated to the genus Rummeliibacillus. It is noteworthy that within the order Bacillales, the genera Bacillus, Lysinibacillus, and the novel genus Rummeliibacillus all belong to the family Bacillaceae.
Members of the B. cereus group share 99.5 to 100% similarity in their 16S rRNA gene sequences [22,23]. The B. pumilus subgroup belongs to the B. subtilis complex, and the species B. pumilus sensu stricto shares 99%-100% similarity with the species B. safensis, B. altitudinis, and B. amyloliquefaciens. Correspondingly, Bacillus megaterium/aryabhattai are among the many pairs of distinct AEFB taxa that bear an extremely close evolutionary relationship, sharing 99.7% similarity of 16S rRNA sequences.
Currently, MALDI-TOF MS is well established as a fast and reliable technique in clinical laboratories for identifying microorganisms at the species level. However, application of this technique in other fields of microbiology, whose reference databases cover only a small portion of the vast range of microbial diversity, has been limited [26,27]. Even so, protein profiling has been found to be useful in discriminating many closely related Bacillus spp. [28-33].
In this work, we compared MALDI-TOF MS analysis of 64 environmental AEFB to the standard 16S rRNA gene sequencing method for the identification and classification of AEFB isolated from Brazilian soils, designated SDF strains.
MALDI-TOF MS results were evaluated using cut-off scores of 1.700 to 1.999 and ≥2.000 for acceptable identification at the genus and species levels, respectively, as suggested by the manufacturer. At the genus level, the overall concordance between the two methods was 60 (93.75%) SDF strains. Biotyper identified 33 (51.56%) and 31 (48.43%) strains at the genus (Tables 1,2) and species (Tables 3-5) levels, respectively. Conversely, the 16S rRNA gene sequencing approach identified all strains at the species level, except for 2 (3.12%) that were identified only at the genus level (Tables 2,5) and 2 others that could not be identified even at the genus level (Tables 1,5). The genus Bacillus prevails, comprising 60 (93.75%) and 56 (87.50%) strains classified by protein profiling and gene sequence similarity, respectively. MALDI-TOF MS-based analysis distributed the remaining 4 (6.25%) strains among 3 other genera: 2 (3.12%) to Brevibacillus and 1 (1.56%) each to Paenibacillus and Lysinibacillus. In contrast, the genotype method assigned 6 strains to 4 additional genera: 2 (3.12%) each to Lysinibacillus and Paenibacillus, and 1 (1.56%) each to Brevibacillus and Rummeliibacillus.
Decreasing the cut-off point for identification to a score of 1.700 had little effect on the overall classification, as the inclusion of SDF strains with a MALDI Biotyper score of ≥1.700 and <2.0 did not significantly affect the results obtained using the recommended score of ≥2.0 (Table 1). For the 30 of 33 strains identified at the genus level by MALDI-TOF MS (Table 1), the best match suggested coincided with the species assigned by the genotype method for 15 (50.00%) strains, and the 13 (43.33%) discrepant strains belonged to the same closely related groups of the genus Bacillus, mostly the B. subtilis complex.
Although further studies would be required to accurately discriminate these species, this is a relevant guide to the many different genetic clusters found within genera bearing hundreds of species, such as Bacillus, which comprises almost 400 species (List of Prokaryotic names with Standing in Nomenclature: http://www.bacterio.net/index.html; retrieved 11 October 2019).
In this study, protein profiling and 16S rRNA gene sequencing were discordant in only 3 classifications at the genus level (Table 2). Strains SDF0063 and SDF0133 were assigned to the genera Brevibacillus and Bacillus, respectively, by the first technique. On the other hand, these strains were classified at the species level by 16S rRNA gene sequencing, with 99% similarity to Lysinibacillus xylanilyticus and Paenibacillus alvei, respectively. Strain SDF0066 was assigned to the genus Bacillus (score 1.915) by MALDI-TOF MS and to Rummeliibacillus (96% similarity) by 16S rRNA gene sequencing. As for the results discussed above, these discrepancies are most likely due to insufficient coverage of bacterial species in the databases. Indeed, at the time these analyses were performed, most environmental species studied here were underrepresented in the reference library, with one or only a few spectra.
With respect to the 31 of 64 SDF strains for which MALDI Biotyper identifications reached scores of >2.000 (species identification), 19 (61.29%) were concordant (Table 3), and SDF0014 and SDF0108 (6.45%) were identified only at the genus level by 16S rRNA gene sequencing (Table 5). The remaining 10 (32.25%) strains were also identified at the species level by both methods (Table 4), although classified as different species of the genus Bacillus. Interestingly, in this case, the results obtained by both techniques pointed to a pair of alternative species. Five strains (SDF0029, SDF0055, SDF0065, SDF0082, and SDF0086) were classified as B. megaterium by MALDI-TOF MS and as B. aryabhattai by 16S rRNA gene sequencing. Considering the high genetic similarity between these 2 species and the absence of a representative strain of B. aryabhattai in the Biotyper 3.0 library (Bruker Daltonics), these results were not surprising. Therefore, the availability of a higher number of B. megaterium spectra and of library entries belonging to B. aryabhattai might improve MALDI-TOF MS accuracy. Likewise, the other 10 strains (Table 4), allocated by both methods to either the B. cereus (5) or the B. pumilus (5) group, are also scarcely represented in the library. Thus, at the species level, considering the various groups of closely related AEFB, the overall concordance between the two methods was 29 (45.31%) SDF strains.
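The percentages in this comparison follow directly from the strain counts. The Python sketch below recomputes them as a cross-check; the helper name `percent` and the category labels are illustrative, while the counts (19, 2, and 10 of 31 strains, and 29 of 64 strains overall) are taken from the text. Rounded values may differ from the reported ones in the last digit where the text appears to truncate rather than round.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 2)


TOTAL_STRAINS = 64
SPECIES_BY_MALDI = 31  # strains with a Biotyper score in the species range

# Outcome of 16S rRNA gene sequencing for those 31 strains (counts from the text).
outcomes = {
    "same species by both methods (Table 3)": 19,
    "no species-level 16S assignment (Table 5)": 2,
    "alternative species of the same group (Table 4)": 10,
}

# The three categories should account for all 31 strains.
assert sum(outcomes.values()) == SPECIES_BY_MALDI

for label, count in outcomes.items():
    print(f"{label}: {count} ({percent(count, SPECIES_BY_MALDI)}%)")

# Overall concordance when closely related groups are counted as agreement.
concordant = 19 + 10
print(f"overall: {concordant} ({percent(concordant, TOTAL_STRAINS)}%)")
```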
Our classification based on 16S rRNA gene sequences is a preliminary determination of genera or species. When 16S rRNA gene profiling places strains within these Bacillus spp. groups, the sample analysed can belong to two or more species alternatives within the same affiliation cluster. In these instances, 16S rRNA gene sequencing can identify the group to which the bacteria belong but, owing to its low discriminatory capacity, cannot assign them accurately to a single species. Even so, our assignments are useful because they clearly identify the genera and restrict the identity of SDF strains to one or two possible species in the genera described.
Since strains SDF0108 and SDF0139 presented similarity of 94%, the 16S rRNA gene-sequencing tool failed to classify both even at the genus level (Tables 5 and 1, respectively; highlighted in grey). Although the use of these sequences as a single marker is sometimes not enough to delineate species, low gene sequence similarity may give the first indication that a novel species has been isolated. However, the description of new species is beyond the scope of this study.
The results obtained here demonstrated that both techniques used for the identification of SDF strains had good resolution at the genus level. However, 16S rRNA gene sequencing was better than MALDI-TOF MS at identifying these environmental AEFB at the species level. Both tools were inefficient at discriminating closely related species. Nevertheless, this initial outline clarified the genetic interrelationships of these environmental strains. Hence, sequence similarity values and score ranges were complementary to each other and can help if comprehensive high-quality reference datasets are available.
Considering that in the present study less than 50% of the SDF strains were identified at the species level using MALDI-TOF MS, our results show the importance of expanding the available spectrum libraries, since most spectra deposited in databases are of clinical origin and thus provide little information on strains from other sources. Spectrum libraries of non-clinical samples may require special considerations compared with their clinical counterparts, because the extent of stresses can be much more variable in the environment [35,36]. Stress-related proteins may lead to the misidentification of new isolates, since these isolates may differ significantly from type strains in their proteotypic properties. Beyond this technical issue, the absence of public repositories for mass spectra may limit the use of MALDI-TOF MS, since existing libraries remain private and expensive to access.
This study also supports the need to combine phenotypic and genotypic methods in polyphasic approaches for the taxonomic assessment of AEFB diversity.
We thank the University of Brasilia and the Brazilian research funding agencies Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).