ABSTRACT
The genetic diversity and sharing of the mother-child associated microbiota remain largely unexplored. This severely limits our functional understanding of gut microbiota transmission patterns. The aim of our work was therefore to use a novel reduced metagenome sequencing in combination with shotgun and 16S rRNA gene sequencing to determine both the metagenome genetic diversity and the mother-to-child sharing of the microbiota. For a cohort of 17 mother-child pairs we found an increase of the collective metagenome size from about 100 Mbp for 4-day-old children to about 500 Mbp for mothers. The 4-day-old children shared 7% of the metagenome sequences with the mothers, while the metagenome sequence sharing was more than 30% among the mothers. We found 15 genomes shared across more than 50% of the mothers, of which 10 belonged to Clostridia. Only Bacteroides showed a direct mother-child association, with B. vulgatus being abundant in both 4-day-old children and mothers. In conclusion, our results support a common pool of gut bacteria that are transmitted from adults to infants, with most of the bacteria being transmitted at a stage after delivery.
INTRODUCTION
The colonization by gut bacteria at infancy is crucial for proper immune development and gut maturation (1). At birth, we are nearly sterile, while just after a few days of life we become densely colonized by bacteria (2). How and when we acquire the adult associated bacteria,are not yet completely established (2). Recent 16S rRNA gene sequence data suggest that most of the adult associated bacteria are recruited at a stage after delivery (3), while shotgun analyses suggest a high frequency of direct transmission during delivery (4). The limitations of these studies, however, are that 16S rRNA gene analyses do not have sufficient resolution to resolve mother to child transmission at the strain level, while shotgun sequencing requires extensive and complex analyses (4,5). Taken together, this restricts the possibility of gaining broad-scale knowledge about the microbiota genetic diversity and distribution with the current analytical approaches. There is thus a need for analytical approaches that combine efficiency and resolution.
The aim of the current work was therefore to use a novel concept of reduced metagenome sequencing (RMS; schematically outlined in Fig. 1) in combination with 16S rRNA gene and shotgun sequencing to estimate genetic diversity and mother-child overlap for gut associated bacteria for a medium size cohort of 17 mother-child pairs.
RMS requires a relatively shallow sequencing depth in order to gain insight into genetic diversity and microbiota overlap across individuals. With this approach only a defined fraction of the metagenome is sequenced. Principles related to reduced metagenome sequencing have been widely used for strain resolution analyses since the 1990s by DNA fragment size separation (6,7). RMS, however, gives additional information about fragment sequence, therefore thus enabling the estimation of total genetic diversity and overlap of metagenome sequences. Thus, RMS has the potential to solve some of the most urgent needs in current metagenome sequence analyses.
Our study presents evidence that less than 10% of the microbiota is directly transmitted from mother to child, while the sharing of more than 30% of the microbiota across random mothers suggests that the majority of the gut microbiota is transmitted at a stage after delivery.
MATERIALS AND METHODS
Cohort description
The study consists of an unselected longitudinal cohort of 17 mother-infant pairs. The infants were born full term at Nishanth Hospital, India. Twelve of the 17 infants were born through cesarean section. All the mothers that gave birth through vaginal or cesarean section were given antibiotics either during pregnancy, during labor and/or after pregnancy (Suppl. Table 1). Fecal samples were collected from late pregnant women (gestational age 32-36 weeks) and in infants 0-4 days after birth for all the mother child-pairs, while the numbers of samples at 15 days were 4, 60 days 3 and 120 days 3. Fecal samples were collected and stored at -20°C up to a week with STAR buffer (Roche, Basel, Switzerland). Then, the samples were transferred to -80°C for longer storage. One of the parents of each child signed a written informed consent form before the fecal sample collection, which is in accordance with legislation in India.
Mock community
Mock communities of E.coli ATCC25922; E. faecalis V583; B. longum DSM20219 and B. infantis DSM20088 mixed in varying proportions ranging from 0% to 100% were used to assess sensitivity of the AFLP sequencing in prediction and identification of bacterial strains.
DNA isolation
The fecal samples were diluted 3-fold with STAR-buffer and pre-centrifuged at 1200 rpm for 8 sec to remove large particles. An overnight culture of each of the four bacteria species used for mock community analyses was used for DNA isolation. Bacterial cultures were pre-centrifuged at 13000 rpm for 5 min, and then pellets were washed twice in 1x PBS buffer. The supernatant from the pre-centrifuged stool samples, as well as bacterial pellets in PBS, were mixed with acid-washed glass beads (Sigma-Aldrich, <106μm; 0.25g) and bead-beated at 1800rpm in 40 seconds twice, with 5 minutes’ rest between the runs. The samples were then centrifuged at 13000rpm for 5 minutes. An automated protocol based on paramagnetic particles (LGC Genomics, UK) was used for the DNA isolation using a KingFisher Flex (ThermoFisher Scientific, USA), following the manufacturers recommendations. After extraction, samples were quantified and normalized using Qubit fluorometer (ThermoFisher Scientific, USA). The DNA concentrations were normalized to 0.2 ng/μl prior to further processing. Mock communities were prepared to contain a total of 10 ng DNA in each sample.
Library preparation and sequencing
For RMS, DNA fragments were obtained by cutting genomic DNA using an enzyme combination of EcoRI and MseI, followed by an adapter ligation and PCR amplification. Restriction cutting was performed in 20 μl volumes containing 8U EcoRI (New England Biolabs, USA), 4U MseI (New England Biolabs, USA), 1x Cut smart buffer (New England Biolabs, USA), and 1 ng genomic DNA. The samples were incubated at 37°C for one hour to make sure that the restriction enzymes would cut appropriate amounts of DNA into fragments. For PCR amplification of the fragments, adapters were ligated onto the fragments. This was done by adding the sample a 5 μl volume of 0.5 μM EcoRI adapter mix, 5μM MseI adapter mix, 1 μl T4 DNA ligase (New England Biolabs, USA) and 1x T4 reaction buffer (New England Biolabs, USA). The adapter mixes were made of equal volumes of forward (EcoRI; 5’- CTCGTAGACTGCGTACC-3’, MseI; 5’- GACGATGAGTCCTGAG-3’) and reverse adapters (EcoRI; 5’- AATTGGTACGCAGTCTAC-3’, MseI; 5’- TACTCAGGACTCAT-3’). The adapters will ligate to the fragments that have been cut by both restriction enzymes. The samples were incubated for 3 hours at 37°C. PCR amplification was performed using primer pairs EcoRI (5’- GACTGCGTACCAATTC-3’)/MseI (5’- GATGAGTCCTGAGTAA-3’), targeting the RMS fragments, and PRK341F (5’- CCTACGGGRBGCASCAG-3’)/ PRK806R(5’- GGACTACYVGGGTATCTAAT-3’) (8),targeting the V3-V4 region of the 16 S rRNA gene. Each reaction contained 1x HotFirePol DNA polymerase Ready to load (Solis BioDyne, Estonia), 0.2μM forward and reverse primer (Invitrogen, USA) and 2 μl template DNA. The cycling conditions for the 16S rRNA gene were 25 cycles of 95°C for 30 seconds, 55°C for 30 seconds and 72°C for 45 seconds, while the cycling conditions for the RMS were 25 cycles of 95°C for 30 sec, 56°C for 1 min and 72°C for 1 min. The PCR products were purified by AMPure XP beads (Beckman Coulter, USA) prior to further processing.
To index the fragments 1x FirePol DNA polymerase Ready to load (Solis BioDyne, Estonia), 0.2μM forward index primer and reverse index primer and 2μl purified PCR products were used. The fragments were amplified by PCR using the following thermal cycle: 95° C for 5 minutes, followed by 10 cycles of 95° C for 30 seconds, 55° C for 60 seconds and 72° C for 45 seconds. A final step at 72° C for 7 minutes was included. All samples were pooled using 100ng DNA from each sample. The pooled sample were then purified using AMPure XP beads prior to sequencing. The pooled sequencing library comprised 15% PhiX and 85% pooled sample.
The shotgun metagenome sequencing was done as previously described using Illumina Nextera XT, following the recommendations by the producer (9).
Data analysis
Raw data of 16S were analyzed using a standard workflow of QIIME pipeline (3). Sequences were paired-end joined (join_paired_ends.py with fastq_join method), demultiplexed using split_library.py script with no error in the barcode allowed (max_barcode_errors 0) and barcodes removed. Sequences were then filtered using fastq_filter command of usearch (maxee = 0.22; minlen = 350). Finally, sequences were clustered at 97 % similarity threshold using cluster_otus command of usearch. Singletons were removed and an additional reference-based chimera removal step against GOLD database was performed. Resulting dataset was then rarefied to 6000 sequences per sample (Schematically outlined in Suppl. Fig 1A). The reduced metagenome and shotgun data were analyzed using a CLC Genomic Workbench (Qiagen, Hilden, Germany), using the paired-end sequence-merging tool, de novo and reference based assembly tools, and Blast searches (Schematically outlined in Suppl. Fig. 1B). The shotgun data were not processed further after de novo assembly, while for the RMS analyses we followed the bioinformatics workflow, as outlined in Fig. 2. Sample comparisons and statistical analyses were done using Matlab 2016a (Mathworks Inc, USA) with the PLS toolbox module (Eigenvector Research Inc., USA).
RESULTS
Metagenome sequence size
We identified 156 926 unique RMS fragments for the samples analyzed (Fig. 2). This corresponds to complete metagenome sequence size of approximately 750 Mbp given RMS fragment sizes of each 5000 bp. We found an approximately 5-fold increase in the collective metagenome sequence size when comparing 4-day-old children with the mothers (Fig. 3A). The collective metagenome sequence for the 4-day-old children was estimated to approximately 100 Mbp with 40±6.5 Mbp per individual, while the estimated collective metagenome sequence size for mothers was 500 Mbp – with individual sizes of 100 ±7 Mbp.
The total number of 16S rRNA gene derived OTUs was 458. For the 4-day-old children we detected a total of 40 OTUs with mean relative abundance > 1 ‰ (13.1±4 per individual), while the number increased to 140 OTUs for the mothers (70.2±7.5 per individual) (Fig. 3B).
For the shotgun analyses we generated 8.6 million paired end reads with a total size 2 070 Mbp shotgun metagenome sequence data for the 4-day-old children, and 6.7 million reads with a total of 1 3964 Mbp sequence for the mothers. However, the shotgun sequences only generated 15.8 Mbp assembly for the 4-day-old children, and 4.3 Mbp for the mothers (Suppl. Table 2).
Associations of microbiome with mode of delivery and antibiotic usage
For the 4-day-old children we found 27 reduced metagenome sequencing fragments unique to children delivered by c-section, while 20 fragments were unique to the children delivered vaginally (Suppl. Table 3). The most pronounced differences were an overrepresentation of fragments related to the genus Bacteroides for vaginal delivery (p=0.0038, Binominal test).
Based on ResFinder assignments (10) of the RNS fragments, we identified 13 fragments associated with known antibiotic resistance genes (Suppl. Table 4). There was a clear association between antibiotic usage during labor and antibiotic resistance genes (p=0.001, ASCA-ANOVA), with Fosfomycin, Beta-lactam and Phenicol resistance showing positive associations (Suppl. Fig 4). Antibiotic usage during pregnancy or after delivery, however, did not seem to affect resistance gene composition (results not shown).
For the 16S rRNA gene sequence data we did not identify any significant association of OTUs with mode of delivery by ASCA-ANOVA. Furthermore, no significant association between 16S rRNA gene sequence data and antibiotic usage was determined.
Vertical transmission
About 7% of the fragments detected by RMS for the 4-day-old children were shared with the mothers, with no difference between vaginally or c-section delivered children (p=0.67, Kruskal-Wallis test). Furthermore, there were no significant differences if the sharing was with the same or different mother for any of the age categories, although there was a tendency towards increased association with the same mother with age (Fig. 4A).
Similarly, as for RMS we did not identify any significant differences between the same or different mothers with respect to OTU sharing. However, the 16S rRNA gene OTUs displayed a pattern different from the reduced metagenome sequencing fragment sharing, with a peak in sharing at 15 days (Fig. 4B).
Sharing within age categories
For the reduced metagenome sequencing, we found that the average sharing of fragments between individuals increased from below 15% for 4-day-old children to more than 30% for the mothers, with the 15-day to 4-month samples showing intermediate levels (Fig. 5A).
The age-related differences were less pronounced for the 16S rRNA gene sequence data, with an increase from 30% at 4 days to about 50% for the mothers. The 15-day to 4-month samples showed relatively large fluctuations for the shared 16S rRNA gene OTUs (Fig. 5B).
Identification of core genomes
By RMS we identified the mothers’ core genomes, and the relative abundance of these genomes in infants. We first identified the RMS fragments shared across more than half of the mothers. In total, 10 009 RMS fragments satisfied this criterion (Fig. 2). From these, we identified 15 genome-sequenced species with more than 97% identity to the core fragments by Blast search (Fig. 2). The prevalence of these genomes in both mothers and 4-day-old children were determined by mapping all the RMS fragments, using the core genomes as reference. In mothers we identified 5 core genomes with a relative abundance above 1%, while for children only Bacteroides vulgatus showed high relative abundance (Fig. 6).
Validation of RMS
We validated RMS on experimental communities with known composition. This validation showed that there were clear signatures separating the bacteria in mixed populations, even between closely related Bifidobacteria (Suppl. Fig. 2A). Next we determined the quantitative potential of the RMS approach. This was done through regression analyses between the expected and observed DNA quantity of the four bacteria in the experimental community. All four evaluated bacteria showed high correlations (R2 > 0.8) between estimated concentrations based on RMS fragment frequency, and the expected concentrations (Suppl. Fig. 2B).
Finally, we determined the frequency of the reduced metagenome sequencing fragments from assembled shotgun data. This comparison showed a high correlation between the contig size and number of RMS fragments mapping to the respective contigs (Suppl. Fig. 3), with the mean distance between the RMS fragments being 5513±187 bp.
DISCUSSION
There was a consistent increase in bacterial species shared between mothers and children with age based on the RMS data, while 16S rRNA gene analyses suggested a less consistent age-related pattern. This could be due to the fact that 16S rRNA gene analyses may merge several strains into the same OTUs (11), obscuring the analyses. For the mothers, the shotgun sequencing was far too shallow to yield any reasonable estimate of metagenome sequence size or strain composition, illustrating the shotgun sequencing challenges. The current approaches to extract strain level information from shotgun would require very deep sequencing (12). Therefore, we believe that the RMS approach can be a valuable and cost efficient contribution in deducing patterns associated with the gut microbiota.
Our RMS results support direct mother to child transmission of less than 10 % of stool-associated bacteria. Although the sharing was slightly higher with the child’s own mother rather than a random mother, most of the bacteria seem to be recruited from a common pool of gut associated bacteria. This contrasts with recent findings that suggest high strain sharing between infants and their mothers (4). However, given that more than one-third of the strains are shared across mothers, it would be difficult to determine if a strain is transmitted to a child from his or her mother or from some other adult. From the taxonomic identity, however, fragments belonging to the genus Bacteroides seemed underrepresented for children delivered by cesarean section. This is consistent with previous observations with long-term underrepresentation of Bacteroides in c-section delivered babies (13). The very high relative abundance of B. vulgatus in children could indicate that this bacterium plays an important role in the early development of the gut microbiota. B. vulgatus is mucin degrading bacterium (14) interacting with Escherichia coli in inflammation induction (15), and is suppressed by Bifidobacteria (16). To our knowledge, however, no studies have yet addressed the role of B. vulgatus in infants.
In our study we found very low levels of clostridia for the 4-day-old children, in addition to a lack of direct mother-child associations. Therefore, we found it unlikely that most of the adult associated clostridia are transmitted at delivery. Recently, it has been found that a large portion of gut bacteria are spore-formers (17), with endospores as a potential vector for transmission at a later stage than delivery (18).
Previous 16S rRNA gene sequencing have shown high degree of sharing at the genus/species level across mothers (3,19). Thus, the increased resolution of the reduced metagenome sequencing further supports the sharing of a relatively limited number of bacterial species/lineages within human populations. Our results suggest that one-third of the fragments are shared across random mothers. The mapping of the core fragments identified among the Indian mothers to human-derived genome sequenced isolates (mostly from Europe and America) further support the fact that there are limited number of human gut associated bacteria, and that these have wide geographic distribution. Interestingly, Ruminococcus bromii, which was among the most prevalent and dominant species for the Indian mothers, has previously been identified as a keystone species in resistant starch degradation, supporting the growth of both Eubacterium rectale and Bifidobacterium adolescentis (20), which were all identified among the 15 bacterial species shared across more than half of the Indian mothers in our work. This suggests that the core bacteria could have biologically important interactions.
Antibiotic usage during labor seemed to have a major impact on the resistance genes in the children without impacting the overall microbiota composition. This is consistent with previous observations suggesting that the mobilome can evolve independently of the overall composition of the gut microbiota (3). Furthermore, we detected resistance associations for antibiotics other than those used, indicating potential antibiotic resistance linkage (21). Thus, antibiotic usage during labor could be a major contributing factor to antibiotic resistance spread within the infants’ commensal gut microbiota.
CONCLUSION
In conclusion, our results support a model with late recruitment of adult gut associated bacteria in infants, with a more than five-fold increase in genetic richness from child to adult.
DECLARATIONS
Availability of data and material
The raw sequencing reads are deposited in the European Nucleotide Archive with the accession number PRJEB85416, while the data used for figure generation are provided in a Supplementary Excel file.
Ethics approval and consent to participate
A written consent was obtained from all the participants
Consent for publication
Not applicable.
Funding
The project was funded by the Norwegian University of Life Sciences and the Norwegian Government.
Competing interests
There are no competing interests.
Author’s contributions
AR designed the study. EA did the methods validation. IA performed the analyses. JL did the shotgun analyses. PM, SP and RN did the sample collection and recording of metadata. KR analyzed the data, wrote the paper and invented the RMS methods. All authors commented on the manuscript.
Acknowledgements
We would like to thank the Norwegian government for the scholarship provided to Anuradha Ravi. We would also like to thank the doctors and nurses at Nishanth Hospital, India for doing the sampling and collecting the information.
Footnotes
Anuradha.Ravi{at}nmbu.no, Ekaterina.Avershina{at}nmbu.no, Inga.Angell{at}nmbu.no, Jane.Ludvigsen{at}nmbu.no, prasanth.emsa1{at}gmail.com, mpnsumathi{at}gmail.com, drpnramesh{at}gmail.com, Knut.Rudi{at}nmbu.no