bioRxiv
New Results

GBS and a newly developed mRNA-GBS approach to link population genetic and transcriptome analyses reveal pattern differences between sites and treatments in red clover (Trifolium pratense L.)

B Gemeinholzer, O Rupp, A Becker, M. Strickert, C-M Müller
doi: https://doi.org/10.1101/2021.11.30.470612
B Gemeinholzer
aUniversity Kassel, Botany, Heinrich-Plett-Strasse 40, D-34132 Kassel, Germany
eSystematic Botany, Justus-Liebig-University Giessen, Heinrich-Buff-Ring 38, D-35392 Giessen, Germany
O Rupp
bBioinformatics and Systems Biology, Justus-Liebig-University Giessen, Heinrich-Buff-Ring 58, D-35392 Giessen, Germany
A Becker
cEvolutionary Developmental Biology of Plants, Justus-Liebig-University Giessen, Heinrich-Buff-Ring 38, D-35392 Giessen, Germany
M. Strickert
dII. Physikalisches Institut, Justus-Liebig-University Giessen, Heinrich-Buff-Ring 1, D 35392 Gießen
C-M Müller
eSystematic Botany, Justus-Liebig-University Giessen, Heinrich-Buff-Ring 38, D-35392 Giessen, Germany

Abstract

Red clover (Trifolium pratense L.) is an important forage crop worldwide, widely cultivated as cattle feed and for soil improvement. Wild populations and landraces have great natural diversity that could be used to improve cultivated red clover. However, knowledge about the natural genetic and phenotypic diversity of the species is still insufficient. Here, we developed a low-cost transcriptome analysis with reduced complexity (mRNA-GBS) and compared the results with population genetic (GBS) data and previously published mRNA-Seq data to assess whether intraspecific variation within and between populations and transcriptome responses can be analyzed simultaneously. The mRNA-GBS approach was successful. SNP analyses from the mRNA-GBS approach revealed patterns comparable to the GBS results, but it was not possible to link the transcriptome analyses with reduced complexity and sequencing depth to previously published greenhouse and field expression studies. Using short sequences upstream of the poly(A) tail of mRNA to reduce complexity is a promising approach that combines population genetics and expression profiling to analyze many individuals with trait differences simultaneously and cost-effectively, even in non-model species. However, our mRNA-GBS approach recovered too many additional short mRNA sequences, which hampered sequence alignment depth and SNP recovery. Possible optimizations are discussed. Our study design across different regions in Germany was also challenging, because differential expression analyses with reduced complexity, in which mRNA is fragmented at specific sites rather than randomly, are most likely counteracted under natural conditions by highly complex plant reactions at low sequencing depth.

Introduction

Trifolium pratense (red clover) is an economically relevant crop in temperate agriculture and a major component of sustainable farming. T. pratense has a high protein content, serves as livestock fodder, promotes soil fertility, and is an important component of crop rotation systems. Red clover is well known for its high biomass production and good regrowth capability after mowing (Kleen et al. 2011, Dewhurst 2013, Eriksen et al. 2014, Herbert et al. 2018). The species belongs to the Fabaceae (legumes), which encompass several other agronomically important crops, such as Glycine max (soy), Medicago truncatula (barrel clover), Phaseolus vulgaris (common bean), and Vigna unguiculata (cowpea). T. pratense is diploid, which is important for high-throughput molecular and functional analyses.

Agriculture is faced with the challenge of continuously optimizing crops in order to adapt them to changing climatic and cultivation conditions and to meet the steadily increasing demand for animal feed. In red clover, there is still high potential for breeding optimization, as wild populations as well as germplasm collections show highly significant morphological and genetic variation (e.g. Dias et al. 2008, Kölliker et al. 2003, Smith et al. 1985). The species is native to northwest Africa, Europe, and much of Asia and has been introduced to North America, South America, Australia, and New Zealand. Its natural variability can be used in breeding programs to identify promising populations for improving agronomically important traits, e.g., plant size, growth habit, leaf area (Herbert et al. 2018), inflorescence size, number of inflorescences, flowering, disease susceptibility, and others (Isobe et al. 2009, Eriksen et al. 2014, Yates et al. 2014, deVega et al. 2015). This might be especially relevant in times of fast climatic and anthropogenic changes.

Today, rapidly evolving NGS techniques, tools, and analytical methods for genome and transcriptome sequencing, together with their statistical analysis and related informatics, offer new opportunities to support agricultural breeding programs with genomic information. This fosters knowledge of complex biological systems at various organizational levels (from individuals to populations, e.g. Wisecaver et al., 2017; Li et al., 2020), in different dimensions of time and space (Joly & Faure 2015; Gould et al., 2018; Mead et al., 2019; Marx et al. 2020), and under different treatments, greenhouse conditions, or in the field (Herbert et al., 2021). The development of the RNA-Seq method for quantitative next-generation sequencing of expressed genes has made expression studies for non-model species feasible. However, the method remains expensive and often requires a high number of replicates, so scalability is often not straightforward (Lohman et al., 2016). Genomic DNA fingerprinting (e.g., ddRadSeq; Hohenlohe et al. 2011) or genotyping by sequencing (GBS; Elshire et al. 2011) is now widely used to perform association studies in many species, including those with complex genomes (Caballero et al. 2021), to reveal genetic diversity and population structure (Müller et al. 2019), to fingerprint germplasm resources (Wang et al. 2021), or to detect candidate genes by fine mapping, especially for improving plant breeding strategies (e.g. Purugganan & Jackson 2021).

Here, we tested whether we can bridge the gap between genomic DNA fingerprinting and reduced-complexity functional genomics in such a way that the natural diversity of a species can be studied quickly and inexpensively, so that the data can be linked relatively easily to functional analyses suitable for improving breeding programs. To achieve this, we developed a reduced-complexity mRNA-GBS approach. We tested our mRNA-GBS approach on natural populations of red clover in three regions of the Biodiversity Exploratory sites in Germany, and evaluated how it correlates with the genomic diversity of populations (analyzed with GBS) over a geographic range and with an earlier published gene expression profiling approach (mRNA-Seq, Herbert et al. 2021). Herbert et al. (2021) examined the expression patterns of red clover in relation to species-specific responses to mowing at one of the Biodiversity Exploratories and in the greenhouse. They identified candidate genes whose annotation suggests potential importance for phenotype changes in response to mowing. However, these analyses are currently only possible for a limited number of sites and individuals due to high costs and immense amounts of data (Gould et al. 2018; Marx et al. 2020). By combining fingerprinting with transcriptome profiling techniques across many samples, treatments, and locations, we test here whether it is possible to detect multiple genetic variants found across taxa and genomes in wild populations of red clover. Furthermore, we test whether this approach is able to simultaneously identify genomic population differences and candidate gene signals potentially indicative of adaptive genetic variation. Our goal was to assess whether mRNA-GBS provides results that are comparable and relatable to GBS and RNA-Seq, are biologically informative, and are more cost-effective due to the shallow sequencing depth.

Materials & Methods

Study site and sampling

Plant material for mRNA-GBS and GBS was sampled in June 2017 on the premises of the long-term open research platform Biodiversity Exploratories, in the three Exploratories “Schorfheide-Chorin (S)” in the State of Brandenburg, “Hainich-Dün (H)” in Thuringia, and “Swabian Alb (A)” in the State of Baden-Württemberg, Germany (Fischer et al. 2010), at six field sites each (Table 1, Fig. 1). One population (AEG9) deviated so much from the other populations in its values and patterns that it was excluded from further analyses in both the mRNA-GBS and the GBS analysis. The experimental plots were managed as normal agricultural land colonized with native, established red clover populations. The not-mown pastures and meadows were neither grazed nor mown in the year of sampling (Herbert et al. 2021). Collection permits from farmers and local authorities were obtained centrally by the Biodiversity Exploratory research platform. At least seven individuals per site (126 in total) were quick-frozen in liquid nitrogen in the field and stored at −80 °C until further processing.

Fig. 1

Study sites in Germany in the three Biodiversity Exploratories (S: Schorfheide-Chorin; H: Hainich-Dün; A: Swabian Alb), with six sampled populations per site (three mown (transparent colors) and three unmown (rich colors)) for the mRNA-GBS and GBS analyses. Results were compared with the RNA-Seq study of Herbert et al. (2020), in which samples were derived directly from the Hainich-Dün site and from plants cultivated in a greenhouse experiment.

Table 1

Study sites

Molecular techniques

Briefly, our mRNA-GBS library construction method involves eight laboratory steps: (i) isolation of total RNA, (ii) removal of genomic DNA with DNase, (iii) conversion of mRNA into cDNA with a reverse transcription kit, using an anchored PolyA primer that contains a BceAI restriction site, (iv) digestion with the restriction enzymes BceAI and MseI, (v) NGS primer ligation with a BceAI adapter carrying the index and an MseI adapter, (vi) pooling, purification, and PCR amplification, (vii) size selection, and (viii) sequencing on an Illumina NextSeq 500 V2 (Fig. 2).

Fig. 2

Laboratory and data analysis workflow

For the mRNA-GBS analysis, seven individuals per site were examined. For RNA extraction we used the NucleoSpin® RNA Plant kit (Macherey-Nagel, Germany) according to the manufacturer’s instructions. For the mRNA-GBS development, double-stranded cDNA was synthesized with the Maxima H Minus Double-Stranded cDNA Synthesis Kit (Thermo Scientific™, Germany) following the user manual, but with a specially designed PolyT priming site that can be cleaved by the BceAI restriction enzyme (gcBceAI-PolyA-TVN-Primer: 5’-CCGGCGCGACGGCTTTTTTTTTTTTTTTTVN-3’). Purification was carried out with the NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel, Germany). For restriction, 200 ng of double-stranded cDNA was digested with BceAI (2 U/μl) and MseI (10 U/μl) at 37 °C in NEB 3.1 buffer (16 μl cDNA/H2O (200 ng cDNA), 2 μl buffer, 1 μl BceAI, 0.25 μl MseI, and 0.75 μl H2O; 60 min incubation at 37 °C and 20 min inactivation at 65 °C). After preparing the samples, 30 ng/μl of the digested material were transferred to LGC Genomics GmbH (Germany) for library preparation, pooling, and sequencing (150 bp paired-end reads on an Illumina NextSeq 500 V2, Fig. 2).
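The composition of the digestion reaction can be checked with a few lines of arithmetic. The following is a minimal sketch that only uses the volumes and stock concentrations stated above; the dictionary and function names are illustrative and not part of the published protocol.

```python
# Volumes of one digestion reaction in microlitres, as listed in the text.
REACTION_UL = {
    "cDNA/H2O (200 ng cDNA)": 16.0,
    "NEB 3.1 buffer": 2.0,
    "BceAI": 1.0,
    "MseI": 0.25,
    "H2O": 0.75,
}

# Enzyme stock concentrations in units per microlitre, as listed in the text.
STOCK_U_PER_UL = {"BceAI": 2.0, "MseI": 10.0}


def reaction_summary(volumes_ul, stocks_u_per_ul):
    """Return the total reaction volume and the enzyme units per reaction."""
    total_ul = sum(volumes_ul.values())
    units = {enzyme: volumes_ul[enzyme] * conc
             for enzyme, conc in stocks_u_per_ul.items()}
    return total_ul, units


total_ul, units = reaction_summary(REACTION_UL, STOCK_U_PER_UL)
print(f"total volume: {total_ul} ul")
for enzyme, u in units.items():
    print(f"{enzyme}: {u} U per reaction")
```

With these inputs the reaction adds up to 20 μl and contains 2 U of BceAI and 2.5 U of MseI.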

For GBS analysis, DNA was extracted from five samples per site (Table 1). We used the Invisorb® Spin Plant Mini Kit from Stratec Molecular (Germany) according to the instructions for use. DNA quantity and quality were analysed using a NanoPhotometer™ (Implen GmbH, München, Germany). We sent 300 ng of DNA in 20 μl to LGC Genomics GmbH (Germany), where genomic DNA was digested with 1 unit MslI (NEB) in 1× NEB4 buffer in a 30 μl volume for 1 h at 37 °C. The restriction enzyme was heat inactivated by incubation at 80 °C for 20 min. The indexed Illumina libraries were prepared using the Encore Rapid Multiplex System (Nugen): 15 μl were transferred to a new 96-well PCR plate and mixed on ice, first with 3 μl of one of the 192 L2 Ligation Adaptors and then with 12 μl of master mix (a combination of 4.6 μl D1 water, 6 μl L1 Ligation Buffer Mix, and 1.5 μl L3 Ligation Enzyme Mix). Ligation reactions were incubated at 25 °C for 15 min and heat inactivated at 65 °C for 10 min. Then 20 μl of Final Repair Master Mix were added to each tube and the reaction was incubated at 72 °C for 3 min. For purification, the reactions were diluted with 50 μl TE 10/50 (10 mM Tris/HCl, 50 mM EDTA, pH 8.0), mixed with 80 μl Agencourt XP beads, incubated for 10 min at RT, and placed on a magnet for 5 min to collect the beads. The supernatant was discarded and the beads were washed twice with 200 μl 80% ethanol. Beads were air dried for 10 min and libraries were eluted in 20 μl Tris buffer (5 mM Tris/HCl, pH 9) prior to sequencing on an Illumina NextSeq 500 V2, resulting in 150 bp paired-end reads.

Bioinformatics and Genotyping

mRNA-GBS data SNP calling

The Illumina reads were mapped to the repeat-masked T. pratense reference genome (version GCA_900079335.1, ENSEMBL release 50) using the STAR short read mapper (Dobin and Gingeras, 2015). Duplicate reads were filtered using the Picard Toolkit (Broad Institute, 2019) MarkDuplicates algorithm (version 2.26.1). The samples of the same field site were pooled to get a higher resolution. Alleles were counted using bam-readcount (The McDonnell Genome Institute, 2021) with a minimum base quality of 20. Only loci with at least ten reads in each pool were considered and alleles were called only when supported by at least three reads. Error rates with TPM normalized read-counts were calculated using the following pipeline: http://rseqc.sourceforge.net/#rpkm-saturation-py
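The depth and allele-support thresholds described above can be sketched as a small filter over per-pool base counts, such as those reported by bam-readcount. This is an illustration under stated assumptions, not the pipeline actually used: the input layout, the function names, and the example counts are hypothetical, and the three-read threshold is assumed to apply within each pool.

```python
MIN_POOL_DEPTH = 10    # minimum reads per pool at a locus (from the text)
MIN_ALLELE_READS = 3   # minimum reads supporting an allele (from the text)


def call_pooled_alleles(counts_by_pool):
    """Call alleles at one locus.

    counts_by_pool maps pool name -> {base: read count}.
    Returns {pool: sorted list of called alleles}, or None if any pool
    has fewer than MIN_POOL_DEPTH reads at this locus.
    """
    for counts in counts_by_pool.values():
        if sum(counts.values()) < MIN_POOL_DEPTH:
            return None
    return {
        pool: sorted(base for base, n in counts.items() if n >= MIN_ALLELE_READS)
        for pool, counts in counts_by_pool.items()
    }


def is_snp(calls):
    """A locus counts as variable if more than one allele is called across pools."""
    if calls is None:
        return False
    return len({allele for alleles in calls.values() for allele in alleles}) > 1


# Hypothetical counts for two pools at two loci.
locus_ok = {"pool_1": {"A": 12, "G": 2}, "pool_2": {"A": 5, "G": 7}}
locus_shallow = {"pool_1": {"A": 9}, "pool_2": {"A": 20}}

calls = call_pooled_alleles(locus_ok)
print(calls, is_snp(calls))
print(call_pooled_alleles(locus_shallow))
```

In the first locus the G allele is dropped from pool_1 because only two reads support it, but the locus is still variable because pool_2 carries both alleles; the second locus is skipped because pool_1 has fewer than ten reads.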

GBS data analysis

After base calling and demultiplexing, the sequenced reads were quality checked. SNP calling and genotyping were conducted with Freebayes (Garrison and Marth 2021). We used adapter-clipped data for further calculations in Stacks 1.48 (Catchen et al. 2011; Catchen et al. 2013). UStacks and denovo_map were applied for analyses without a reference genome. The following (default) parameters for the formation of stacks and loci were used: minimum depth of coverage to create a stack -m = 3, maximum distance allowed between stacks -M = 2, distance allowed between catalog loci -n = 0, maximum distance allowed to align secondary reads -N = 4, maximum number of stacks allowed per de novo locus: 3, and -t to remove or break up highly repetitive RAD-Tags in UStacks. Next, we ran the CStacks (to build the catalog) and SStacks (to match the samples against the catalog) pipelines without modifications. We applied the correction module rxstacks, filtering by locus log likelihood with the following options: -t 40 --conf_lim 0.25 --prune_haplo --model_type bounded --bound_high 0.1 --lnl_lim -8.0 --lnl_dist --verbose. Finally, we ran the populations program in Stacks with the parameter -r = 0.75. PGDSPIDER v.2.1.0.0 (Lischer and Excoffier 2012) was used to convert Stacks output files for further analyses.

Genetic diversity was estimated as the percentage of polymorphic loci (PL) and as Nei’s gene diversity (He; Nei (1973)) using ARLEQUIN v.3.5.1. (Excoffier and Lischer 2010) and the package “diveRsity” (Keenan et al. 2013) in R 3.5.1 (R Core Team 2013). To visualize the data, STRUCTURE (Pritchard et al. 2000) was used, which shows the membership probabilities. For automation and parallelization of the STRUCTURE analysis we used the program StrAuto (Chhatre and Emerson 2017). Genetic clusters were detected by applying the admixture model, with 1000 Markov Chain Monte Carlo (MCMC) replicates, a burn-in period of 1000, and ten repeats per run for each chosen cluster number (i.e. K = 1 – 20), ploidy = 2. For all other settings, default options were used. To identify the most likely K, delta K (Evanno et al. 2005) was determined using STRUCTURE HARVESTER (Earl and von Holdt 2012), which is also integrated in StrAuto (Chhatre and Emerson 2017). To verify the most probable cluster membership coefficient among the ten runs of STRUCTURE and STRUCTURE HARVESTER we used CLUMPP v.1.1.2 (Jakobsson and Rosenberg 2007). Corresponding graphs were constructed with DISTRUCT (Rosenberg 2004). A Principal Component Analysis (PCA) was calculated in R 3.5.1 (R Core Team 2013) with the R package ‘adegenet’ v.1.4-2 (Jombart 2008). Dendrograms were calculated with the R packages ‘adegenet’ v.1.4-2 (Jombart 2008) and ‘ape’ (Paradis et al. 2004), using Euclidean distance. Genetic variation among groups of populations (FCT), among populations within groups (FSC), and within populations (FST) was partitioned with hierarchical analyses of molecular variance (AMOVA) using ARLEQUIN v.3.5.1.2 (Excoffier and Lischer 2010) with an allowed missing data level of 5 %. Additionally, pairwise FST values were estimated among populations, with a significance level of 0.05 and 100 permutations.
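The two diversity measures named above can be illustrated with a short sketch. It computes gene diversity per locus as one minus the sum of squared allele frequencies, with an optional n/(n−1) sample-size correction, and the percentage of polymorphic loci. This is a simplified stand-in for what ARLEQUIN and diveRsity report, not their implementation; the function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=False):
    """Gene diversity at one locus: 1 - sum(p_i^2) over allele frequencies p_i.

    alleles is a list of sampled allele copies, e.g. ["A", "A", "G", "G"].
    With unbiased=True the estimate is multiplied by n / (n - 1).
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("no alleles sampled at this locus")
    freqs = [count / n for count in Counter(alleles).values()]
    h = 1.0 - sum(p * p for p in freqs)
    if unbiased:
        if n < 2:
            raise ValueError("need at least two allele copies for the correction")
        h *= n / (n - 1)
    return h


def summarize(loci):
    """Return (percentage of polymorphic loci, mean gene diversity) over loci."""
    diversities = [gene_diversity(alleles) for alleles in loci]
    polymorphic = sum(1 for alleles in loci if len(set(alleles)) > 1)
    return 100.0 * polymorphic / len(loci), sum(diversities) / len(diversities)


# Toy data: one monomorphic and one polymorphic locus, four allele copies each.
loci = [list("AAAA"), list("AAGG")]
pl, mean_he = summarize(loci)
print(f"PL = {pl:.1f} %, mean He = {mean_he:.2f}")
```

For the toy data, half of the loci are polymorphic and the mean gene diversity is 0.25.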

Results

Sampling and genotyping

The mRNA-GBS sequencing yielded a total of 183,747,290 reads for the 126 investigated samples, with 42 individuals per region (S, H, A; Table 2). Retrieved read numbers varied strongly between individuals, with an average of 1.1 million raw reads per sample (range: 31,481 – 7,106,704). After applying different filtering steps, 91,870,548 adapter-clipped read pairs were retrieved. To analyze error rates, we calculated TPM-normalized read counts for each sample (Fig. 3) by testing our mRNA-GBS library against the RNA-Seq library of Herbert et al. (2021). Since TPM normalizes to sequencing depth, the value should be stable with respect to the actual read count if the sequencing depth is appropriate. When we reduced our samples from 90% to 60% sequencing depth (Fig. 3), the changes in error rate indicated that our sequencing depth was insufficient to perform gene expression studies and to be matched against the T. pratense transcriptome (Herbert et al., 2021) for subsequent analysis, whereas the error rate in Herbert et al. (2021) was stable and in line with expectations.
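The reasoning behind this depth test can be sketched in a few lines. TPM divides each gene's read count by its length and rescales the results to sum to one million, so a uniform change in depth leaves TPM unchanged; random subsampling, in contrast, adds noise that is relatively larger for genes with few reads. The sketch below is an illustration with hypothetical counts and transcript lengths, and the percent relative error defined here is an assumed, simplified measure rather than the exact computation of the pipeline used.

```python
import random


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def subsample(counts, fraction, seed=0):
    """Keep each read independently with probability `fraction`."""
    rng = random.Random(seed)
    return {gene: sum(rng.random() < fraction for _ in range(n))
            for gene, n in counts.items()}


def percent_relative_error(full, sub):
    """Per-gene |TPM_sub - TPM_full| / TPM_full * 100."""
    return {gene: abs(sub[gene] - full[gene]) / full[gene] * 100
            for gene in full if full[gene] > 0}


# Hypothetical read counts and transcript lengths, for illustration only.
counts = {"geneA": 900, "geneB": 90, "geneC": 9}
lengths = {"geneA": 1000, "geneB": 1500, "geneC": 500}

full = tpm(counts, lengths)
sub_counts = subsample(counts, 0.6)
errors = percent_relative_error(full, tpm(sub_counts, lengths))
for gene, err in errors.items():
    print(f"{gene}: {err:.1f} % relative error at 60 % depth")
```

With deep libraries the relative error stays small across subsampling fractions, which is the behavior reported for the RNA-Seq data; with shallow libraries many genes resemble the low-count case and the error rises as depth is reduced.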

Fig. 3

Error rates for the TPM-normalized read counts of the mRNA-GBS samples, depicted in light green (S), bluish green (H), and purple (A), and of the RNA-Seq data of Herbert et al. (2020), depicted in red, calculated with 90% coverage (upper left), 80% coverage (upper right), 70% coverage (lower left), and 60% coverage (lower right). The error rates of the mRNA-GBS samples differ strongly when coverage is reduced, whereas the RNA-Seq data show little difference, are stable, and are thus usable for gene expression analysis.

The sequencing depth of individual samples was also too shallow to identify SNPs for population genetic studies. Therefore, individuals within sites of similar treatments (mown/not mown) were combined in bulk samples to obtain a site-specific pattern. In this way, a total of 15,111 SNPs were obtained for subsequent analysis.

Table 2:

Number of raw reads retained in mRNA-GBS and GBS analysis after each filtering step for Trifolium pratense samples from the three Biodiversity Exploratory sites in Germany (S: Schorfheide-Chorin; H: Hainich-Dün; A: Swabian Alb).

The GBS sequencing yielded a total of 296,844,208 raw reads (range: 777,242 – 2,212,232) for the 90 investigated samples from the three regions (Table 1, Figure 1), on average 3.6 million reads per sample. After applying different filtering steps, 56,395 SNPs were obtained for subsequent analyses, which is 3.7 times as many as were obtained with the mRNA-GBS analysis.

The mRNA-GBS analysis revealed a comparatively high mean genetic diversity of the investigated red clover bulk samples of ØHe = 0.76, ranging from He = 0.72 (S) to He = 0.82 (A, Table 3), when regions are considered. The genetic diversity is higher when sites with treatments (mown/not mown) are considered (ØHe = 0.82), ranging from He = 0.79 (S mown) to He = 0.86 (A not mown, Table 3). Because the analysis included multiple combined individuals from three populations per site and only two sites per region, the number of comparisons was too low to calculate differences in genetic diversity among regions.

The GBS analysis revealed a considerably lower mean genetic diversity of the investigated red clover populations of ØHe = 0.060, ranging from He = 0.049 (AEG31, AEG24) to He = 0.060 (HEG8, HEG13, Table 3). The region-specific mean genetic diversity is lowest in A (ØHe = 0.050), intermediate in S (ØHe = 0.055), and highest in H (ØHe = 0.058). According to the ANOVA, genetic diversity differed significantly among the three regions (F = 9.255, P = 0.009). A Tukey test showed a significant difference between A and H (P = 0.007) but not between H and S (P = 0.470) or A and S (P = 0.139). The ANOVA with polymorphic loci only revealed no differences between A, H, and S (F = 2.731, P = 0.0997). The AMOVA revealed moderate genetic differentiation among regions (FCT = 0.05) and within populations (FST = 0.07), both of which are highly significant. However, genetic differentiation among populations within regions is negligible (FSC = 0.02, Table 3). Thus, differentiation within populations was greater than among regions. Pairwise population FST estimates for the entire study area indicate low genetic differentiation among populations (0.00 – 0.013, Figure X). Pairwise population differentiation within regions is low to negligible for all regions (Ø A FST = 0.01, Ø S FST = 0.022, Ø H FST = 0.016).

Table 3:

Population genetic statistics

STRUCTURE analyses based on the BIC and Bayesian clustering approaches revealed two genetic clusters, with the proportional cluster membership being almost region-specific in the GBS analysis (Fig. 4A). The mRNA-GBS approach showed similar but less prominent trends (Fig. 4B). This is also confirmed by the PCA (Fig. 5), which shows the site specificity of the centroids of all individuals (GBS) or bulk samples (mRNA-GBS) belonging to one sampling region, with much greater genetic similarity between individuals from S and H and a greater distance from A in the GBS analysis, and more overlap in the mRNA-GBS data. This overlap is partly due to the mowing treatment: the mown populations in the mRNA-GBS analysis showed a stronger pattern of site specificity, while the mRNA-GBS pattern of the unmown individuals was highly divergent. The GBS Neighbor Joining tree (Fig. 6A) reflects the patterns of the AMOVA, PCA, and STRUCTURE analyses, with individuals from A distinctly different from those from H and S, and some minor overlap between H and S among the individuals considered. The mRNA-GBS tree (Fig. 6B) also reflects the separate position of the populations in A, but shows more mixing between H and S. The not mown populations in A (AEG31, AEG14, Fig. 6B) and two of the three not mown populations in S (SEGHG, SEGz1) also cluster together, but a clear pattern is lacking, as several other not mown populations appear scattered in the tree (SEGz2, HEG17, HEG8, HEG50).
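The ΔK criterion of Evanno et al. (2005), used to select the number of clusters, can be sketched as the absolute second-order change in the mean log likelihood between successive K values, divided by the standard deviation of the log likelihoods of the replicate runs at K. The following minimal illustration uses this mean-based form; the function name and the log-likelihood values are hypothetical and are not results of this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K -> list of ln P(D) values, one per replicate run
    (at least two runs per K are needed for the standard deviation).
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # delta K is undefined when replicate runs are identical
        second_diff = abs(mean(lnp_by_k[k + 1])
                          - 2 * mean(lnp_by_k[k])
                          + mean(lnp_by_k[k - 1]))
        result[k] = second_diff / sd
    return result


# Hypothetical ln P(D) values for two replicate runs per K.
runs = {
    1: [-100.0, -102.0],
    2: [-50.0, -52.0],
    3: [-48.0, -50.0],
    4: [-47.0, -51.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "most likely K:", best_k)
```

In this toy example the likelihood rises sharply from K = 1 to K = 2 and then flattens, so ΔK peaks at K = 2. ΔK cannot be evaluated for the smallest and largest K tested, because both neighbours are required.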

Fig. 4

Population genetic structure of the investigated red clover individuals (GBS) or site-specific bulk samples (mRNA-GBS) across the different Biodiversity Exploratories (S: Schorfheide-Chorin; H: Hainich-Dün; A: Swabian Alb) as revealed by the STRUCTURE analyses and ΔK (Evanno et al. 2005). A: GBS data, where each column represents individuals within one region; B: mRNA-GBS data, where each column represents the bulk samples within one population.

Fig. 5

Principal Component Analysis (PCA) of genetic distances between individuals (GBS) or site-specific bulk samples (mRNA-GBS) of Trifolium pratense across the different Biodiversity Exploratories (S: Schorfheide-Chorin; H: Hainich-Dün; A: Swabian Alb). Colored label positions represent the centroids of all individuals belonging to one sampling region. A: GBS analysis, depicting colour-coded individuals within each region; the third axis represents 1.85% of the genetic variation (Σ 10.38%). B: mRNA-GBS analysis, depicting colour-coded populations of bulk samples within each region (S, H, A; n_m: not mown, m: mown); the third axis represents 8.72% of the genetic variation (Σ 36.60%).

Fig. 6

Neighbor Joining tree for the individuals and populations of Trifolium pratense across the different Biodiversity Exploratories (yellow: Schorfheide-Chorin; red: Hainich-Dün; blue: Swabian Alb). A: of the GBS analysis and B: of the mRNA-GBS analysis (n_m: not mown, m: mown)

Discussion

The ability to link population genetic and functional genomic analyses in a rapid, cost-effective, and technically relatively simple manner would be of great importance for a better understanding of naturally occurring variability and for breeding studies. It would allow diversity to be screened while simultaneously identifying expression patterns and specific candidate genes involved in the response to certain species-specific environmental interactions. Currently this is very time consuming and costly (Bhat et al. 2016). The method presented here, mRNA-GBS, aims to fill this gap by offering a low-cost transcriptome analysis with reduced complexity.

This is the first approach to link a complexity-reduced mRNA analysis (mRNA-GBS) with an in-depth RNA-Seq analysis (Herbert et al. 2021) and a GBS approach on naturally occurring plant populations across a broader geographic scale. We tested the mRNA-GBS approach on several individuals of red clover from eleven populations and three regions in Germany, and evaluated whether intraspecific variation within and between populations and transcriptome responses can be analyzed simultaneously. The mRNA-GBS approach revealed population genetic patterns, but linkage with mRNA-Seq data was not possible. The drawbacks and the optimization steps needed are discussed in the following.

mRNA-GBS and comparison with RNA-Seq and GBS

Herbert et al. (2021) conducted an RNA-Seq analysis on one of the red clover populations from Hainich-Dün (H) that was also screened here, to compare the global transcriptional response to mowing under greenhouse conditions and in agricultural fields. They simulated mowing and compared the transcriptome response in mown and not mown T. pratense individuals, as in our analysis. Herbert et al. (2021) obtained a total number of short reads ranging from 44.7 to 58.1 million for each library, which is on average 10 times more per individual than in our study. Their sequencing approach comprised 608,041,012 raw reads for the analysis of only six different sites/treatments of eight pooled samples, while in our mRNA-GBS approach we investigated 13 plants on five to six fields in three regions in Germany. With this approach, they were able to identify 119 – 142 differentially expressed genes (DEGs, with a log2 fold-change >2) that are up- or down-regulated when mown plants were compared with non-mown plants. The mRNA-GBS library was highly variable in terms of read depth per individual (80 bp on average), and pooling of samples did not allow us to correlate site-specific multifactorial influences of environmental responses in a statistically robust way. Only 50 – 86 % of the retrieved short sequences are located within the 100 bp region upstream of the poly(A) tail, and only 0.9 – 3.2 % are located within the last 25 bp, which hampered mRNA mapping and prevented the screening for differentially expressed genes (Table S1). SNP calling and expression studies were thus not possible.

However, Herbert et al. (2021) also found that plants grown in the field exhibited more and different stress responses than plants grown in greenhouses, leading them to conclude that field-grown plants respond to multiple environmental stresses that are site-specific, abiotic, and biotic in origin. For example, they found that some genes upregulated in mown plants are chitinase homologs, suggesting that these plants are stressed by insects and/or fungi and that this stress may be more relevant to the plants than the loss of biomass due to mowing. With more than 65 different fungi and nematodes and more than 20 viruses, insects, and bacteria known to infect red clover (Duke 1981), our pilot study of mRNA-GBS across such a broad geographic and ecologically diverse range was too ambitious.

Our sequencing depth, with an average of 1.1 million raw reads per sample for mRNA-GBS, was too shallow to quantify gene expression differences. Hou et al. (2013) proposed sequencing 15 – 50 million reads to allow the detection of the majority of transcripts in human tissue (1C value between 2.9 and 3.1, Lander et al. 2001), so that a 15-fold higher read depth would have to be aimed for, which, however, does not meet our requirement that the method be inexpensive and easy to perform on multiple individuals. However, the high error rates resulting from low sequencing depth are due to conceptual and methodological limitations of NGS, which produce artifacts and a relatively high false-positive rate of variants such as SNPs and InDels; they affect not only the mRNA-GBS approach but also estimates of population genetic parameters (Dorant et al., 2019; Andrews et al., 2016; Cariou et al., 2016; Davey et al., 2011). This became apparent when we pooled the different individuals from mown and not mown populations from the mRNA-GBS analysis from each region, mapped them against a reference genome, analyzed SNPs, and compared them to the GBS analysis. The genetic diversity indices revealed significant inconsistencies between He-GBS (ØHe = 0.060) and He-mRNA-GBS (ØHe = 0.76) values. These inconsistencies arise because the SNPs are shaped by different evolutionary mechanisms, both neutral processes such as drift and immigration and adaptive processes such as selection, so that the different evolutionary origins of SNPs limit significance and signals may overlap (Lamy et al., 2017; Vellend & Geber, 2005). Furthermore, Dorant et al. (2019) previously pointed out problems associated with GBS involving mutations at restriction sites that lead to allelic dropouts and PCR biases, such that genetic diversity is not reflected correctly and significant misinterpretation of commonly used statistics in population genetic studies leads to incorrect conclusions (Arnold et al., 2013; Cariou et al., 2016; Gautier et al., 2015). Several studies investigated the genetic diversity of red clover populations and germplasm collections, e.g., using RAPD (Campos-de-Quiroz & Ortega-Klose 2001; Ulloa et al. 2003), AFLP (Kölliker et al. 2003; Herrmann et al., 2005), and SSR (Gupta et al. 2017), and several of them found relatively high values for genetic diversity estimates, similar to or slightly lower than those of our mRNA-GBS analysis. Pfeifer et al. (2018) compared GBS and AFLP data in an herbaceous perennial sedge species (Carex gayana) and found slightly higher estimates of genetic diversity with SNPs than with AFLP data, but also discovered some populations where this trend was reversed. SNP mutation rates are relatively low (10 × 10⁻⁸ to 10 × 10⁻⁹; Nachman & Crowell 2000; Pfeifer et al. 2018), lower than those of microsatellites (0.001 to 0.005; Pinto et al. 2013; Fischer et al. 2017), whereas AFLP mutation rates can exceed those of microsatellites (Kuchma et al. 2011).

STRUCTURE analyses revealed two genetic clusters for both the GBS and the pooled mRNA-GBS results, and the patterns were nearly region-specific in both analyses. In the mRNA-GBS data, they were even treatment-specific (mown/not mown), which is weakly supported by the PCA (Fig. 5) but no longer evident in the neighbor-joining analysis. Deeper sequencing would potentially detect mRNA sequences with lower copy number, resulting in stronger site-specific pattern recognition. The GBS analysis revealed greater genetic similarity between individuals from S and H and a greater distance from A; the mRNA-GBS data showed greater overlap when all loci were considered, and this pattern disappeared only when polymorphic loci alone were analyzed. This is consistent with the results of other population genetic comparisons of plants studied in the Biodiversity Exploratories, e.g., an AFLP study of Veronica chamaedrys (Kloss et al. 2011). Whereas Kloss et al. (2011) found very little difference within and between populations, suggesting that the effects of genetic drift are counterbalanced by gene flow between populations, we found some differences. Both red clover and V. chamaedrys are commonly outcrossing perennials, for which high gene flow is known to counteract the effects of genetic drift, either through high natural or human-induced dispersal of seeds and pollen or through large effective population sizes (Nybom 2004; Musche et al. 2008).
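
Distance-based methods such as neighbor-joining start from a matrix of pairwise distances between individuals. As a rough illustration of how such a matrix can be derived from SNP genotypes, the sketch below computes a simple allele-sharing distance from genotypes coded as alternative-allele dosage (0, 1, 2), skipping loci that are missing in either individual. The distance measure, the function names, and the toy genotypes are assumptions made for this example; they are not the distance or the data used in our analyses. The labels A, H, and S only mimic the region codes.

```python
def allele_sharing_distance(g1, g2):
    """Mean absolute dosage difference over shared loci, scaled to [0, 1].

    Genotypes are lists of alternative-allele counts (0, 1, 2);
    None marks a missing genotype and the locus is skipped.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        return float("nan")  # no locus genotyped in both individuals
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(genotypes):
    """Return {(name_i, name_j): distance} for all ordered pairs."""
    names = sorted(genotypes)
    return {
        (a, b): allele_sharing_distance(genotypes[a], genotypes[b])
        for a in names
        for b in names
    }


# Hypothetical toy genotypes for one individual per region.
toy_genotypes = {
    "A1": [0, 0, 2, 2, None, 1],
    "H1": [2, 1, 0, 0, 1, 1],
    "S1": [2, 2, 0, 0, 1, None],
}

if __name__ == "__main__":
    for pair, d in sorted(distance_matrix(toy_genotypes).items()):
        print(pair, round(d, 3))
```

Because missing genotypes reduce the number of loci that enter each comparison, distances estimated from shallow data rest on few shared loci and are correspondingly noisy, which is one way low sequencing depth can blur tree-based patterns.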

mRNA-GBS and other marker-assisted approaches

The advantage of mRNA-GBS is that it provides SNPs from transcripts of specific biological processes, captured at a specific time point and under the conditions prevailing at that time, which characterize the phenotype, even though we mainly target the far 3’ end. In contrast, the GBS approach and similar molecular techniques used for NGS-based population genomic analyses (e.g., Hy-Rad, ddRAD-Seq, Pool-Seq, restriction site-associated DNA capture (Rapture), bulk and low-coverage NGS, and others discussed, e.g., in Dorant et al. 2019) provide SNPs from genomic regions and reflect only the genotype, whereas the phenotype is influenced by both genotype and environment. RNA-Seq experiments targeting the phenotype can currently be performed only for a limited number of individuals and replicates because of the high cost of library preparation and deep sequencing, and assignment to a reference genome is required (Pallares et al. 2020). For marker-assisted breeding, as well as for a better understanding of natural variability in populations, the mRNA-GBS approach aims to identify specific traits through the use of direct and indirect molecular markers, replacing standard comparative in-depth transcriptomics (Collard & Mackill 2008).

To date, only one published approach has investigated gene expression in non-model plant populations at reduced complexity, using TagSeq in comparison with RNA-Seq (Marx et al. 2020). Marx et al. (2020) performed RNA-Seq analyses of four non-model species in their natural populations. They then mapped TagSeq data from individuals at weekly intervals over three weeks and were able to align the short sequences to the reference transcriptome. However, they did not analyze these findings in a population genetic context. The TM3’seq approach (Pallares et al. 2020) also targets the 3’ ends of transcripts while preserving sample identity at each step and enables simultaneous high-throughput processing of individual samples, but it has not yet been explored on plant samples.

Conclusion

In summary, we found that mRNA-GBS is a promising tool for population genetic analysis, but greater sequencing depth is required and fewer divergent populations need to be compared. The mRNA-GBS analysis described here yielded too many divergent short sequence reads from throughout the mRNA, making assignment difficult. We recommend focusing more strongly on generating mRNA regions upstream of the poly(A) tail. Experimental bias occurred in our analysis owing to the use of NGS and GBS tools, as has been pointed out previously. Nevertheless, the population genetic analyses are relatively similar and comparable, with mRNA-GBS data reflecting stronger signals of selection than of neutral mutation compared with GBS data. Our approach contributes to knowledge at a time when intensive research on genomic fingerprinting analyses and reduced RNA-Seq approaches is underway, particularly for non-model species.

Financial Disclosure Statement

This work has been funded through the DFG Priority Program 1374 ‘Biodiversity Exploratories’ to Birgit Gemeinholzer (GE1242/14-1/14-2) and Anette Becker (BE 2547/12-1/12-2). We used the de.NBI infrastructure (German Network for Bioinformatics Infrastructure; the de.NBI project is funded by the BMBF, FKZ 031A532 - 031A540).

Acknowledgements

We thank the managers of the three Exploratories and all former managers for their work in maintaining the plot and project infrastructure, Christiane Fischer and Jule Mangels for their support through the central office, Andreas Ostrowski and Michael Owonibi for managing the central database, and Markus Fischer, Eduard Linsenmair, Dominik Hessenmöller, Daniel Prati, Ingo Schöning, Francois Buscot, Ernst-Detlef Schulze, Wolfgang Weisser and the late Elisabeth Kalko for their role in setting up the Biodiversity Exploratories project. Fieldwork permits were issued by the responsible state environmental offices of Baden-Württemberg, Thüringen, and Brandenburg (according to § 72 331 BbgNatSchG). We are grateful to Volker Wissemann, Sabine Mutz, Annalena Kurzweil, Dr. Thomas Groß and Andreas Kolter for lab and administrative support. We thank Andrea Weisert for carrying out all RNA extractions and cDNA synthesis steps.

References

  1. Ahsyee RS, Vasiljevic S, Calic I, Zorc M, Karagic D, Surlan-Momirovic G. Genetic diversity in red clover (Trifolium pratense L.) using SSR marker. Genetika. 2014;46(3): 949–961.
  2. Andrews KR, Good JM, Miller MR, Luikart G, Hohenlohe PA. Harnessing the power of RADseq for ecological and evolutionary genomics. Nature Reviews Genetics. 2016;17: 81–92. doi: 10.1038/nrg.2015.28.
  3. Arnold B, Corbett-Detig RB, Hartl D, Bomblies K. RADseq underestimates diversity and introduces genealogical biases due to nonrandom haplotype sampling. Molecular Ecology. 2013;22: 3179–3190. doi: 10.1111/mec.12276.
  4. Bhat JA, Ali S, Salgotra RK, Mir ZA, Dutta S, Jadon V, Tyagi A, Mushtaq M, Jain N, Singh PK, Singh GP, Prabhu KV. Genomic Selection in the Era of Next Generation Sequencing for Complex Traits in Plant Breeding. Front. Genet. 2016;7: 221. doi: 10.3389/fgene.2016.00221.
  5. Caballero A, Villanueva B, Druet T. On the estimation of inbreeding depression using different measures of inbreeding from molecular markers. Evolutionary Applications. 2020;14(2): 416–428.
  6. Campos-de Quiroz H, Ortega-Klose F. Genetic variability among elite red clover (Trifolium pratense L.) parents used in Chile as revealed by RAPD markers. Euphytica. 2001;122: 61–67.
  7. Catchen JM, Amores A, Hohenlohe PA, Cresko WA, Postlethwait JH, Koning DJ de. Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences. G3 (Bethesda). 2011;1: 171–182. doi: 10.1534/g3.111.000240.
  8. Catchen JM, Hohenlohe PA, Bassham S, Amores A, Cresko WA. Stacks: An analysis tool set for population genomics. Mol Ecol. 2013;22: 3124–3140. doi: 10.1111/mec.12354.
  9. Cariou M, Duret L, Charlat S. How and how much does RAD-seq bias genetic diversity estimates? BMC Evolutionary Biology. 2016;16: 240. doi: 10.1186/s12862-016-0791-0.
  10. Chhatre VE, Emerson KJ. StrAuto: Automation and parallelization of STRUCTURE analysis. BMC Bioinformatics. 2017;18: 192. doi: 10.1186/s12859-017-1593-0.
  11. Collard BCY, Mackill DJ. Marker-assisted selection: An approach for precision plant breeding in the twenty-first century. Philosophical Transactions of the Royal Society. 2008;363: 557–572.
  12. Collins RP, Helgadottir A, Frankow-Lindberg BEF, Skot L, Jones C, Skot KP. Temporal changes in population genetic diversity and structure in red and white clover grown in three contrasting environments in northern Europe. Annals of Botany. 2012;110: 1341–1350.
  13. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics. 2011;12: 499–510. doi: 10.1038/nrg3012.
  14. deVega JJ, Ayling S, Hegarty M, Kudrna D, Goicoechea JL, Ergon Å, et al. Red clover (Trifolium pratense L.) draft genome provides a platform for trait improvement. Sci Rep. 2015;5: 17394. doi: 10.1038/srep17394.
  15. Dewhurst R. Milk production from silage. Comparison of grass, legume and maize silages and their mixtures. Agric. Food Sci. 2013;22: 57–69. doi: 10.23986/afsci.6673.
  16. Dias PMB, Julier B, Sampoux JP, Barre P, Dall’Agnol M. Genetic diversity in red clover (Trifolium pratense L.) revealed by morphological and microsatellite (SSR) markers. Euphytica. 2008;160: 189–205. doi: 10.1007/s10681-007-9534-z.
  17. Dobin A, Gingeras TR. Mapping RNA-seq Reads with STAR. Current Protocols in Bioinformatics. 2015. doi: 10.1002/0471250953.bi1114s51.
  18. Dorant Y, Benestan L, Rougemont Q, Normandeau, Boyle B, Rochette R, Bernatchez L. Comparing Pool-seq, Rapture, and GBS genotyping for inferring weak population structure: The American lobster (Homarus americanus) as a case study. Ecology and Evolution. 2019;9(11): 6606–6623.
  19. Duke JA. Handbook of legumes of world economic importance. New York: Plenum Press; 1981.
  20. Earl DA, von Holdt BM. STRUCTURE HARVESTER: A website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour. 2012;4: 359–361. doi: 10.1007/s12686-011-9548-7.
  21. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE. A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLOS ONE. 2011;6(5): e19379.
  22. Eriksen J, Askegaard M, Søegaard K. Complementary effects of red clover inclusion in ryegrass-white clover swards for grazing and cutting. Grass Forage Sci. 2014;69: 241–50. doi: 10.1111/gfs.12025.
  23. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14: 2611–2620. doi: 10.1111/j.1365-294X.2005.02553.x.
  24. Excoffier L, Lischer HEL. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10: 564–567. doi: 10.1111/j.1755-0998.2010.02847.x.
  25. Fischer M, Bossdorf O, Gockel S, Hänsel F, Hemp A, Hessenmöller D, et al. Implementing large-scale and long-term functional biodiversity research: the biodiversity Exploratories. Basic Applied Ecol. 2010;11: 473–85. doi: 10.1016/j.baae.2010.07.009.
  26. Fischer MC, Rellstab C, Leuzinger M, Roumet M, Gugerli F, Shimizu KK, … Widmer A. Estimating genomic diversity and population differentiation – an empirical comparison of microsatellite and SNP variation in Arabidopsis halleri. BMC Genomics. 2017;18: 69. doi: 10.1186/s12864-016-3459-7.
  27. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907 [q-bio.GN]. 2012.
  28. Gautier M. Genome-wide scan for adaptive divergence and association with population-specific covariates. Genetics. 2015;201: 1555–1579. doi: 10.1534/genetics.115.181453.
  29. Gould BA, Chen Y, Lowry DB. Gene regulatory divergence between locally adapted ecotypes in their native habitats. Molecular Ecology. 2018;27: 4174–4188.
  30. Gross T, Müller CM, Becker A, Gemeinholzer B, Wissemann V. Common garden versus common practice – Phenotypic changes in Trifolium pratense L. in response to repeated mowing. Journal of Applied Botany and Food Quality. 2021;94: 1–6. doi: 10.5073/JABFQ.2021.094.001.
  31. Gupta M, Sharma V, Sing SK, Chahota RK, Sharma TR. Analysis of genetic diversity and structure in a genebank collection of red clover (Trifolium pratense L.) using SSR markers. Plant Genetic Resources: Characterization and Utilization. 2017;15(4): 376–379. doi: 10.1017/S1479262116000034.
  32. Herbert DB, Ekschmitt K, Wissemann V, Becker A. Cutting reduces variation in biomass production of forage crops and allows low-performers to catch up: A case study of Trifolium pratense L. (red clover). Plant Biol (Stuttg). 2018;20: 465–73. doi: 10.1111/plb.12695.
  33. Herbert DB, Gross T, Rupp O, Becker A. Transcriptome analysis reveals major transcriptional changes during regrowth after mowing of red clover (Trifolium pratense). BMC Plant Biology. 2021;21: 95. doi: 10.1186/s12870-021-02867-0.
  34. Herrmann D, Boller B, Widmer F, Kölliker R. Optimization of bulked AFLP analysis and its application for exploring diversity of natural and cultivated populations of red clover. Genome. 2005;48: 474–486.
  35. Hohenlohe PA, Amish SJ, Catchen JM, Allendorf FW, Luikart G. Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout. Molecular Ecology Resources. 2011;11: 117–122.
  36. Hou R, Yang ZX, Li MH, et al. Impact of the next-generation sequencing data depth on various biological result inferences. Sci China Life Sci. 2013;56: 104–109. doi: 10.1007/s11427-013-4441-0.
  37. Isobe S, Kölliker R, Hisano H, Sasamoto S, Wada T, Klimenko I, Okumura K, Tabata S. Construction of a consensus linkage map for red clover (Trifolium pratense L.). BMC Plant Biol. 2009;9(57): 1–11.
  38. Jakobsson M, Rosenberg NA. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics. 2007;23: 1801–1806. doi: 10.1093/bioinformatics/btm233.
  39. Joly D, Faure D. Next-generation sequencing propels environmental genomics to the front line of research. Heredity. 2015;114: 429–430.
  40. Jombart T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics. 2008;24: 1403–1405.
  41. Keenan K, McGinnity P, Cross TF, Crozier WW, Prodöhl PA. diveRsity: An R package for the estimation and exploration of population genetics parameters and their associated errors. Methods Ecol Evol. 2013;4: 782–788. doi: 10.1111/2041-210X.12067.
  42. Kleen J, Taube F, Gierus M. Agronomic performance and nutritive value of forage legumes in binary mixtures with perennial ryegrass under different defoliation systems. J. Agric. Sci. 2011;149: 73–84. doi: 10.1017/S0021859610000456.
  43. Kölliker R, Herrmann D, Boller B, Widmer F. Swiss Mattenklee landraces, a distinct and diverse genetic resource of red clover (Trifolium pratense L.). Theoretical and Applied Genetics. 2003;107: 306–315.
  44. Kloss L, Fischer M, Durka W. Land-use effects on genetic structure of a common grassland herb: A matter of scale. Basic and Applied Ecology. 2011;12: 440–448.
  45. Kouamé CN, Quesenberry KH. Cluster analysis of a world collection of red clover germplasm. Genetic Resources and Crop Evolution. 1993;40: 39–47.
  46. Lamy T, Jarne P, Laroche F, et al. Variation in habitat connectivity generates positive correlations between species and genetic diversity in a metacommunity. Mol Ecol. 2013;22: 4445–56.
  47. Li ZP, Wang C, You J, Yu X, Zhang F, Yan Z, Ye et al. Combined GWAS and eQTL analysis uncovers a genetic regulatory network orchestrating the initiation of secondary cell wall development in cotton. New Phytologist. 2020;226: 1738–1752.
  48. Lischer HEL, Excoffier L. PGDSpider: an automated data conversion tool for connecting population genetics and genomics programs. Bioinformatics (Oxford, England). 2012;28: 298–299. doi: 10.1093/bioinformatics/btr642.
  49. Lohman BK, Weber JN, Bolnick DI. Evaluation of TagSeq, a reliable low-cost alternative for RNAseq. Molecular Ecology Resources. 2016;16: 1315–1321.
  50. Marx HE, Scheidt S, Barker MS, Dlugosch KM. TagSeq for gene expression in non-model plants: A pilot study at the Santa Rita Experimental Range NEON core site. Applications in Plant Sciences. 2020;8(11): e11398.
  51. Mead AJ, Peñaloza Ramirez J, Bartlett MK, Wright JW, Sack L, Sork VL. Seedling response to water stress in valley oak (Quercus lobata) is shaped by different gene networks across populations. Molecular Ecology. 2019;28: 5248–5264.
  52. Medoukali I, Bellil I, Khelifi D. Evaluation of Genetic Variability in Algerian Clover (Trifolium L.) Based on Morphological and Isozyme markers. Czech J. Genet. Plant Breed. 2015;51(2): 50–61.
  53. Müller CM, Linke B, Strickert M, Ziv Y, Giladi I, Gemeinholzer B. Comparative genomic analysis of three co-occurring annual Asteraceae along micro-geographic fragmentation scenarios. PPEES. 2019;42. doi: 10.1016/j.ppees.2019.125486.
  54. Musche M, Settele J, Durka W. Genetic population structure and reproductive fitness in the plant Sanguisorba officinalis in populations supporting colonies of an endangered Maculinea butterfly. International Journal of Plant Sciences. 2008;169: 253–262.
  55. Nachman MW, Crowell SL. Estimate of the mutation rate per nucleotide in humans. Genetics. 2000;156: 297.
  56. Nei M. Analysis of Gene Diversity in Subdivided Populations. Proc Natl Acad Sci USA. 1973;70: 3321–3323. doi: 10.1073/pnas.70.12.3321.
  57. Nybom H. Comparison of different nuclear DNA markers for estimating intraspecific genetic diversity in plants. Molecular Ecology. 2004;13: 1143–1155.
  58. Pagnotta MA, Annicchiarico P, Farina A, Proietti S. Characterizing the molecular and morphophysiological diversity of Italian red clover. Euphytica. 2011;179: 393–404.
  59. Pallares LF, Picard S, Ayroles JF. TM3’seq: A Tagmentation-Mediated 3’ Sequencing Approach for Improving Scalability of RNAseq Experiments. G3: Genes, Genomes, Genetics. 2020;10: 143–150. doi: 10.1534/g3.119.400821.
  60. Paradis E, Claude J, Strimmer K. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics (Oxford, England). 2004;20: 289–290. doi: 10.1093/bioinformatics/btg412.
  61. Pfeifer VW, Ford BM, Housset J, McCombs A, Blanco-Pastor JL, Gouin N, Manel S, Bertin A. Partitioning genetic and species diversity refines our understanding of species–genetic diversity relationships. Ecology and Evolution. 2018;8: 12351–12364.
  62. Pinto N, Magalhães M, Conde-Sousa E, Gomes C, Pereira R, Alves C, … Amorim A. Assessing paternities with inconclusive STR results: The suitability of bi-allelic markers. Forensic Science International: Genetics. 2013;7: 16–21. doi: 10.1016/j.fsigen.2012.05.002.
  63. Pritchard JK, Stephens M, Donnelly P. Inference of Population Structure using multilocus genotype data. Genetics. 2000;155: 945–959.
  64. Purugganan MD, Jackson SA. Advancing crop genomics from lab to field. Nature Genetics. 2021;53: 595–601.
  65. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
  66. Rosenberg NA. distruct: A program for the graphical display of population structure. Mol Ecol Notes. 2004;4: 137–138. doi: 10.1046/j.1471-8286.2003.00566.x.
  67. Smith RR, Taylor NL, Bowley SR. Red clover. In: Taylor NL (ed.) Clover Science and Technology. Madison, WI: ASA Special Publication. 1985;25: 471–490.
  68. Ulloa O, Ortega F, Campos H. Analysis of genetic diversity in red clover (Trifolium pratense L.) breeding populations as revealed by RAPD genetic markers. Genome. 2003;46: 529–535.
  69. Vellend M, Geber MA. Connections between species diversity and genetic diversity. Ecology Letters. 2005;8(7): 767–781.
  70. Wang Y, Lv H, Xiang X, Yang A, Feng Q, Dai P, et al. Construction of a SNP Fingerprinting Database and Population Genetic Analysis of Cigar Tobacco Germplasm Resources in China. Frontiers in Plant Science. 2021;12.
  71. Wisecaver JH, Borowsky AT, Tzin V, Jander G, Kliebenstein DJ, Rokas A. A global co-expression network approach for connecting genes to specialized metabolic pathways in plants. The Plant Cell. 2017;29: 944–959.
  72. Yates S, Swain MT, Hegarty MJ, Chernukin I, Lowe M, Allison GG, Ruttink T, Abberton MT, Jenkins G, Skøt L. De novo assembly of red clover transcriptome based on RNA-Seq data provides insight into drought response, gene discovery and marker identification. BMC Genomics. 2014;15(1): 1–15.
Posted December 02, 2021.
Subject Area

  • Plant Biology