Abstract
Here we describe the SweGen dataset, a high-quality map of genetic variation in the Swedish population. These data represent a basic resource for clinical genetics laboratories as well as for sequencing-based association studies, providing information on the frequencies of genetic variants in a cohort that is well matched to national patient cohorts. To select samples for this study, we first examined the genetic structure of the Swedish population using high-density SNP-array data from a nationwide, population-based cohort of over 10,000 individuals. From this sample collection, 1,000 individuals, reflecting a cross-section of the population and capturing the main genetic structure, were selected for whole-genome sequencing (WGS). Analysis pipelines were developed for automated alignment, variant calling and quality control of the sequencing data. The result is a whole-genome map of aggregated variant frequencies in the Swedish population, which we hereby release to the scientific community.