Vitamin D deficiency and SARS-CoV-2 infection: Big-data analysis from March 2020 to March 2021. D-COVID study

Background: Vitamin D has been proposed to have immunomodulatory functions and therefore to play a role in coronavirus infection (COVID-19). However, there is no conclusive evidence on its impact on COVID-19 infection and evolution. Objective: To study the association between COVID-19 infection and vitamin D deficiency in patients of a tertiary university hospital, and to investigate the clinical evolution and prognosis of patients with COVID-19 and vitamin D deficiency. Methods: Using big-data analytics and artificial intelligence through the SAVANA Manager clinical platform, we analysed clinical data from patients with COVID-19 attended in a tertiary university hospital from March 2020 to March 2021. Results: Of the 143,157 patients analysed during this period, 36,261 had COVID-19 infection (25.33%); of these, 2588 had vitamin D deficiency (7.14%). Among subjects with COVID-19 and vitamin D deficiency, there was a higher proportion of women OR 1.45 [95% CI 1.33-1.57], adults older than 80 years OR 2.63 [95% CI 2.38-2.91], people living in nursing homes OR 2.88 [95% CI 2.95-3.45] and people with walking dependence OR 3.45 [95% CI 2.85-4.26]. Regarding clinical course, a higher number of subjects with COVID-19 and vitamin D deficiency required hospitalization OR 2.41 [95% CI 2.22-2.61] and intensive care unit (ICU) admission OR 2.22 [95% CI 1.64-3.02], and they had a longer mean hospital stay, 3.94 (2.29), p=0.02, and higher mortality OR 1.82 [95% CI 1.66-2.01]. Conclusion: Low serum 25(OH) vitamin D level was significantly associated with a worse clinical evolution and prognosis of COVID-19 infection. We found a higher proportion of institutionalised and dependent people over 80 years of age among patients with COVID-19 and vitamin D deficiency.


Introduction
Vitamin D is a steroid with functions classically associated with bone metabolism, although in recent years important extraosseous regulatory functions have been described, among which immunomodulatory ones stand out (1). Vitamin D seems to activate innate immunity by stimulating antigen presentation to macrophages, activating neutrophils and T cells and reducing the cytokine storm at the site of infection (2).
In this regard, several hypotheses have been put forward on the role of vitamin D in COVID-19 infection, given that some studies show a higher risk of respiratory infections in patients with vitamin D deficiency and others show a higher incidence of infection and mortality in countries with a high prevalence of vitamin D deficiency (3)(4)(5)(6)(7).
On the other hand, there is a high prevalence of vitamin D deficiency in the Spanish population and particularly in the elderly and institutionalized population (8)(9). Some of the factors associated with vitamin D deficiency are frequently found in this population group: low sun exposure, nutritional disorders or chronic kidney disease. In addition, the confinement imposed in the last two years for disease control may have increased vitamin D deficiency in this group of patients, mainly because of reduced sun exposure and worsening of underlying nutritional disorders.
Therefore, there is a possibility that vitamin D deficiency may be related to a greater severity of symptoms and worse clinical course, with the elderly population being especially vulnerable to this situation.
Information from EHRs was extracted using Natural Language Processing (NLP) and artificial intelligence (AI) techniques. SAVANA Manager was used to analyse this free-text information. The terminology used by SAVANA is based on multiple sources such as SNOMED CT [10], which includes medical codes, concepts, synonyms, and definitions regarding symptoms, diagnoses, body structures and substances commonly used in clinical documentation.
Due to the novel methodological approach of this study, we complemented our clinical findings with the assessment of EHRead's performance. This evaluation was aimed at verifying the system's accuracy in identifying records that contain mentions of vitamin D deficiency, COVID-19 and its related variables [10]. Briefly, the annotations made by the medical experts were used to generate the gold standard to assess the performance of EHRead's output; performance is calculated in terms of the standard metrics of precision (P), recall (R) and their harmonic mean, the F-score [11]. The linguistic evaluation of the COVID-19 variable was analysed in the context of this study, yielding a precision, recall and F-score of 0.61, 0.87, and 0.72, respectively. The inter-annotator agreement was 0.70.
Other primary variables, such as hypovitaminosis D and malnutrition, yielded an F-score higher than 0.70.
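As an illustration of how the F-score relates to P and R, the following minimal sketch recomputes the value reported for the COVID-19 variable from the two metrics given above. The function name `f_score` is ours and is not part of EHRead or SAVANA Manager.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of the two metrics P and R (the balanced F-score)."""
    if p + r == 0:
        # Convention for the degenerate case where both metrics are zero.
        return 0.0
    return 2 * p * r / (p + r)


# Values reported in the text for the COVID-19 variable: P = 0.61, R = 0.87.
covid_f = f_score(0.61, 0.87)
print(round(covid_f, 2))  # 0.72, matching the reported F-score
```

Because the harmonic mean is dominated by the smaller of the two values, the F-score of 0.72 sits closer to P (0.61) than to R (0.87).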
All statistical analyses were conducted using SPSS software (version 25.0; IBM, Armonk, NY, USA). Unless otherwise indicated, qualitative variables are expressed as absolute frequencies and percentages, while quantitative variables are expressed as mean±SD. To assess the statistical significance of numerical variables, we used independent-samples t-tests. To compare the relative distribution of patients across the categories of qualitative variables, we used Chi-squared tests. In all cases, the threshold for statistical significance was set at a p-value of 0.05.
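For readers who wish to see how an odds ratio with a 95% confidence interval and a Chi-squared statistic can be obtained from a 2x2 table, a minimal sketch follows. It is not the SPSS procedure used in the study: the Wald (log-scale) interval is an assumption, since the text does not state how the intervals were computed, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson Chi-squared statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts, for illustration only (not study data).
or_, lo, hi = odds_ratio_ci(30, 70, 20, 80)
print(f"OR {or_:.2f} [95% CI {lo:.2f}-{hi:.2f}]")
print(f"Chi-squared = {chi_squared_2x2(30, 70, 20, 80):.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at the 0.05 level under this approximation.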
We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for reporting observational studies [12]. The study was conducted following legal and regulatory requirements and followed research practices described in the International Conference on Harmonisation guidelines for good clinical practice, the Declaration of Helsinki in its latest edition, the guidelines for good pharmacoepidemiology practice and local regulations. Given the retrospective and observational nature of the study, physicians' prescribing habits and patient assignment to a specific therapeutic strategy were solely determined by the physician, team or hospital concerned. The study involves a combination of structured and unstructured data existing in EHRs, so no intervention was performed on the research subjects and the study does not entail any risk for the participants. Likewise, standard informed consent does not apply to this study: the research uses aggregated data, and it would be impossible to identify patients in order to seek their informed consent.
All actions toward data protection were taken in accordance with the European data protection authorities' code of good practice regarding big-data projects and the European General Data Protection Regulation (GDPR). Patients whose data were not recorded in the clinical history were excluded from the study.

Study variables
We used the following definitions to categorize study variables:
- Vitamin D deficiency: subjects with a diagnosis of vitamin D deficiency or hypovitaminosis, or with vitamin D blood levels < 30 ng/mL.
Differences during the three periods of maximum incidence: During the first period of the pandemic (March to June 2020), patients with COVID-19 and vitamin D deficiency had the same evolution and complications described in the total sample. In the second period (August to October 2020), hospitalization, intensive care and mortality were higher in COVID-19 patients with vitamin D deficiency, but mean hospital stay was not significantly longer. The third period (January to March 2021) showed differences in clinical course in the number of hospitalizations, intensive care admissions and mortality, but not in mean hospital stay (Table 3).
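The deficiency definition and the three-period split described above can be expressed as a short sketch. This is an illustration only and not the SAVANA Manager implementation: the function and constant names are hypothetical, the threshold is taken in whatever units the study definition uses, and the exact first and last day of each period are assumptions, since the text gives only months.

```python
from datetime import date
from typing import Optional

# Threshold from the study definition, in the units used there.
DEFICIENCY_THRESHOLD = 30.0

# Assumed day boundaries; the text states only the months.
PERIODS = {
    "first": (date(2020, 3, 1), date(2020, 6, 30)),
    "second": (date(2020, 8, 1), date(2020, 10, 31)),
    "third": (date(2021, 1, 1), date(2021, 3, 31)),
}


def has_vitamin_d_deficiency(diagnosed: bool, level: Optional[float]) -> bool:
    """True if a diagnosis is recorded or a measured level is below the threshold."""
    if diagnosed:
        return True
    return level is not None and level < DEFICIENCY_THRESHOLD


def incidence_period(day: date) -> Optional[str]:
    """Name of the period of maximum incidence containing `day`, or None."""
    for name, (start, end) in PERIODS.items():
        if start <= day <= end:
            return name
    return None


print(has_vitamin_d_deficiency(False, 25.0))  # True: level below threshold
print(incidence_period(date(2020, 9, 15)))    # second
print(incidence_period(date(2020, 7, 10)))    # None: between periods
```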

Discussion
In this large retrospective case-control study, we studied the prevalence, characteristics and evolution of vitamin D deficiency in COVID-19 patients attended in a tertiary university hospital in Madrid (Spain) during the peak period of the pandemic.
We found a significant association between vitamin D deficiency and the risk of severe COVID-19 disease in infected patients: a higher risk of hospitalization, more patients requiring intensive care, higher mortality and a longer hospital stay.
Significant associations were also found between COVID-19 with vitamin D deficiency and sex, age, place of residence, walking dependence and comorbidity factors. We found no differences between the three periods of maximum incidence in the number of hospitalizations, intensive care admissions or mortality, although the finding for mean hospital stay did differ between periods. Our results confirm several other published studies showing an association between low vitamin D levels and COVID-19, particularly in patients suffering from severe disease and complications [13][14][15]. Diaz-Curiel et al. found similar results in a Spanish population, with an increased risk of hospital admission and need for critical care, but they did not find a relationship between vitamin D levels and mortality [16].
The relationship between vitamin D and COVID-19 infection has been raised because a significant association between low serum 25(OH)D levels and the severity of acute respiratory tract infections was found prior to the pandemic [17][18].
Vitamin D modulates angiotensin receptor expression in the lung, stimulates pulmonary surfactant production, reduces the hyperinflammatory cytokine storm and increases levels of regulatory T lymphocytes, all of which are closely related to pulmonary infection with COVID-19 [19].
On the other hand, vitamin D also has important protective effects on the cardiovascular system, including augmentation of myocardial contractility and anti-thrombotic effects, and vitamin D deficiency has been associated with other cardiovascular risk factors such as diabetes, hypertension or dyslipidemia. Together, these actions could protect against cardiovascular complications in patients with COVID-19 infection [20].
Hypovitaminosis D is commonly observed in the elderly population, and in this study we found that patients with COVID-19 and vitamin D deficiency were older, had walking difficulties that made them depend on others to walk outside, lived mainly in nursing homes and had chronic diseases associated with vitamin D deficiency, such as malnutrition, obesity or renal insufficiency.
Few recent studies have examined the prevalence of vitamin D deficiency in the Spanish population, although all agree on a high prevalence in this age group. Gonzalez-Molero et al. found that 33.9% of the Spanish population may be at risk of vitamin D deficiency [21], but others describe a prevalence of up to 80% in people living in nursing homes [8][9].
Given the high prevalence of vitamin D deficiency in older adults and the results of this study, which show a high risk of complications in patients with COVID-19 and vitamin D deficiency, it would be worthwhile to consider the benefit of vitamin D supplementation in the older population, especially in those at higher risk (institutionalized or with walking dependence).
The main strengths of the present study are the large sample size that has been analyzed and the access to real-world evidence. There are also some limitations. The first is that, unlike in classical research methods, reproducibility is not generally considered in big-data studies, since the latter involve large amounts of information collected from the whole target population. Because we exclusively analysed the data captured in EHRs, the quality of the results reported for some variables is directly tied to the quality of the clinical records; in many cases, EHRs may be partially incomplete and not capture all the relevant clinical information from a given patient. As in other retrospective studies, some variables may not have been properly documented and were therefore not analysed.
Finally, our study sample comprised COVID-19 cases confirmed by PCR or antigen test plus cases suspected on clinical and epidemiological grounds, so it is possible that some false-positive cases were included in the final study sample.

Conclusions
In this large observational population study, we show a significant association between vitamin D deficiency and the risk of severe disease in patients with COVID-19 infection, and a higher proportion of institutionalised and dependent people over 65 years of age among patients with COVID-19 and vitamin D deficiency.
bioRxiv preprint doi: https://doi.org/10.1101/2022.10.27.514012; this version posted October 28, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.