New Results
Language modelling for biological sequences – curated datasets and baselines
Jose Juan Almagro Armenteros, Alexander Rosenberg Johansen, Ole Winther, Henrik Nielsen
doi: https://doi.org/10.1101/2020.03.09.983585
Jose Juan Almagro Armenteros
1Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
Alexander Rosenberg Johansen
2Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
Ole Winther
2Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
Henrik Nielsen
1Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark

Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted March 09, 2020.
bioRxiv 2020.03.09.983585; doi: https://doi.org/10.1101/2020.03.09.983585