RT Journal Article
SR Electronic
T1 Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.09.07.459340
DO 10.1101/2021.09.07.459340
A1 Jiacheng Chuan
A1 Aiguo Zhou
A1 Lawrence Richard Hale
A1 Miao He
A1 Xiang Li
YR 2021
UL http://biorxiv.org/content/early/2021/09/09/2021.09.07.459340.abstract
AB Background: As Next Generation Sequencing takes a dominant role in terms of output capacity and sequence length, adapters attached to the reads and low-quality bases hinder downstream analysis both directly and indirectly, for example by producing false-positive single nucleotide polymorphisms (SNPs) and fragmented assemblies. A fast trimming algorithm is needed to remove adapters precisely, especially in read tails with relatively low quality. Findings: We present a trimming program named Atria. Atria matches the adapters in paired reads and finds possible overlapped regions with a super-fast and carefully designed byte-based matching algorithm (O(n) time with O(1) space). Atria also implements multi-threading in both sequence processing and file compression and supports single-end reads. Conclusions: Atria performs favorably in various trimming and runtime benchmarks on both simulated and real data compared with other cutting-edge trimmers. We also provide an ultra-fast and lightweight byte-based matching algorithm.
The algorithm can be used in a broad range of short-sequence matching applications, such as primer search and seed scanning before alignment. Availability & Implementation: The Atria executables, source code, and benchmark scripts are available at https://github.com/cihga39871/Atria under the MIT license. Competing Interest Statement: The authors have declared no competing interest. Abbreviations: CPU, Central processing unit; DNA, Deoxyribonucleic acid; GB, Gigabyte; MCC, Matthews correlation coefficient; NGS, Next-generation sequencing; PPV, Positive predictive value; RAM, Random-access memory; RNA, Ribonucleic acid; SNP, Single nucleotide polymorphism; SSD, Solid-state drive; TB, Terabyte; UInt, Unsigned integer; UInt64, Unsigned 64-bit integer; WGS, Whole-genome sequencing.