Abstract
The R-loop, a three-stranded nucleic acid structure, is recognized to play pivotal roles in critical physiological and pathological processes. Multiple technologies have been developed to profile R-loops genome-wide, but the existing data show major discrepancies in where genuine R-loops localize and what biological functions they serve. Here, we experimentally and computationally evaluate eight representative R-loop mapping technologies and reveal that inherent biases and artifacts of individual technologies are key sources of these discrepancies. Analyzing signals detected with different R-loop mapping strategies, we note that genuine R-loops predominantly form at gene promoter regions, whereas most signals in gene bodies likely result from structured RNAs that are part of repeat-containing transcripts. Interestingly, our analysis also uncovers two classes of R-loops. The first class consists of typical R-loops in which the single-stranded DNA binding protein RPA binds both the template and non-template strands. By contrast, the second class appears independent of Pol II-mediated transcription and is characterized by RPA binding only on the template strand. These two classes of RNA:DNA hybrids in the genome suggest that distinct biochemical activities are involved in their formation and regulation. In sum, our findings will guide the future choice of suitable technologies for specific experimental purposes and the interpretation of R-loop functions.
Competing Interest Statement
The authors have declared no competing interest.