ABSTRACT
The variable sigma (σ) subunit of the bacterial RNA polymerase holoenzyme determines promoter specificity and facilitates open complex formation during transcription initiation. Understanding σ-factor binding sequences is therefore crucial for deciphering bacterial gene regulation. Here, we present a data-driven, high-throughput approach that utilizes an extensive library of 1.54 million DNA templates providing artificial promoters and 5′ UTR sequences for σ-factor DNA binding motif discovery. This method combines the generation of extensive DNA libraries, in vitro transcription, RNA aptamer selection, and deep DNA and RNA sequencing. It allows direct assessment of promoter activity, identification of transcription start sites, and quantification of promoter strength based on mRNA production levels. We applied this approach to map σ54 DNA binding sequences in Pseudomonas putida. Deep sequencing of the enriched RNA pool revealed 64,966 distinct σ54 binding motifs, significantly expanding the known repertoire. This data-driven approach surpasses traditional methods by directly evaluating promoter function and avoiding selection bias based solely on binding affinity. This comprehensive dataset enhances our understanding of σ-factor binding sequences and their regulatory roles, opening avenues for new research in biology and biotechnology.
Competing Interest Statement
G.S.D. and R.L. are founders of Syngens. However, the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. All other authors declare no competing interests.