Abstract
The BLAST program [Altschul et al., 1990, J. Mol. Biol., 215:403-410] is a widely used tool for finding the sequences most similar to a query, while the iterative Ψ-BLAST program [Altschul et al., 1997, Nucl. Acids Res., 25:3389-3402] achieves high sensitivity in detecting remote homologs of the query sequence by applying the BLAST search iteratively. However, the number of iterations that Ψ-BLAST should run is poorly justified in the literature. Our study shows that, as the number of iterations increases, Ψ-BLAST rapidly loses the ability to be guided by the query sequence in its search for homologs. With the 2021 non-redundant (nr) sequence database, already after the second iteration the query sequence remains at the top of the list of homologs found by Ψ-BLAST in only 18% of cases. Moreover, after the recommended 10 iterations [Altschul et al., 1997, Nucl. Acids Res., 25:3389-3402], the query sequence is still listed among the homologs found by Ψ-BLAST in only 42% of cases. Using the considerably smaller 2011 nr database as a reference, we show that these effects intensify over time. Our findings underscore the need for caution when interpreting Ψ-BLAST results, and this caution must increase with database size. The position of the query sequence in the list of detected homologs should be monitored closely. We recommend using the disappearance of the query sequence from the list of homologs produced by Ψ-BLAST as the criterion for stopping the Ψ-BLAST iterations.
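The stopping criterion recommended above can be illustrated with a minimal sketch. It assumes that the hit list of each iteration has already been parsed into an ordered list of sequence identifiers; the function names, the input layout, and the example identifiers are hypothetical and are not taken from Ψ-BLAST itself or from the study.

```python
from typing import List, Optional, Sequence


def query_rank(hits: Sequence[str], query_id: str) -> Optional[int]:
    """Return the 1-based position of the query in one hit list, or None if absent."""
    for position, hit_id in enumerate(hits, start=1):
        if hit_id == query_id:
            return position
    return None


def last_iteration_with_query(
    hits_per_iteration: Sequence[Sequence[str]], query_id: str
) -> int:
    """Return the last iteration (1-based) before the query first disappears.

    Returns 0 if the query is already missing from the first hit list.
    """
    kept = 0
    for iteration, hits in enumerate(hits_per_iteration, start=1):
        if query_rank(hits, query_id) is None:
            break  # the query has dropped out: stop iterating here
        kept = iteration
    return kept


# Hypothetical hit lists for four iterations; "Q" stands for the query sequence.
rounds: List[List[str]] = [
    ["Q", "A", "B"],  # iteration 1: query at the top
    ["A", "Q", "C"],  # iteration 2: query has slipped to rank 2
    ["A", "C", "D"],  # iteration 3: query no longer listed
    ["Q", "A"],       # iteration 4: never reached under this criterion
]

ranks = [query_rank(hits, "Q") for hits in rounds]
stop_after = last_iteration_with_query(rounds, "Q")
print(ranks, stop_after)
```

In this toy example the results of iteration 2 would be kept, since the query is absent from the hit list of iteration 3. Tracking the rank at every iteration, rather than only presence or absence, also shows whether the query is still at the top of the list.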
Competing Interest Statement
The authors have declared no competing interest.