RT Journal Article
SR Electronic
T1 SigLASSO: a LASSO approach jointly optimizing sampling likelihood and cancer mutation signatures
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 366740
DO 10.1101/366740
A1 Li, Shantao
A1 Crawford, Forrest W.
A1 Gerstein, Mark B.
YR 2018
UL http://biorxiv.org/content/early/2018/07/10/366740.abstract
AB Multiple mutational processes drive carcinogenesis, leaving characteristic signatures on tumor genomes. Determining the active signatures from the full repertoire of potential ones can help elucidate the mechanisms underlying cancer initiation and development. This involves decomposing the frequency of cancer mutations, categorized according to their trinucleotide context, into a linear combination of known mutational signatures. We formulate this task as an optimization problem with L1 regularization and develop a software tool, sigLASSO, to carry it out efficiently. First, by explicitly adding multinomial sampling into the overall objective function, we jointly optimize the likelihood of sampling and signature fitting. This is especially important when mutation counts are low and sampling variance is high, as in whole-exome sequencing. Second, sigLASSO uses L1 regularization to parsimoniously assign signatures to mutation profiles, leading to sparse and more biologically interpretable solutions. Additionally, instead of hard thresholding and choosing a discrete subset of active signatures a priori, sigLASSO fine-tunes model complexity parameters, informed by the scale of the data and prior knowledge. Finally, evaluating signature assignments is challenging. To do so, we construct a set of criteria that we can apply consistently across assignments.
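
The decomposition described in the abstract, expressing a mutation profile as a sparse, nonnegative linear combination of known signatures, can be illustrated with a minimal sketch. This is not the sigLASSO implementation: it fits a plain squared-error objective on the empirical frequencies with an L1 penalty, and it omits the joint multinomial sampling likelihood and the tuning of complexity parameters that the abstract describes. The function name `fit_signatures_l1`, the parameters `S`, `counts`, and `lam`, and the toy signature matrix are all assumptions made for illustration.

```python
import numpy as np


def fit_signatures_l1(S, counts, lam, n_iter=500):
    """Nonnegative L1-penalized least-squares fit of signature weights.

    Hypothetical helper, not part of sigLASSO. Minimizes
        0.5 * ||S @ w - p||^2 + lam * sum(w)   subject to w >= 0,
    where p is the observed mutation frequency vector, using
    proximal gradient descent.
    """
    S = np.asarray(S, dtype=float)            # contexts x signatures
    counts = np.asarray(counts, dtype=float)  # mutation count per context
    p = counts / counts.sum()                 # empirical frequencies

    # Step size 1/L, where L is the largest eigenvalue of S.T @ S.
    step = 1.0 / np.linalg.norm(S, 2) ** 2

    w = np.zeros(S.shape[1])
    for _ in range(n_iter):
        grad = S.T @ (S @ w - p)
        # Gradient step, then the proximal map of the L1 penalty
        # restricted to nonnegative weights.
        w = np.maximum(0.0, w - step * (grad + lam))
    return w


# Toy example: 6 mutation contexts, 3 made-up signatures (columns sum to 1).
S = np.array([
    [0.5, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.5],
])
# Mostly signatures 1 and 2, plus two stray mutations in the
# contexts covered by signature 3.
counts = [34, 35, 15, 14, 1, 1]

w = fit_signatures_l1(S, counts, lam=0.02)
print(w)  # approximately [0.65, 0.25, 0.0]
```

In this toy case the penalty drives the weight of the third signature to exactly zero while shrinking the other two slightly, which is the sparsity behavior the abstract attributes to L1 regularization. With `lam=0.0` the same call assigns a small nonzero weight to the third signature.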