Abstract
In this article we use gene expression measurements from blood samples to predict breast cancer metastasis. We compare several predictive models and propose a biologically motivated variable selection scheme, which we call curve selection. Curve selection rests on the assumption that gene expression intensity, as a function of time, should diverge between cases and controls: the difference between cases and controls should be larger close to diagnosis than years before. Using curve selection, we obtain better predictions and more stable predictive signatures, and we show some evidence that metastasis can be detected in blood samples.
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.