Statistical Issues in cDNA Microarray Data Analysis

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 224)

Abstract

Statistical considerations are frequently to the fore in the analysis of microarray data, as researchers sift through massive amounts of data and adjust for various sources of variability in order to identify the important genes among the many that are measured. This chapter summarizes some of the issues involved and provides a brief review of the analysis tools that are available to researchers to deal with these issues.


Copyright information

© 2003 Humana Press Inc.

About this protocol

Cite this protocol

Smyth, G.K., Yang, Y.H., Speed, T. (2003). Statistical Issues in cDNA Microarray Data Analysis. In: Brownstein, M.J., Khodursky, A.B. (eds) Functional Genomics. Methods in Molecular Biology, vol 224. Humana Press. https://doi.org/10.1385/1-59259-364-X:111

  • DOI: https://doi.org/10.1385/1-59259-364-X:111

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-291-9

  • Online ISBN: 978-1-59259-364-4

  • eBook Packages: Springer Protocols
