PT - JOURNAL ARTICLE
AU - Sheth, Suril B.
AU - Sheth, Bhavin R.
TI - A variant of the student’s t-test for data of varying reliability
AID - 10.1101/525774
DP - 2019 Jan 01
TA - bioRxiv
PG - 525774
4099 - http://biorxiv.org/content/early/2019/01/21/525774.short
4100 - http://biorxiv.org/content/early/2019/01/21/525774.full
AB - Student’s t-test has long been a workhorse of statistical testing and is used to determine whether two sets of sampled data differ significantly from one another. The samples may be individual observations or the means (or some other summary statistic) of independently acquired subsets of data (e.g. data from individual observers, neurons, or baseball games). The subsets that go into computing the t-statistic are likely to differ in reliability, either because their variances differ or because they contain different numbers of subsamples. A standard t-test gives all data equal weight, yet this variation in reliability across subsets needs to be accounted for. Solutions based on mixed-model methods and Monte Carlo simulations exist that do factor data reliability into the computed statistics. However, no such extension exists for the ubiquitous Student’s t-test. We propose a novel variant of Student’s t-test that adopts a simple but effective alteration in the design to account for differing levels of data reliability. Specifically, we weight each data subset by the inverse of the variance of the data it contains, a measure that has been used in studies of Bayesian cue combination, or, in the absence of information about variance, by the proportion of the overall data contained in the subset. The proposed changes extend the applicability of Student’s t-test to a wider array of data sets.
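
The two weighting schemes named in the abstract can be illustrated with a minimal sketch. It only shows how normalized subset weights and a weighted mean might be computed. It is not the authors' test statistic, which the abstract does not specify. The function names `subset_weights` and `weighted_mean` and the sample numbers are hypothetical, chosen for illustration.

```python
def subset_weights(variances=None, counts=None):
    """Return normalized reliability weights for data subsets.

    Hypothetical helper: uses inverse-variance weights when variances are
    given, otherwise weights proportional to each subset's share of the data.
    """
    if variances is not None:
        raw = [1.0 / v for v in variances]   # inverse-variance weighting
    elif counts is not None:
        raw = [float(n) for n in counts]     # proportion-of-data weighting
    else:
        raise ValueError("provide either variances or counts")
    total = sum(raw)
    return [r / total for r in raw]          # weights sum to 1


def weighted_mean(subset_means, weights):
    """Combine subset means using the supplied normalized weights."""
    return sum(m * w for m, w in zip(subset_means, weights))


# Illustrative (made-up) subset summaries.
means = [1.0, 2.0, 4.0]
variances = [0.5, 1.0, 2.0]
counts = [10, 30, 60]

w_var = subset_weights(variances=variances)  # favors low-variance subsets
w_n = subset_weights(counts=counts)          # favors larger subsets

mean_var = weighted_mean(means, w_var)
mean_n = weighted_mean(means, w_n)
```

Under these assumptions, low-variance subsets dominate the first estimate and large subsets dominate the second. How such weights enter the t-statistic and its degrees of freedom is left to the full text.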