There is no disputing the importance of statistical analysis in biological research, but too often it is considered only after an experiment is completed, when it may be too late.
This collection highlights important statistical issues that biologists should be aware of and provides practical advice to help them improve the rigor of their work.
Nature Methods' Points of Significance column on statistics explains many key statistical and experimental design concepts. Other resources include an online plotting tool and links to statistics guides from other publishers.
Experimental biologists, their reviewers and their publishers must grasp basic statistics, urges David L. Vaux, or sloppy science will continue to grow.
The reliability and reproducibility of science are under scrutiny. However, a major cause of this lack of repeatability is not being considered: the wide sample-to-sample variability in the P value. We explain why P is fickle to discourage the ill-informed practice of interpreting analyses based predominantly on this statistic.
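As a quick illustration of that fickleness (a minimal simulation, not taken from the paper), repeatedly drawing samples from the same two populations and re-running the same t-test yields wildly different P values:

```python
# Simulate the sample-to-sample variability of P: repeat the same experiment
# many times and watch P jump around. Assumes two normal populations with a
# true difference of 0.5 SD and n = 10 per group (illustrative values).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pvals = []
for _ in range(1000):
    a = rng.normal(0.0, 1.0, 10)   # control group
    b = rng.normal(0.5, 1.0, 10)   # treated group, true effect = 0.5 SD
    pvals.append(stats.ttest_ind(a, b).pvalue)

pvals = np.array(pvals)
print("P-value quantiles (2.5%, 50%, 97.5%):",
      np.quantile(pvals, [0.025, 0.5, 0.975]).round(3))
```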
As the data deluge swells, statisticians are evolving from contributors to collaborators. Sallie Ann Keller urges funders, universities and associations to encourage this shift.
Deficiencies in methods reporting in animal experimentation lead to difficulties in reproducing experiments; the authors propose a set of reporting standards to improve scientific communication and study design.
This Review discusses the principles and applications of significance testing and power calculation, including recently proposed gene-based tests for rare variants.
Low-powered studies lead to overestimates of effect size and low reproducibility of results. In this Analysis article, Munafò and colleagues show that the average statistical power of studies in the neurosciences is very low, discuss ethical implications of low-powered studies and provide recommendations to improve research practices.
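For intuition, statistical power can be estimated by brute-force simulation; the sketch below (effect size and group sizes are illustrative assumptions, not figures from the Analysis) counts how often a two-sample t-test reaches P < 0.05:

```python
# Rough power estimate by simulation: the fraction of repeated experiments
# in which a two-sample t-test reaches P < alpha.
import numpy as np
from scipy import stats

def simulated_power(n_per_group, effect_sd, reps=5000, alpha=0.05, seed=1):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n_per_group)
        b = rng.normal(effect_sd, 1.0, n_per_group)
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits += 1
    return hits / reps

# Power grows slowly with n for a medium effect (d = 0.5).
for n in (10, 30, 64):
    print(n, round(simulated_power(n, 0.5), 2))
```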
The authors analyze a large corpus of the neuroscience literature and demonstrate that nearly half of the published studies considered incorrectly compared effect sizes by comparing their significance levels.
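The correct approach is to test the difference between the two effects directly, rather than noting that one is significant and the other is not. A minimal large-sample sketch, with made-up numbers:

```python
# "Significant vs. not significant" is not a test of the difference between
# two effects. A direct (large-sample) comparison uses the difference of the
# estimates and its pooled standard error. All numbers are illustrative.
import math
from scipy import stats

b1, se1 = 0.8, 0.35   # effect in condition 1 (P ~ 0.02, "significant")
b2, se2 = 0.5, 0.30   # effect in condition 2 (P ~ 0.10, "not significant")

z = (b1 - b2) / math.sqrt(se1**2 + se2**2)
p_diff = 2 * stats.norm.sf(abs(z))
print(f"z = {z:.2f}, P for the difference = {p_diff:.2f}")  # far from 0.05
```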
Hierarchical models provide reliable statistical estimates for data sets from high-throughput experiments where measurements vastly outnumber experimental samples.
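A toy normal-normal example (a sketch of the general idea, not any specific published method) shows why: shrinking noisy per-feature estimates toward the grand mean reduces overall error when features vastly outnumber samples:

```python
# Empirical Bayes shrinkage in miniature: pull noisy per-gene means toward
# the grand mean, by an amount set by the signal fraction of the variance.
import numpy as np

rng = np.random.default_rng(2)
true_means = rng.normal(0.0, 1.0, 2000)           # per-gene true effects
obs = true_means + rng.normal(0.0, 2.0, 2000)     # noisy observations (SE = 2)

grand = obs.mean()
var_obs = obs.var()
var_noise = 4.0                                    # assumed known noise variance
shrink = max(var_obs - var_noise, 1e-9) / var_obs  # signal fraction of variance

eb = grand + shrink * (obs - grand)                # shrunken estimates
print("raw MSE:", round(np.mean((obs - true_means) ** 2), 2))
print("EB  MSE:", round(np.mean((eb - true_means) ** 2), 2))
```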
Alkes Price, Peter Visscher and colleagues provide recommendations on the application of mixed-linear-model association methods across a range of study designs.
A protocol providing guidelines on the organizational aspects of genome-wide association meta-analyses and on implementing quality control at the study-file level, at the meta-level across studies, and at the meta-analysis output level.
Thomas W Winkler, Felix R Day & the Genetic Investigation of Anthropometric Traits (GIANT) Consortium
This perspective illustrates some of the problems involved in analyzing the complex data yielded by systems neuroscience techniques, such as brain imaging and electrophysiology. Specifically, when test statistics are not independent of the selection criteria, common analyses can produce spurious results. The authors suggest ways to avoid such errors.
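A minimal simulation of this circularity (all parameters are illustrative) shows pure noise passing a significance test when features are selected and tested on the same data:

```python
# Toy demonstration of circular analysis: select the "best" features using
# the same data you then test, and pure noise looks significant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.normal(0.0, 1.0, (20, 1000))      # subjects x voxels, no real signal

means = data.mean(axis=0)
top = np.argsort(means)[-10:]                # select voxels with largest means

# Circular: test the selected voxels on the SAME data.
t, p = stats.ttest_1samp(data[:, top].mean(axis=1), 0.0)
print(f"circular:    t = {t:.2f}, P = {p:.4f}   (spuriously significant)")

# Independent: select on one half of the subjects, test on the other.
half1, half2 = data[:10], data[10:]
top2 = np.argsort(half1.mean(axis=0))[-10:]
t, p = stats.ttest_1samp(half2[:, top2].mean(axis=1), 0.0)
print(f"independent: t = {t:.2f}, P = {p:.4f}   (correctly null)")
```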
The authors examine papers in high-profile journals and find that, although collecting multiple observations from a single research object (for example, many cells measured in one animal) is common practice, such nested data are often analyzed with inappropriate statistical techniques. They show that this inflates Type I error rates and propose multilevel modeling to address the issue.
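As a sketch of that fix, a random-intercept mixed model (here via statsmodels, with simulated data and illustrative column names) lets the treatment standard error reflect the number of animals rather than the number of cells:

```python
# Nested data (many cells per animal) analyzed with a random-intercept
# mixed model instead of pooling all cells as if they were independent.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
animals = np.repeat(np.arange(8), 25)                 # 8 animals, 25 cells each
animal_effect = rng.normal(0.0, 1.0, 8)[animals]      # shared within-animal shift
treated = (animals % 2).astype(float)                 # 4 treated, 4 control animals
y = 0.3 * treated + animal_effect + rng.normal(0.0, 1.0, animals.size)

df = pd.DataFrame({"y": y, "treated": treated, "animal": animals})
fit = smf.mixedlm("y ~ treated", df, groups=df["animal"]).fit()
print(fit.summary())   # treatment SE reflects n = 8 animals, not 200 cells
```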
When prioritizing hits from a high-throughput experiment, it is important to correct for random events that falsely appear significant. How is this done and what methods should be used?
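One widely used answer is to control the false discovery rate with the Benjamini-Hochberg procedure. A minimal self-contained sketch (in practice, statsmodels' multipletests offers the same functionality):

```python
# Benjamini-Hochberg step-up procedure: keep the largest set of hits whose
# sorted P values fall under the line alpha * rank / m.
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of hits kept at FDR <= alpha."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    m = len(p)
    thresholds = alpha * (np.arange(1, m + 1) / m)   # BH step-up thresholds
    below = p[order] <= thresholds
    keep = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()               # largest rank under the line
        keep[order[: k + 1]] = True
    return keep

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(benjamini_hochberg(pvals))   # keeps the smallest few, not all P < 0.05
```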
Statistical models called hidden Markov models are a recurring theme in computational biology. What are hidden Markov models, and why are they so useful for so many different problems?
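For a flavor of what an HMM does, the sketch below runs the Viterbi algorithm on a toy two-state model of AT-rich versus GC-rich sequence; all probabilities are illustrative:

```python
# Given emission and transition probabilities, the Viterbi algorithm
# recovers the most likely sequence of hidden states (in log space).
import numpy as np

states = ["AT-rich", "GC-rich"]
start = np.log([0.5, 0.5])
trans = np.log([[0.9, 0.1],     # P(next state | current state)
                [0.1, 0.9]])
emit = {"A": np.log([0.35, 0.15]), "T": np.log([0.35, 0.15]),
        "C": np.log([0.15, 0.35]), "G": np.log([0.15, 0.35])}

def viterbi(seq):
    v = start + emit[seq[0]]             # log-prob of best path ending in each state
    back = []
    for ch in seq[1:]:
        scores = v[:, None] + trans      # score of each (previous, next) pair
        back.append(scores.argmax(axis=0))
        v = scores.max(axis=0) + emit[ch]
    path = [int(v.argmax())]             # backtrack from the best final state
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return [states[s] for s in reversed(path)]

print(viterbi("ATATAGCGCGCATAT"))
```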
You can look back there to explain things, but the explanation disappears. You’ll never find it there. Things are not explained by the past. They’re explained by what happens now. —Alan Watts
“Every day sadder and sadder news of its increase. In the City died this week 7496; and of them, 6102 of the plague. But it is feared that the true number of the dead this week is near 10,000 ....” —Samuel Pepys, 1665
“I have no idea what’s awaiting me, or what will happen when this all ends. For the moment I know this: there are sick people and they need curing.” —Albert Camus, The Plague
Nature uses only the longest threads to weave her patterns, so that each small piece of her fabric reveals the organization of the entire tapestry. —Richard Feynman
It is the mark of an educated mind to rest satisfied with the degree of precision that the nature of the subject admits and not to seek exactness where only an approximation is possible. —Aristotle